Solr - Tips and Tricks


Admin UI
http://127.0.0.1:8983/solr/#/~cloud?view=tree
bin/solr help
bin/solr status
bin/solr healthcheck
bin/solr stop -all
(bin/solr start -cloud -s example/cloud/node1/solr -p 8983 -h 127.0.0.1)  && (bin/solr start -cloud -s example/cloud/node2/solr -p 7574 -z 127.0.0.1:9983 -h 127.0.0.1) && (bin/solr start -cloud -s example/cloud/node3/solr -p 6463 -z 127.0.0.1:9983 -h 127.0.0.1)

Delete docs:
Change *:* to your query.
update?commit=true&stream.body=<delete><query>*:*</query></delete>
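If the query contains characters such as & or <, it has to be XML-escaped before it goes inside <query>. A minimal sketch, assuming a hypothetical helper class DeleteByQuery (not part of SolrJ) that only builds the request body:

```java
public class DeleteByQuery {

    // Escape the characters that are special inside XML text content.
    static String escapeXml(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    // Build the XML body for a delete-by-query update request.
    static String deleteBody(String query) {
        return "<delete><query>" + escapeXml(query) + "</query></delete>";
    }

    public static void main(String[] args) {
        System.out.println(deleteBody("*:*"));
        System.out.println(deleteBody("name:A&B"));
    }
}
```

When the body is passed as stream.body in a URL, it still needs to be URL-encoded on top of this.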

https://wiki.apache.org/solr/SearchHandler
Use invariants to lock options: they override whatever values the client passes.
Use appends to add values on top of what the client passes; use defaults to supply values the client did not pass.
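The precedence can be modelled in a few lines. This is only a sketch of the behaviour described above, not Solr's own code; the class HandlerParams and the parameter values are made up for illustration:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HandlerParams {

    // Model of the precedence: defaults < request < invariants,
    // with appends added to whatever values are already present.
    static Map<String, List<String>> merge(Map<String, List<String>> defaults,
                                           Map<String, List<String>> request,
                                           Map<String, List<String>> appends,
                                           Map<String, List<String>> invariants) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        // Start from the defaults.
        defaults.forEach((k, v) -> out.put(k, new ArrayList<>(v)));
        // Values sent by the client replace the defaults.
        request.forEach((k, v) -> out.put(k, new ArrayList<>(v)));
        // Appends are added after the existing values.
        appends.forEach((k, v) -> out.computeIfAbsent(k, x -> new ArrayList<>()).addAll(v));
        // Invariants replace everything, including client values.
        invariants.forEach((k, v) -> out.put(k, new ArrayList<>(v)));
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> merged = merge(
                Map.of("rows", List.of("10")),                       // defaults
                Map.of("rows", List.of("50"),
                       "fq", List.of("type:book"),
                       "wt", List.of("xml")),                        // client request
                Map.of("fq", List.of("inStock:true")),               // appends
                Map.of("wt", List.of("json")));                      // invariants
        // rows comes from the client, fq has both filters, wt is forced to json.
        System.out.println(merged);
    }
}
```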

Request Parameters
distrib=false - only query current core

debugQuery
debug=query/results/timing

explainOther
debug=results&explainOther=id:MA*

Range Query
inclusive: [a TO b]
exclusive: {a TO b} - curly braces, not ().
mixed: [a TO b} {a TO b]
TO must be uppercase.
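A small sketch of building these clauses in code. The class RangeQuery and the field name price are hypothetical, chosen only for illustration:

```java
public class RangeQuery {

    // Square brackets include the bound, curly braces exclude it.
    static String range(String field, String lo, String hi,
                        boolean includeLo, boolean includeHi) {
        String open = includeLo ? "[" : "{";
        String close = includeHi ? "]" : "}";
        return field + ":" + open + lo + " TO " + hi + close;
    }

    public static void main(String[] args) {
        System.out.println(range("price", "10", "20", true, true));   // price:[10 TO 20]
        System.out.println(range("price", "10", "20", false, false)); // price:{10 TO 20}
        System.out.println(range("price", "10", "20", true, false));  // price:[10 TO 20}
    }
}
```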

Negative Query
Query empty fields: -field:*
field is empty or is abc: (*:* -field:*) OR field:abc
(*:* -id:1) OR id:1 - return all docs
http://stackoverflow.com/questions/634765/using-or-and-not-in-solr-query
Solr transforms a pure negative query -foo into (*:* -foo).
The big caveat is that Solr only checks whether the top-level query is pure negative, so a negative clause nested inside parentheses needs the explicit *:* itself.
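A minimal sketch of adding the *:* by hand for nested clauses. NegativeQuery is a hypothetical helper, and wrapNegative assumes it is given a single clause; it is not a query parser:

```java
public class NegativeQuery {

    // Wrap a single pure negative clause so it also works below the top level:
    // "-foo" becomes "(*:* -foo)"; anything else is returned unchanged.
    static String wrapNegative(String clause) {
        String c = clause.trim();
        return c.startsWith("-") ? "(*:* " + c + ")" : c;
    }

    // Match docs where the field is missing or has the given value.
    static String missingOr(String field, String value) {
        return wrapNegative("-" + field + ":*") + " OR " + field + ":" + value;
    }

    public static void main(String[] args) {
        System.out.println(wrapNegative("-foo"));       // (*:* -foo)
        System.out.println(missingOr("field", "abc"));  // (*:* -field:*) OR field:abc
    }
}
```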

Solr zkcli
zkcli.sh -zkhost zooServer:port  -cmd putfile /configs/solrconfig.xml solrconfig.xml
zkcli.sh -zkhost zooServer:port  -cmd get /configs/schema.xml

To get SolrCloud node info (such as IP addresses):
java -classpath "*" org.apache.solr.cloud.ZkCLI -zkhost myzkhost -cmd get /clusterstate.json | grep base_url
zkcli.sh -zkhost myzkhost:port -cmd get /clusterstate.json

Rest API
solr/collection/config
solr/collection/config/requestHandler
solr/collection/schema
solr/collection/schema/version
solr/admin/collections?action=RELOAD&name=$NAME

SolrJ Field Annotation
Map dynamic fields to Java fields
@Field("supplier_*")
Map<String, List<String>> supplier;

@Field("sup_simple_*")
Map<String, String> supplier_simple;

@Field("allsupplier_*")
private String[] allSuppliers;

@Field(child = true)
Child[] child;

Starting Solr
-m 2g: Start Solr with the defined value as the min (-Xms) and max (-Xmx) heap size for the JVM.

bin/solr stop -all

(bin/solr start -cloud -s example/cloud/node1/solr -p 8983 -h 127.0.0.1 -m 2g)  && (bin/solr start -cloud -s example/cloud/node2/solr -p 7574 -z 127.0.0.1:9983 -h 127.0.0.1 -m 2g) && (bin/solr start -cloud -s example/cloud/node3/solr -p 6463 -z 127.0.0.1:9983 -h 127.0.0.1 -m 2g)
-- Use -h 127.0.0.1 so Solr keeps working even if the machine's IP address changes.

Extending Solr
Implement the SolrCoreAware interface in a custom RequestHandler to receive the SolrCore in the inform method.

Customize and extend DocumentObjectBinder

Get solr static fields
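import java.util.ArrayList;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.common.util.SimpleOrderedMap;

SolrServer solrClient = new HttpSolrServer("http://{host:port}/solr/core-name");
SolrQuery query = new SolrQuery();

// Send the request to the Schema API path instead of /select.
query.setRequestHandler("/schema/fields");
// query.add(CommonParams.QT, "/schema/fields");
QueryResponse response = solrClient.query(query);
NamedList<Object> responseHeader = response.getResponseHeader();
// "fields" holds one entry per statically defined field.
@SuppressWarnings("unchecked")
ArrayList<SimpleOrderedMap<Object>> fields = (ArrayList<SimpleOrderedMap<Object>>) response.getResponse().get("fields");
for (SimpleOrderedMap<Object> field : fields) {
    Object fieldName = field.get("name");
}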

Solr Internals
replicationFactor

The Solr replicationFactor has nothing to do with quorum. Solr uses ZooKeeper's quorum sensing to ensure that all Solr nodes have a consistent picture of the cluster.

openSearcher and hardCommit
- A soft commit always opens a new searcher.
- openSearcher only makes sense for hard commits.

Use config api to change solr settings dynamically

Use JSON API, but be aware SolrJ may not work with JSON API in some cases.

Solr Facet APIs
http://yonik.com/json-facet-api/
http://yonik.com/solr-facet-functions/

Don't forget facet.mincount=1

update.distrib
=TOLEADER when a replica forwards the doc to its leader
=FROMLEADER when the doc's leader sends it to its followers

Solr Nested Objects
Define _root_ field

Use [child] - ChildDocTransformerFactory to return child documents

Admin Collection APIs
/admin/collections?action=CLUSTERSTATUS

curl http://localhost:8983/solr/mycollection/update -X POST -H 'Content-Type: application/json' --data-binary @atomic.json

Zookeeper
Clean ZK data
run java -cp zookeeper-3.4.6.jar:conf org.apache.zookeeper.server.PurgeTxnLog  ../zoo_data/ ../zoo_data/ -n 3

Access SolrCloud via SSH tunnel
Create a tunnel to the ZooKeeper and Solr nodes.
- But when SolrJ queries ZooKeeper, it still gets back the external Solr node addresses, which we can't reach directly.
Workaround: add a conditional breakpoint in CloudSolrClient.sendRequest(SolrRequest, String)
- on the line just before  LBHttpSolrClient.Req req = new LBHttpSolrClient.Req(request, theUrlList);
- use the code below as the breakpoint condition: it rewrites the URL list to the tunnel endpoints and returns false, so the debugger never actually suspends.
theUrlList.clear();
theUrlList.add("http://localhost:18983/solr/searchItems/");
theUrlList.add("http://localhost:28983/solr/searchItems/");

return false;

Solr suggester
It supports filtering on multiple fields. Just copy those fields into the suggester's context field (contextField).

Troubleshooting
400 Unknown Version - when calling Solr with curl
- You may need to URL-encode the query parameters.
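A minimal sketch of encoding a query value before putting it in the URL. The class EncodeParams and the collection name are only examples, and the Charset overload of URLEncoder.encode needs Java 10 or later:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class EncodeParams {

    // Build a /select URL with the q parameter URL-encoded.
    static String selectUrl(String baseUrl, String q) {
        return baseUrl + "/select?q=" + URLEncoder.encode(q, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Brackets, colons and spaces are the usual culprits in Solr queries.
        System.out.println(selectUrl("http://localhost:8983/solr/mycollection",
                                     "price:[10 TO 20]"));
    }
}
```

With curl itself, passing -G together with --data-urlencode "q=price:[10 TO 20]" should have the same effect.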

Debug Solr Query
http://splainer.io/

APIs
http://localhost:8983/solr/admin/collections?action=LIST&wt=json

solr/admin/collections?action=OVERSEERSTATUS
overseer_queue_size
overseer_work_queue_size

admin/mbeans?key=fieldCache&stats=true

Coding
ZooKeeper watches are one-time triggers: re-register the watch after it fires.

SolrJ
*SolrClient
request(SolrRequest)
.getZkStateReader()

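GenericSolrRequest
// params: e.g. key=fieldCache&stats=true, as in the mbeans URL above
NamedList<Object> nl = solrClient.request(new GenericSolrRequest(SolrRequest.METHOD.GET, "/admin/mbeans", params));
NamedList<Object> stats = (NamedList<Object>) nl.findRecursive("solr-mbeans", "CACHE", "fieldCache", "stats");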

Use Zookeeper client
./zkCli.sh -server localhost:9983
ls path
delete path
create path data // data can be ''
get path
stat /overseer/collection-queue-work
get /overseer/collection-queue-work/qn-0001379031
Create chroot path
create /the-chroot-path []
