How to index data in a specific shard using solrj

六眼飞鱼酱① submitted on 2020-01-23 13:35:13

Question


I am using SolrJ as the client to index documents into SolrCloud (Solr 4.5).

I need to store documents based on tenant_id, so I am trying to use document routing, which is possible only if the collection is created with the numShards parameter (http://searchhub.org/2013/06/13/solr-cloud-document-routing/).
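As a rough illustration of the routing described above: with Solr's default compositeId router, the text before the "!" in a document id is hashed to choose a shard, so it acts as a routing key (for example a tenant id) and is not a shard name. The sketch below only builds and splits such ids in plain Java; the class and method names (CompositeIds, compose, routeKey) are hypothetical helpers for illustration and are not part of SolrJ.

```java
public final class CompositeIds {
    // Separator used by the compositeId router between routing key and document id.
    private static final char SEPARATOR = '!';

    private CompositeIds() {}

    /** Builds an id of the form "routeKey!docId", e.g. "tenant42!513". */
    public static String compose(String routeKey, String docId) {
        if (routeKey == null || routeKey.isEmpty() || routeKey.indexOf(SEPARATOR) >= 0) {
            throw new IllegalArgumentException("route key must be non-empty and must not contain '!'");
        }
        return routeKey + SEPARATOR + docId;
    }

    /** Returns the part before the first '!', or null when the id has no routing prefix. */
    public static String routeKey(String compositeId) {
        int idx = compositeId.indexOf(SEPARATOR);
        return idx < 0 ? null : compositeId.substring(0, idx);
    }

    public static void main(String[] args) {
        String id = compose("tenant42", "513"); // hypothetical tenant id and document id
        System.out.println(id);
        System.out.println(routeKey(id));
    }
}
```

Documents whose ids share the same prefix are placed on the same shard, so using tenant_id as the prefix keeps each tenant's documents together. Which shard that is depends on the hash of the prefix, so an id such as shard1!513 does not by itself force the document onto the shard named shard1.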

I have two Solr instances in SolrCloud (example1/solr and example2/solr) and an external ZooKeeper running on port 2181.

Both instances contain a collection called collection1.

I created one more collection called newCollection (with two shards and two replicas) using http://localhost:8501/solr/admin/collections?action=CREATE&name=newCollection&numShards=2&replicationFactor=2&maxShardsPerNode=2&router.field=id
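A Collections API call carries its parameters in a query string, so the path /admin/collections is followed by "?" and the parameters are joined with "&". A minimal sketch of assembling such a URL in plain Java follows; the class name CreateCollectionUrl and its build method are hypothetical, and the base URL and parameter values are simply the ones quoted in this question.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public final class CreateCollectionUrl {
    private CreateCollectionUrl() {}

    /** Appends /admin/collections and the URL-encoded parameters to the Solr base URL. */
    public static String build(String solrBaseUrl, Map<String, String> params) {
        StringBuilder url = new StringBuilder(solrBaseUrl).append("/admin/collections");
        char separator = '?'; // first parameter follows '?', the rest follow '&'
        for (Map.Entry<String, String> entry : params.entrySet()) {
            url.append(separator)
               .append(encode(entry.getKey()))
               .append('=')
               .append(encode(entry.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("action", "CREATE");
        params.put("name", "newCollection");
        params.put("numShards", "2");
        params.put("replicationFactor", "2");
        params.put("maxShardsPerNode", "2");
        params.put("router.field", "id");
        System.out.println(build("http://localhost:8501/solr", params));
    }
}
```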

So in example1/solr I have newCollection_shard1_replica1 and newCollection_shard2_replica1,

and in example2/solr I have newCollection_shard1_replica2 and newCollection_shard2_replica2.

I copied example1/solr/collection1/conf to all shards and replicas.

I restarted the ZooKeeper server as well as the Solr instances:

zookeeper->zkServer.cmd

example1/solr-> java -Dbootstrap_confdir=./solr/newCollection_shard1_replica1/conf -Dcollection.configName=myconf -DzkHost=localhost:2181 -jar start.jar

example2/solr->java -DzkHost=localhost:2181 -jar start.jar

(The two instances run on different ports: one on 8081 and the other on 8051.)


I am using the SolrJ client to index documents.

Here is my sample code:

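import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer;
import org.apache.solr.common.SolrInputDocument;

String url = "http://localhost:8081/solr";
// Queue size 10000, 4 background threads
ConcurrentUpdateSolrServer solrServer = new ConcurrentUpdateSolrServer(url, 10000, 4);
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "shard1!513");
doc.addField("name", "Santhosh");
solrServer.add(doc);
solrServer.commit();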

But the document is saved in collection1 with id shard1!513. Are any configuration changes required in solrconfig.xml? (I am using the default solrconfig.xml that came with Solr 4.5.)

How do I save documents in newCollection, and how do I do document routing?

Please help me out with this issue.

Thanks!


Answer 1:


You can use CloudSolrServer and UpdateRequest:

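import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.request.UpdateRequest;
import org.apache.solr.common.SolrInputDocument;

SolrServer solrServer = new CloudSolrServer(zkHost); // zkHost is your ZooKeeper host string, e.g. "localhost:2181"
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "shard1!513"); // same id as in the question
UpdateRequest add = new UpdateRequest();
add.add(doc);
add.setParam("collection", "newCollection"); // target collection
add.process(solrServer);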

UpdateRequest commit = new UpdateRequest();
commit.setAction(UpdateRequest.ACTION.COMMIT, true, true);
commit.setParam("collection", "newCollection");
commit.process(solrServer);



Answer 2:


I appended the core name of the new collection to the URL, and it is working fine now.

Instead of:

String url = "http://localhost:8081/solr";

I used:

String url = "http://localhost:8081/solr/newCollection_shard1_replica1";
ConcurrentUpdateSolrServer solrServer = new ConcurrentUpdateSolrServer(url, 10000, 4);
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "shard1!513");
doc.addField("name", "Santhosh");
solrServer.add(doc);
solrServer.commit();



Answer 3:


You should use CloudSolrServer: http://lucene.apache.org/solr/4_2_1/solr-solrj/org/apache/solr/client/solrj/impl/CloudSolrServer.html

In SolrCloud, the client should learn the cluster state from ZooKeeper, because ZooKeeper knows the status of the leaders in the cloud; CloudSolrServer does this for you. Also, you do not need to append the collection name to the URL. Just call CloudSolrServer's setDefaultCollection(collectionName) method to send your updates to the collectionName collection.



Source: https://stackoverflow.com/questions/19479081/how-to-index-data-in-a-specific-shard-using-solrj
