How to add a node to SolrCloud dynamically without SPLITSHARD?

好久不见. Submitted on 2019-12-21 05:48:08

Question


I have set up SolrCloud with 4 shards. I added 8 nodes to the SolrCloud (4 leaders and 4 replicas), each running on a different machine. Later I realized that my data is growing quickly (about 4 million files per day), so 4 shards are not sufficient, and I want to add one more shard to this SolrCloud dynamically. When I add a new node, it is created as a replica, which is not what I want. When I searched for this on Google, the answers I found say to use the Collections API SPLITSHARD command, but SPLITSHARD splits an existing shard. My requirement is to add a new shard to this SolrCloud. How can I do this?

Any suggestion will be appreciated. Thanks in advance.


Answer 1:


The answer is buried in the SolrCloud docs. See the section 'Resizing a Cluster' at https://cwiki.apache.org/confluence/display/solr/Nodes,+Cores,+Clusters+and+Leaders

Basically, the process is:

  1. Split a shard. You will now have two sub-shards on that one machine.
  2. Set up a replica of one of the new sub-shards on your new machine.
  3. Remove that sub-shard from the original machine. ZooKeeper will promote the replica on the new machine to leader for that shard.
  4. Set up another replica for that new shard.

It is a very kludgy and manual process. SolrCloud isn't very "cloudy", i.e. elastic.
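The first three steps map onto the Collections API actions SPLITSHARD, ADDREPLICA and DELETEREPLICA. The following is only a sketch of how those calls could be assembled, not a tested procedure: the base URL, collection name, node name and replica name are all hypothetical placeholders, and the real replica name has to be looked up in your cluster state. The code only builds the URLs, it does not send them.

```python
from urllib.parse import urlencode

# Hypothetical values: adjust to your own cluster.
SOLR_BASE = "http://localhost:8983/solr"
COLLECTION = "mycollection"


def collections_api_url(action, **params):
    """Build a Collections API URL for the given action and parameters."""
    query = urlencode({"action": action, "collection": COLLECTION, **params})
    return f"{SOLR_BASE}/admin/collections?{query}"


def resize_plan(shard, new_node, old_replica):
    """Return the URLs for steps 1-3, in the order they should be issued."""
    # SPLITSHARD names the two sub-shards <shard>_0 and <shard>_1.
    sub_shard = f"{shard}_1"
    return [
        # Step 1: split the existing shard in place.
        collections_api_url("SPLITSHARD", shard=shard),
        # Step 2: add a replica of one sub-shard on the new machine.
        collections_api_url("ADDREPLICA", shard=sub_shard, node=new_node),
        # Step 3: drop the sub-shard's copy on the original machine.
        collections_api_url("DELETEREPLICA", shard=sub_shard, replica=old_replica),
    ]


if __name__ == "__main__":
    # "core_node5" is a made-up replica name used only for illustration.
    for url in resize_plan("shard1", "192.168.1.21:8983_solr", "core_node5"):
        print(url)
```

Step 4 would be one more ADDREPLICA call pointing at whichever machine should hold the second copy. Each call should be allowed to finish, and the new replica should be active, before the next one is issued.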




Answer 2:


When you first create the collection, you make a very important decision: the sharding technique. Solr provides two different ways, implicit and compositeId.

If you set it to compositeId, you are asking Solr to pick the shard based on a field of your choice (the id by default). You must specify the number of shards in advance. Solr computes a 32-bit integer hash of that field, allocates a range of the 32-bit hash space to each shard, and routes each document to the shard whose range contains its hash. For example, with 4 shards, a document whose hash falls in the first quarter of the 32-bit range goes to the first shard, and so on.
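The range-based routing described above can be sketched in a few lines. This is an illustration of the idea only, not Solr's implementation: `zlib.crc32` stands in for Solr's own hash function, the hash space is treated as unsigned for simplicity, and the function names are made up for this example.

```python
import zlib

HASH_SPACE = 2 ** 32  # number of distinct 32-bit hash values


def shard_ranges(num_shards):
    """Split the 32-bit hash space into num_shards contiguous (start, end) ranges."""
    step = HASH_SPACE // num_shards
    ranges = []
    for i in range(num_shards):
        start = i * step
        # The last shard absorbs any remainder so the whole space is covered.
        end = HASH_SPACE - 1 if i == num_shards - 1 else start + step - 1
        ranges.append((start, end))
    return ranges


def route(doc_id, ranges):
    """Return the index of the shard whose range contains the document's hash."""
    h = zlib.crc32(doc_id.encode("utf-8"))  # stand-in hash, not Solr's own
    for index, (start, end) in enumerate(ranges):
        if start <= h <= end:
            return index
    raise ValueError("hash outside all ranges")


if __name__ == "__main__":
    ranges = shard_ranges(4)
    for doc_id in ["doc-1", "doc-2", "doc-3"]:
        print(doc_id, "-> shard index", route(doc_id, ranges))
```

Because the ranges are fixed when the collection is created, the same id always lands on the same shard, which is what lets Solr find a document again without a lookup table.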

With this approach you cannot change the number of shards later on, because re-dividing the hash space would break the whole structure. You can still split one shard's range into two separate sub-ranges, but you cannot simply extend the existing structure with an additional shard.
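A small calculation shows why re-cutting the hash space is disruptive while splitting is not. The sketch below assumes equal-sized unsigned ranges and treats "shard index i" as the same physical shard before and after the change; both are simplifications for illustration, and the function names are hypothetical.

```python
HASH_SPACE = 2 ** 32  # size of the 32-bit hash space


def owner(h, num_shards):
    """Shard index for hash h when the space is cut into num_shards equal ranges."""
    step = HASH_SPACE // num_shards
    return min(h // step, num_shards - 1)


def moved_fraction(old_shards, new_shards, samples=10_000):
    """Fraction of evenly spaced sample hashes that change shard after re-cutting."""
    hashes = [i * HASH_SPACE // samples for i in range(samples)]
    moved = sum(1 for h in hashes if owner(h, old_shards) != owner(h, new_shards))
    return moved / samples


def split_range(start, end):
    """Split one shard's range into two sub-ranges, leaving other shards untouched."""
    mid = start + (end - start) // 2
    return (start, mid), (mid + 1, end)


if __name__ == "__main__":
    # Going from 4 to 5 equal ranges relocates roughly half of all hash values.
    print(moved_fraction(4, 5))
    # Splitting only affects documents inside the one range being split.
    print(split_range(0, 2 ** 30 - 1))
```

Under these assumptions, going from 4 to 5 equal ranges would force about half of the existing documents onto a different shard, whereas a split only touches the shard being split.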

With the second way, implicit, you don't have to specify the number of shards in advance. Instead, you do the sharding manually in your application and provide a field that holds the name of the target shard, so Solr can route the document directly without calculating anything. This way you can create as many shards as you like in the future without affecting existing ones: you simply create a new shard by name, and your application starts tagging new documents with that name.
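As a rough sketch of what this looks like through the Collections API: a collection is created with `router.name=implicit` and a `router.field`, and further shards are added later with the CREATESHARD action. The base URL, collection name, shard names and field name below are hypothetical, and other parameters you would normally pass (config set, replication factor, and so on) are omitted. The code only builds the URLs.

```python
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # hypothetical base URL


def create_implicit_collection_url(collection, shards, route_field):
    """URL that creates a collection using the implicit router with named shards."""
    query = urlencode({
        "action": "CREATE",
        "name": collection,
        "router.name": "implicit",
        "shards": ",".join(shards),   # shard names chosen by you
        "router.field": route_field,  # document field holding the shard name
    })
    return f"{SOLR_BASE}/admin/collections?{query}"


def create_shard_url(collection, shard):
    """URL that adds a new named shard to an implicit-router collection."""
    query = urlencode({"action": "CREATESHARD", "collection": collection, "shard": shard})
    return f"{SOLR_BASE}/admin/collections?{query}"


if __name__ == "__main__":
    print(create_implicit_collection_url("files", ["files_2019", "files_2020"], "shard_name"))
    # Later, when more capacity is needed:
    print(create_shard_url("files", "files_2021"))
    # New documents would then carry the new shard's name in the routing field:
    print({"id": "doc-1", "shard_name": "files_2021"})
```

The trade-off is that the application, not Solr, decides where every document goes, so balancing data across shards becomes your responsibility.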

So, in your situation, if you already chose compositeId, you cannot add shards; you can only split existing ones. If you expect the number of shards to change a lot in the future, I'd suggest rebuilding your cloud using implicit sharding.

Check out the Solr Collections API for more details: https://cwiki.apache.org/confluence/display/solr/Collections+API



Source: https://stackoverflow.com/questions/30859799/how-to-add-a-node-to-solrcloud-dynamically-without-splitshard
