Keep Solr slaves in sync

Posted by 末鹿安然 on 2019-12-10 11:18:39

Question


We have a master-slave setup running Solr 6.5.0. There is a backend process running 24/7 which pushes its data towards the master server. No commit is done on master. The web frontend is accessing the slave. Replication poll interval is 1 hour.

All is fine so far, but as traffic grows, the CPU load on the slave is getting really high. I thought the best thing would be to add a second slave to the master and let the web servers connect to the two Solr slave machines via the existing load balancers. I assume the two Solr slaves will handle their replication independently, so each slave will poll the master at a different time.

As the master receives new data 24/7, I'm worried that the two machines will not have the same data set/version. Is there a low-administration way to force both slaves to poll new data from the master at the same time? (I.e. I'm trying to avoid setting up a real Solr cluster, as multiple slaves will fit our needs.)


Answer 1:


The problem is that your slaves can be out of sync with each other for up to one poll interval, which in your case is 1 hour.

The lowest-effort fix is to force replication on all slaves at the same time by calling the following command on each of them:

http://slave_host:port/solr/core_name/replication?command=fetchindex

However, I'm not sure how often you can call this command; most likely you can't do it every minute or so.
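As a minimal sketch of calling that command on every slave at roughly the same moment, the script below sends the `fetchindex` request to each slave in parallel. The host names in `SLAVES`, the core name in `CORE`, and the helper names `fetchindex_url`, `trigger`, and `trigger_all` are all hypothetical and only serve the illustration; the URL pattern itself is the one shown above.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

# Hypothetical slave addresses and core name; replace with your own.
SLAVES = ["slave1.example.com:8983", "slave2.example.com:8983"]
CORE = "core_name"


def fetchindex_url(host_port, core):
    """Build the fetchindex URL for one slave (pattern from the answer)."""
    return f"http://{host_port}/solr/{core}/replication?command=fetchindex"


def trigger(url, opener=urlopen, timeout=10):
    """Send the command to one slave; return (url, HTTP status or error)."""
    try:
        with opener(url, timeout=timeout) as resp:
            return url, resp.status
    except OSError as exc:  # urllib's URLError and HTTPError derive from OSError
        return url, exc


def trigger_all(slaves, core, opener=urlopen):
    """Send fetchindex to all slaves in parallel; results keep input order."""
    urls = [fetchindex_url(s, core) for s in slaves]
    with ThreadPoolExecutor(max_workers=max(1, len(urls))) as pool:
        return list(pool.map(lambda u: trigger(u, opener), urls))
```

Calling `trigger_all(SLAVES, CORE)` from a scheduled job (for example cron) would start the fetch on both slaves within a fraction of a second of each other. A successful status only shows that each slave accepted the request; how long each one then takes to copy the index can still differ.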

Another possibility is to trigger replication whenever a commit is performed on the master index. You can do this by adding the following to the master section of the replication handler configuration:

<str name="replicateAfter">commit</str>
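For context, here is a sketch of where that line usually sits in `solrconfig.xml`, assuming the standard `/replication` request handler. The `master_host`, `port`, and `core_name` values are placeholders, and the `01:00:00` poll interval simply mirrors the 1 hour mentioned in the question.

```xml
<!-- On the master: make a new index version replicable after each commit -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
  </lst>
</requestHandler>

<!-- On each slave: placeholder master URL, poll interval in HH:mm:ss -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master_host:port/solr/core_name/replication</str>
    <str name="pollInterval">01:00:00</str>
  </lst>
</requestHandler>
```

Note that slaves still pull on their own schedule, so a shorter `pollInterval` on both slaves narrows the window in which they can differ.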

For more information about it take a look here




Answer 2:


Traditional master-slave replication is basically rsync over HTTP. So you may be able to rsync between the slaves (and reload the cores after the rsync).



Source: https://stackoverflow.com/questions/47771564/keep-solr-slaves-in-sync
