Copy Solr HDFS Data to another Cluster

Posted by 十年热恋 on 2019-12-10 20:07:33

Question


I have a SolrCloud (v4.10) installation that sits on top of Cloudera (CDH 5.4.2) HDFS, with 3 Solr instances each hosting a shard of each core. I am looking for a way to incrementally copy the Solr data from our production cluster to our development cluster. There are 3 cores, but I am only interested in copying one of them.

I have tried Solr replication (backup and restore), but that doesn't seem to load anything into the dev cluster.

http://host:8983/solr/core/replication?command=backup&location=/solr_transfer&name=core-name
http://host:8983/solr/core/replication?command=restore&location=/solr_transfer&name=core-name

I also tried to snapshot the /solr directory in the prod cluster's HDFS and use hadoop distcp to copy the files, but the Solr indexer deletes some of the files, so the distcp job fails.

hadoop distcp hftp://prod:50070/solr/* hdfs://dev:8020/solr/

Can anyone help me here?


Answer 1:


Follow the steps below to create a snapshot of the Solr HDFS folder and copy it to another cluster.

1. Allow snapshots on the collection directory

sudo -u hdfs hdfs dfsadmin -allowSnapshot /user/solr/SolrCollectionName

2. Create a snapshot with a specific name

sudo -u hdfs hdfs dfs -createSnapshot /user/solr/SolrCollectionName snapshotName

3. List the snapshot directory

hdfs dfs -ls /user/solr/SolrCollectionName/.snapshot

4. Copy the snapshot to the other cluster

sudo -u solr hadoop distcp hdfs://NNIP1:8020/user/solr/SolrCollectionName/.snapshot/snapshotName hdfs://NNIP2:8020/user/solr

5. Restore the snapshot contents into the collection directory on the destination cluster

sudo -u solr hadoop fs -cp /user/solr/snapshotName/* /user/solr/SolrCollectionName/
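
The five steps can be collected into one script. The following is only a sketch: NNIP1, NNIP2, SolrCollectionName and snapshotName are the placeholders from the steps above, while the run helper, the DRY_RUN switch and the two function names are hypothetical and introduced just for illustration. By default it only prints the commands, so they can be reviewed before anything touches HDFS.

```shell
#!/usr/bin/env bash
# Sketch of steps 1-5. All values below are placeholders (assumptions).
SRC_NN="hdfs://NNIP1:8020"                     # source NameNode
DST_NN="hdfs://NNIP2:8020"                     # destination NameNode
COLLECTION_DIR="/user/solr/SolrCollectionName"
SNAPSHOT="snapshotName"
DRY_RUN="${DRY_RUN:-1}"                        # 1 = only print the commands

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Steps 1-4: run against the source cluster.
snapshot_and_copy() {
  run sudo -u hdfs hdfs dfsadmin -allowSnapshot "$COLLECTION_DIR"
  run sudo -u hdfs hdfs dfs -createSnapshot "$COLLECTION_DIR" "$SNAPSHOT"
  run hdfs dfs -ls "$COLLECTION_DIR/.snapshot"
  run sudo -u solr hadoop distcp \
    "$SRC_NN$COLLECTION_DIR/.snapshot/$SNAPSHOT" "$DST_NN/user/solr"
}

# Step 5: run on the destination cluster, where the default filesystem
# is the destination HDFS. The glob is quoted so that HDFS expands it,
# not the local shell.
restore_on_destination() {
  run sudo -u solr hadoop fs -cp "/user/solr/$SNAPSHOT/*" "$COLLECTION_DIR/"
}

snapshot_and_copy
restore_on_destination
```

Running it with DRY_RUN=0 would execute the commands instead of printing them. In that case the two functions would normally be called on different hosts, since step 5 belongs on the destination cluster.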



Answer 2:


After a lot of trying, this is the solution we worked out:

- Initialise Solr in the second environment with all the collections set up the same way as in the primary.
- Take a snapshot of HDFS.
- Use hdfs dfs -cp to copy the data up to the checkpoint.

After the first run the copy job will be quick, as you are only copying the increments.
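
A repeatable version of this snapshot-then-copy run might look like the sketch below. It is not the exact procedure from this answer. It substitutes hadoop distcp -update for the plain copy, because -update skips files that already match on the destination, which is one way to get the "only the increments" behaviour. The hosts prod and dev come from the question. The /solr/core path, the copy-<date> snapshot naming, the 8020 port on prod, and the run, snapshot_name and incremental_copy helpers are all assumptions made for illustration. Snapshots must already be allowed on the directory (hdfs dfsadmin -allowSnapshot).

```shell
#!/usr/bin/env bash
# Sketch of a repeatable snapshot-then-copy run.
# Hosts, paths and the snapshot naming scheme are assumptions.
SRC_NN="hdfs://prod:8020"
DST_NN="hdfs://dev:8020"
CORE_DIR="/solr/core"            # hypothetical directory of the one core to copy
DRY_RUN="${DRY_RUN:-1}"          # 1 = only print the commands

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Build a snapshot name from a date stamp, e.g. copy-20191210.
snapshot_name() {
  echo "copy-$1"
}

# Take a fresh snapshot on the source, then copy its contents into the
# destination directory; -update skips files that already match there.
incremental_copy() {
  local snap
  snap="$(snapshot_name "$1")"
  run hdfs dfs -createSnapshot "$CORE_DIR" "$snap"
  run hadoop distcp -update \
    "$SRC_NN$CORE_DIR/.snapshot/$snap" "$DST_NN$CORE_DIR"
}

incremental_copy "$(date +%Y%m%d)"
```

Because the copy reads from the read-only .snapshot path, files that the indexer deletes from the live directory during the run do not disappear from under the copy job.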



Source: https://stackoverflow.com/questions/34140384/copy-solr-hdfs-data-to-another-cluster
