SOLR - Best approach to import 20 million documents from csv file

有刺的猬 2020-12-29 09:29

My current task is to figure out the best approach to load millions of documents into Solr. The data file is a CSV export from a database.

Currently, I am t

5 Answers
  •  余生分开走
    2020-12-29 10:12

    In Solr 4.0 (currently in BETA), CSV files in a local directory can be imported directly using the UpdateHandler. Modifying the example from the Solr Wiki:

    curl "http://localhost:8983/solr/update?stream.file=exampledocs/books.csv&stream.contentType=text/csv;charset=utf-8"
    

    This streams the file from its local location on the Solr server, so there is no need to chunk it up and POST it over HTTP.
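
A minimal sketch of how the request above could be assembled in a script. The helper name `solr_csv_url` is hypothetical, and appending `commit=true` is an addition not present in the answer (it asks Solr to commit once the import finishes). The sketch assumes the file path needs no URL-encoding and that remote streaming is enabled in `solrconfig.xml`, which `stream.file` generally requires. The URL must be quoted when handed to curl; otherwise the shell treats `&` and `;` as command separators and the parameters after them never reach Solr.

```shell
#!/bin/sh
# Hypothetical helper: build a Solr CSV-import URL for a file that is
# local to the Solr server. Assumes the path needs no URL-encoding.
solr_csv_url() {
  base="$1"   # Solr base URL, e.g. http://localhost:8983/solr
  file="$2"   # file path as seen by the Solr server
  # commit=true is an assumption added here, not part of the original answer
  printf '%s/update?stream.file=%s&stream.contentType=text/csv;charset=utf-8&commit=true\n' "$base" "$file"
}

url=$(solr_csv_url "http://localhost:8983/solr" "exampledocs/books.csv")
echo "$url"

# To run the import, keep the URL quoted:
# curl "$url"
```

For a 20-million-row file, it may be worth trying this on a small sample first to confirm the column mapping before streaming the full export.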
