My current task is to figure out the best approach for loading millions of documents into Solr. The data file is a CSV export from a database.
Currently, I am t
Definitely load these into a regular database first. There are all sorts of tools for dealing with CSVs (for example, PostgreSQL's COPY), so that part should be easy. From there, the Data Import Handler is also pretty simple to set up, so this seems like the most friction-free way to load your data. It should also be faster, since you avoid unnecessary network/HTTP overhead.
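
To make the first step concrete, here is a rough sketch of staging a CSV export in a database in batches. It is only an illustration: it uses Python's built-in sqlite3 module as a stand-in so it runs without a database server (with PostgreSQL, COPY would do the same job in a single statement), and the `docs` table, its `id`/`title` columns, and the `load_csv` helper are all made-up names, not anything from your export.

```python
import csv
import io
import sqlite3


def load_csv(conn, csv_file, batch_size=10_000):
    """Stream rows from a CSV file object into a hypothetical docs table."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row (assumes the export has one)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS docs (id INTEGER PRIMARY KEY, title TEXT)"
    )
    insert = "INSERT INTO docs (id, title) VALUES (?, ?)"
    total = 0
    batch = []
    for row in reader:
        # Assumed layout: first column is a numeric id, second is the title.
        batch.append((int(row[0]), row[1]))
        if len(batch) >= batch_size:
            conn.executemany(insert, batch)
            total += len(batch)
            batch = []
    if batch:  # flush the final partial batch
        conn.executemany(insert, batch)
        total += len(batch)
    conn.commit()
    return total


# Tiny in-memory sample standing in for the real export file.
sample = io.StringIO("id,title\n1,first doc\n2,second doc\n3,third doc\n")
conn = sqlite3.connect(":memory:")
loaded = load_csv(conn, sample, batch_size=2)
print(f"loaded {loaded} rows")
```

Reading the file as a stream and inserting in batches keeps memory use flat no matter how many rows the export has. Once the rows are in a table, the Data Import Handler can be pointed at it with an ordinary SQL query.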