问题
I'm having trouble using remote streaming with Apache Solr.
We previously had Solr running on the same server where the files to be indexed are located so all we had to to was pass it the path of the file we wanted to index.
We used something like this:
stream.file=/path/to/file.pdf
This worked fine. We have now moved Solr so that it runs on a different server to the website that uses it. This was because it was using up too many resources.
I'm now using the following to point Solr in the direction of the file:
stream.file=http://www.remotesite.com/path/to/file.pdf
When I do this Solr reports the following error:
http:/www.remotesite.com/path/to/file.pdf (No such file or directory)
Note that it is stripping one of the slashes from http://.
How can I get Solr to index a file at a certain URL like i'm trying to do above? The enableRemoteStreaming parameter is already set to true.
Thank you
回答1:
For remote streaming
you would need to enable remote streaming
<requestParsers enableRemoteStreaming="true" multipartUploadLimitInKB="2048" />
and probably use stream.url for urls
If remote streaming is enabled and URL content is called for during request handling, the contents of each stream.url and stream.file parameters are fetched and passed as a stream.
来源:https://stackoverflow.com/questions/16857923/remote-streaming-with-solr