Querying binary fields in Solr

可紊 submitted on 2019-12-11 02:59:50

Question


I'm using Solr to index records consisting of binary fields. I've specified the fields in schema.xml as such:

<field name="id" type="binary" indexed="true" stored="true" required="true" multiValued="false" />

I'm able to add records to the index via a POST request, encoding and sending the fields as Base64 strings. The collection's data directory keeps growing, so I know something is being stored. However, a match-all query (q=*:*) strangely reports documents found but returns none, e.g.:

"response": {  
  "numFound": 364047,
  "start": 0,
  "maxScore": 1,
  "docs": []
}
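
For reference, the indexing step described above (Base64-encoding a binary value and POSTing it to Solr) might look roughly like the sketch below. The host, port and collection name (`mycollection`) are assumptions, as is the choice of Solr's JSON update format; the request is only built here, not sent.

```python
import base64
import json
import urllib.request

# Assumed values: host, port and collection name are placeholders.
SOLR_UPDATE_URL = "http://localhost:8983/solr/mycollection/update?commit=true"


def build_update_request(raw_id: bytes, url: str = SOLR_UPDATE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON update request with a Base64-encoded binary id."""
    # Binary field values travel as Base64 text inside the JSON document.
    doc = {"id": base64.b64encode(raw_id).decode("ascii")}
    body = json.dumps([doc]).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_update_request(b"\x00\x01\x02\xff")
# urllib.request.urlopen(request) would send it to a running Solr instance.
```

The payload is a JSON array containing a single document whose `id` is the Base64 text of the raw bytes.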

Has anybody any idea what's causing this or how it can be resolved?
Thanks


Answer 1:


Short answer: it cannot be solved.

The Solr reference documentation says very little about the BinaryField type:

Class: BinaryField

Description: Binary data.

As it stands, BinaryField is intended only for storing binary data, nothing more and nothing less. There is an open issue to change this, but it has not attracted much attention yet.

My personal assumption is that the reason lies in the fact that binary data is rarely just plain bytes. Most of the time it is an elaborate file format that requires special interpretation. A separate Apache project, Apache Tika, exists for this task.

Plenty of good articles and tutorials on taming this beast are spread all over the web. The reference documentation is also a good starting point for integrating Tika with Solr (1, 2).
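
As a rough illustration of the Tika route: Solr's extraction request handler (Solr Cell) accepts a raw file and lets Tika parse it before indexing. The sketch below only builds such a request. It assumes the handler is enabled at `/update/extract` in your configuration, and the host, collection name, document id and payload are all placeholders.

```python
import urllib.parse
import urllib.request

# Assumed values: host, port and collection name are placeholders, and the
# extraction handler is assumed to be configured at /update/extract.
SOLR_EXTRACT_URL = "http://localhost:8983/solr/mycollection/update/extract"


def build_extract_request(doc_id: str, payload: bytes, content_type: str,
                          base_url: str = SOLR_EXTRACT_URL) -> urllib.request.Request:
    """Build (but do not send) a request that hands a raw file to Solr Cell."""
    # literal.id sets the id field of the resulting document.
    params = urllib.parse.urlencode({"literal.id": doc_id, "commit": "true"})
    return urllib.request.Request(
        f"{base_url}?{params}",
        data=payload,
        headers={"Content-Type": content_type},
        method="POST",
    )


# Placeholder bytes standing in for a real file's contents.
extract_request = build_extract_request("doc1", b"%PDF-1.4 placeholder bytes", "application/pdf")
# urllib.request.urlopen(extract_request) would send it to a running Solr instance.
```

With this approach the extracted text is what gets indexed and searched, so the original bytes do not have to be queried through a BinaryField at all.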



Source: https://stackoverflow.com/questions/32484920/querying-binary-fields-in-solr
