Indexing nested documents in Solr

Submitted by 孤者浪人 on 2019-12-13 12:40:15

Question


I've seen that Solr will allow you to index JSON: http://wiki.apache.org/solr/UpdateJSON

However, none of the examples are nested. Can you index something like this, and if not, how is it normally handled?

{
  "name": "ben",
  "state": "california",
  "country": "united states",
  "companies": [
    {
      "name": "google",
      "title": "software engineer"
    },
    {
      "name": "sherwin-williams",
      "title": "web developer"
    }
  ]
}

Answer 1:


There are a couple of ways to go. For storage, the nested structure can be kept as a JSON string, with serialization handled in the application layer. Elasticsearch takes this approach transparently.

For indexing, you can flatten the data using naming conventions. MongoDB uses a similar dotted syntax.

companies.name: ['google', 'sherwin-williams']
companies.title: ['software engineer', 'web developer']

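Note that in such a case a query like

<BooleanQuery: +companies.name:google +companies.title:web developer>

would match, even though no single company has both the name 'google' and the title 'web developer': each clause is checked against the whole multi-valued field. If the position should matter, a more advanced SpanQuery would have to be used.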
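To make the cross-matching concrete, here is a minimal Python sketch of the flattening described above. The helper names `flatten_multivalued` and `matches_all` are hypothetical, and `matches_all` is only a naive stand-in for a BooleanQuery of required clauses, not how Lucene or Solr actually evaluates queries.

```python
def flatten_multivalued(doc):
    """Turn lists of sub-documents into dotted, multi-valued fields."""
    flat = {}
    for key, value in doc.items():
        if value and isinstance(value, list) and all(isinstance(v, dict) for v in value):
            for child in value:
                for child_key, child_value in child.items():
                    flat.setdefault(f"{key}.{child_key}", []).append(child_value)
        else:
            flat[key] = value
    return flat


def matches_all(flat_doc, required):
    """Naive stand-in for a BooleanQuery made of required term clauses.

    Each clause only checks that the term occurs somewhere in the field,
    so values coming from different sub-documents can satisfy it.
    """
    for field, term in required.items():
        values = flat_doc.get(field, [])
        if not isinstance(values, list):
            values = [values]
        if not any(term in str(v) for v in values):
            return False
    return True


person = {
    "name": "ben",
    "state": "california",
    "country": "united states",
    "companies": [
        {"name": "google", "title": "software engineer"},
        {"name": "sherwin-williams", "title": "web developer"},
    ],
}

flat = flatten_multivalued(person)
# 'google' and 'web developer' belong to different companies, yet both clauses pass.
cross_match = matches_all(
    flat, {"companies.name": "google", "companies.title": "web developer"}
)
print(flat["companies.name"])
print(cross_match)
```

The flattened fields no longer record which name belongs to which title, which is why the query matches.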




Answer 2:


I had the same issue. We wanted to index complicated JSON documents with arrays and maps in Solr (much more complicated than the example you posted).

In the end I modified the JsonLoader class to accept this kind of document. It flattens the JSON structure so that the individual fields can be indexed, and it also keeps the original JSON structure (see companies_json below). It supports deep nesting as well.

You can find the source code, with some explanation, at

http://www.solrfromscratch.com/2014/08/20/embedded-documents-in-solr/

For your example, it will store/index (depending on how you configure the fields) the following structure:

name: 'ben',
state: 'california',
country: 'united states',
companies.0.name: 'google',
companies.0.title: 'software engineer',
companies.1.name: 'sherwin-williams',
companies.1.title: 'web developer',
companies_json: [
  {
    name: 'google',
    title: 'software engineer',
  },
  {
    name: 'sherwin-williams',
    title: 'web developer'
  }
]
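As a rough illustration of that output shape, the sketch below does the same kind of flattening on the client side in Python. It is not the modified JsonLoader (which runs inside Solr); the function names `flatten_indexed` and `to_solr_fields` are hypothetical, and the `_json` suffix simply follows the listing above.

```python
import json


def flatten_indexed(value, prefix=""):
    """Recursively flatten dicts and lists into dotted paths such as companies.0.name."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)
    else:
        # Scalar leaf: the accumulated path becomes the field name.
        return {prefix: value}
    flat = {}
    for key, child in items:
        path = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten_indexed(child, path))
    return flat


def to_solr_fields(doc):
    """Flatten a document and keep each nested top-level value as a JSON string."""
    fields = flatten_indexed(doc)
    for key, value in doc.items():
        if isinstance(value, (dict, list)):
            fields[f"{key}_json"] = json.dumps(value)
    return fields


person = {
    "name": "ben",
    "state": "california",
    "country": "united states",
    "companies": [
        {"name": "google", "title": "software engineer"},
        {"name": "sherwin-williams", "title": "web developer"},
    ],
}

fields = to_solr_fields(person)
for field_name, field_value in fields.items():
    print(f"{field_name}: {field_value!r}")
```

Because the array index is part of the field name, name and title of the same company stay paired, at the cost of needing dynamic or wildcard field definitions in the schema.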

M.




Answer 3:


Nested JSON can be indexed with the help of child documents in Solr. The block join query parsers can then be used to query them.

Refer to this question
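A minimal sketch of what this could look like, in Python, for the document from the question. It only builds the update payload and a query string and does not talk to a Solr server. The `_childDocuments_` key and the `{!parent which=...}` block join parser are part of Solr; the field names `doc_type`, `company_name` and `company_title`, the id scheme, and the helper `to_block` are assumptions made for illustration and would have to match your own schema.

```python
import json


def to_block(doc, child_key, doc_id):
    """Build a parent document with nested children for Solr's JSON update format."""
    parent = {k: v for k, v in doc.items() if k != child_key}
    parent["id"] = doc_id
    parent["doc_type"] = "person"  # hypothetical marker field identifying parents
    parent["_childDocuments_"] = [
        {
            "id": f"{doc_id}-company-{i}",  # each child needs its own unique id
            "doc_type": "company",
            "company_name": child["name"],
            "company_title": child["title"],
        }
        for i, child in enumerate(doc[child_key])
    ]
    return parent


person = {
    "name": "ben",
    "state": "california",
    "country": "united states",
    "companies": [
        {"name": "google", "title": "software engineer"},
        {"name": "sherwin-williams", "title": "web developer"},
    ],
}

block = to_block(person, "companies", "ben-1")
payload = json.dumps([block], indent=2)  # body for a JSON update request
print(payload)

# Block join: return parents that have at least one child matching BOTH clauses.
parent_query = (
    '{!parent which="doc_type:person"}'
    '+company_name:google +company_title:"web developer"'
)
print(parent_query)
```

Since both clauses must match within the same child document, this query should not return 'ben', unlike a query over flattened multi-valued fields.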



Source: https://stackoverflow.com/questions/9983061/indexing-nested-documents-in-solr
