Solr query for matching nested/relational data

Posted by 耗尽温柔 on 2019-12-09 15:59:13

Question


I'm using Apache Solr for the matching functionality of my web app, and I've run into a problem with the following scenario:

I have three programmers. The skill field lists each one's skills, and "weight" indicates how proficient he or she is in that skill:

{
    name: "John",
    skill: [
        {name: "java", weight: 90},
        {name: "oracle", weight: 90},
        {name: "linux", weight: 70}
    ]
},
{
    name: "Sam",
    skill: [
        {name: "C#", weight: 98},
        {name: "java", weight: 75},
        {name: "oracle", weight: 70},
        {name: "tomcat", weight: 70}
    ]
},
{
    name: "Bob",
    skill: [
        {name: "oracle", weight: 90},
        {name: "java", weight: 85}
    ]
}

And I have a job that needs a programmer:

{
    name: "webapp development",
    skillRequired: [
        {name: "java", weight: 85},
        {name: "oracle", weight: 85}
    ]
}

I want to use the job's "skillRequired" to match those programmers (to find the best people for the job). In this case, the matches should be John and Bob. Sam is excluded because his java and oracle skills are not good enough. John should score higher than Bob, because he knows java better (90 versus 85; both have oracle at 90).
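The intended behaviour can be sketched in plain Python, independent of Solr. This is only an illustration of the matching rule described above. The function name `match` is hypothetical, and ranking by the sum of the matched weights is an assumption; the post only says that John should rank above Bob.

```python
# Data copied from the question.
programmers = [
    {"name": "John", "skill": [
        {"name": "java", "weight": 90},
        {"name": "oracle", "weight": 90},
        {"name": "linux", "weight": 70}]},
    {"name": "Sam", "skill": [
        {"name": "C#", "weight": 98},
        {"name": "java", "weight": 75},
        {"name": "oracle", "weight": 70},
        {"name": "tomcat", "weight": 70}]},
    {"name": "Bob", "skill": [
        {"name": "oracle", "weight": 90},
        {"name": "java", "weight": 85}]},
]
job = {"name": "webapp development", "skillRequired": [
    {"name": "java", "weight": 85},
    {"name": "oracle", "weight": 85}]}


def match(job, programmers):
    """Keep programmers who meet every required skill, best first."""
    ranked = []
    for p in programmers:
        have = {s["name"]: s["weight"] for s in p["skill"]}
        required = job["skillRequired"]
        # A missing skill counts as weight 0, so it can never qualify.
        if all(have.get(r["name"], 0) >= r["weight"] for r in required):
            # Assumed scoring rule: sum of weights over the required skills.
            score = sum(have[r["name"]] for r in required)
            ranked.append((p["name"], score))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


result = match(job, programmers)
print(result)
```

With the data above, Sam is dropped because his java weight (75) is below 85, and John ranks ahead of Bob.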

The problem is that Solr can't index nested objects. The best format I think I can get is:

name: "John",
skill-name: ["java", "oracle", "linux"],
skill-weight: [90, 90, 70]

and so on. So I don't know whether it is possible to construct a query that makes this scenario work.
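To see why the flattened format is awkward, the sketch below simulates in Python (not in Solr) what happens when the skill name and the weight are tested as two independent conditions on two multi-valued fields. The helper names `flat_match` and `paired_match` are hypothetical.

```python
# Sam's record in the flattened format from the question.
sam = {
    "name": "Sam",
    "skill-name": ["C#", "java", "oracle", "tomcat"],
    "skill-weight": [98, 75, 70, 70],
}


def flat_match(doc, skill, min_weight):
    """Two independent conditions: any name matches AND any weight matches."""
    has_skill = skill in doc["skill-name"]
    has_weight = any(w >= min_weight for w in doc["skill-weight"])
    return has_skill and has_weight


def paired_match(doc, skill, min_weight):
    """What is actually wanted: name and weight must belong to the same skill."""
    pairs = zip(doc["skill-name"], doc["skill-weight"])
    return any(n == skill and w >= min_weight for n, w in pairs)


print(flat_match(sam, "java", 85))    # matches via the C# weight of 98
print(paired_match(sam, "java", 85))  # java is only 75, so no match
```

The flat check accepts Sam for "java at 85 or above" because his C# weight of 98 satisfies the weight condition, which is the cross-matching the nested structure is meant to prevent.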

Is there a better schema structure for it? Or should I use index-time or query-time boosts?

I have read almost all of the Solr wiki and searched around with no luck. Any tips or workarounds are welcome.

Problem solved. I'm logging my solution here in case it helps:

1st, my data is in JSON format, so I need solr-4.8.0, which supports indexing nested data with JSON. If the data were in XML format, solr-4.7.2 would still work.

2nd, solr-4.8.0 needs java7-u55 (officially recommended).

3rd, nested documents/objects should be submitted to Solr under the "_childDocuments_" key. To identify the type of each parent/child document, I added a "type" field. So the example above looks like this:

    {
        type: "programmer",
        name: "John",
        _childDocuments_: [
            {type:"skill", name: "java", weight: 90},
            {type:"skill", name: "oracle", weight: 90},
            {type:"skill", name: "linux", weight: 70}
        ]
    },
    {
        type: "programmer",
        name: "Sam",
        _childDocuments_: [
            {type:"skill",name: "C#", weight: 98},
            {type:"skill", name: "java", weight: 75},
            {type:"skill", name: "oracle", weight: 70},
            {type:"skill", name: "tomcat", weight: 70}
        ]
    },
    {
        type: "programmer",
        name: "Bob",
        _childDocuments_: [
            {type:"skill", name: "oracle", weight: 90},
            {type:"skill", name: "java", weight: 85}
        ]
    }
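The reshaping from the original records into this nested form could be done with a small helper. This is a sketch under assumptions: the function name `to_solr_doc` is hypothetical, and the output mirrors the fields shown in the post only. A real schema would typically also need a unique key field on every parent and child document, which the post does not show.

```python
import json


def to_solr_doc(programmer):
    """Turn one original record into a parent doc with nested skill docs."""
    return {
        "type": "programmer",
        "name": programmer["name"],
        # Each skill becomes a child document tagged with type "skill".
        "_childDocuments_": [
            {"type": "skill", "name": s["name"], "weight": s["weight"]}
            for s in programmer["skill"]
        ],
    }


bob = {"name": "Bob", "skill": [
    {"name": "oracle", "weight": 90},
    {"name": "java", "weight": 85}]}

doc = to_solr_doc(bob)
# Strict JSON (quoted keys) in a list, ready to send as an update body.
payload = json.dumps([doc], indent=2)
print(payload)
```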

4th, after submitting and committing to Solr, I can match the job with block join queries (as filter queries):

fq={!parent which='type:programmer'}type:skill AND name:java AND weight:[85 TO *]&
fq={!parent which='type:programmer'}type:skill AND name:oracle AND weight:[85 TO *]
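The two filter queries above follow one pattern per required skill, so they could be generated from the job record. The following is a rough sketch: the function name `block_join_filters` is hypothetical, the main `q` parameter and the Solr URL are deliberately left out, and no escaping of query special characters in skill names is attempted.

```python
from urllib.parse import urlencode, parse_qs


def block_join_filters(job, parent_type="programmer", child_type="skill"):
    """Build one block join parent filter per required skill."""
    filters = []
    for req in job["skillRequired"]:
        filters.append(
            "{!parent which='type:%s'}type:%s AND name:%s AND weight:[%d TO *]"
            % (parent_type, child_type, req["name"], req["weight"])
        )
    return filters


job = {"name": "webapp development", "skillRequired": [
    {"name": "java", "weight": 85},
    {"name": "oracle", "weight": 85}]}

fqs = block_join_filters(job)
# Repeat the fq parameter once per filter and URL-encode the values.
query_string = urlencode([("fq", f) for f in fqs])

for f in fqs:
    print(f)
print(query_string)
```

Because each filter is a separate fq parameter, a programmer must satisfy all of them, which matches the "every required skill" rule from the question.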

Answer 1:


You can try BlockJoinQuery. Refer here



Source: https://stackoverflow.com/questions/23594391/solr-query-for-matching-nested-relational-data
