Elasticsearch relationship mappings (one to one and one to many)

Submitted by 天大地大妈咪最大 on 2019-12-17 15:35:58

Question


In my Elasticsearch server I have one index, http://localhost:9200/blog.
The blog index contains multiple types.

e.g.: http://localhost:9200/blog/posts, http://localhost:9200/blog/tags.

In the tags type I have created more than 1000 tags, and in the posts type I have created 10 posts.

e.g.: posts

{   
    "_index":"blog",
    "_type":"posts",
    "_id":"1",
    "_version":3,
    "found":true,
    "_source" : {
        "catalogId" : "1",
        "name" : "cricket",
        "url" : "http://www.wikipedia/cricket"
    }
}

e.g.: tags

{   
    "_index":"blog",
    "_type":"tags",
    "_id":"1",
    "_version":3,
    "found":true,
    "_source" : {
        "tagId" : "1",
        "name" : "game"
    }
}

I want to assign existing tags to blog posts (i.e. a relationship expressed through the mapping).

How do I map the tags to the posts?


Answer 1:


There are 4 approaches that you can use within Elasticsearch for managing relationships. They are very well outlined in the Elasticsearch blog post "Managing Relations Inside Elasticsearch". I would recommend reading the entire article to get more details on each approach, and then selecting the approach that best meets your business needs while remaining technically appropriate.

Here are the highlights for the 4 approaches.

Inner Object

  • Easy, fast, performant
  • Only applicable when one-to-one relationships are maintained
  • No need for special queries
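
The one-to-one restriction follows from how object fields are indexed: they are flattened into dotted field names, so an array of objects loses the pairing between the fields of each element. The sketch below only simulates that flattening in plain Python. The `flatten` helper, the `tag` field name, and the second tag ("sport") are assumptions made for illustration and are not part of any Elasticsearch API.

```python
import json


def flatten(source, prefix=""):
    """Simulate object flattening: dotted field names mapped to value lists."""
    flat = {}
    for key, value in source.items():
        path = prefix + key
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, dict):
                # Recurse into the inner object and merge its dotted fields.
                for sub_path, sub_values in flatten(item, path + ".").items():
                    flat.setdefault(sub_path, []).extend(sub_values)
            else:
                flat.setdefault(path, []).append(item)
    return flat


# One-to-one: a single embedded tag keeps tagId and name unambiguous.
post_one_tag = {"name": "cricket", "tag": {"tagId": "1", "name": "game"}}

# One-to-many: after flattening, nothing records which tagId went with which name.
post_many_tags = {
    "name": "cricket",
    "tag": [{"tagId": "1", "name": "game"}, {"tagId": "2", "name": "sport"}],
}

flat_one = flatten(post_one_tag)
flat_many = flatten(post_many_tags)
print(json.dumps(flat_many, indent=2))
```

With several tags, a search for tagId "1" together with name "sport" would match this post even though no single tag has both values, which is the problem the nested type addresses.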

Nested

  • Nested docs are stored in the same Lucene block as each other, which helps read/query performance. Reading a nested doc is faster than the equivalent parent/child.
  • Updating a single field in a nested document (parent or nested children) forces ES to reindex the entire nested document. This can be very expensive for large nested docs
  • “Cross referencing” nested documents is impossible
  • Best suited for data that does not change frequently
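
As a rough sketch of the nested approach, the request bodies could be built as plain Python dictionaries like this. The `tags` field on the post is an assumed name, and the `text`/`keyword` field types assume Elasticsearch 5.x or later (older releases used `string`). How the `properties` fragment is wrapped inside `mappings` also differs between versions, so only the fragment is shown.

```python
import json

# Assumed mapping fragment for the posts type: each post embeds its tags
# as nested documents so that tagId and name stay paired per tag.
nested_properties = {
    "name": {"type": "text"},
    "tags": {
        "type": "nested",
        "properties": {
            "tagId": {"type": "keyword"},
            "name": {"type": "text"},
        },
    },
}


def nested_tag_query(tag_id, tag_name):
    """Build a query matching posts where ONE nested tag has both values."""
    return {
        "query": {
            "nested": {
                "path": "tags",
                "query": {
                    "bool": {
                        "must": [
                            {"term": {"tags.tagId": tag_id}},
                            {"match": {"tags.name": tag_name}},
                        ]
                    }
                },
            }
        }
    }


query_body = nested_tag_query("1", "game")
print(json.dumps(query_body, indent=2))
```

Because the tags live inside the post document, adding or renaming a tag means reindexing the whole post, which is the update cost mentioned above.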

Parent/Child

  • Children are stored separately from the parent, but are routed to the same shard. So parent/children are slightly less performant on read/query than nested
  • Parent/child mappings have a bit of extra memory overhead, since ES maintains a “join” list in memory
  • Updating a child doc does not affect the parent or any other children, which can potentially save a lot of indexing on large docs
  • Sorting/scoring can be difficult with Parent/Child since the Has Child/Has Parent operations can be opaque at times
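
A minimal sketch of the parent/child approach for the blog index follows, assuming the pre-6.x `_parent` mapping syntax that goes with a multi-type index like the one in the question (6.x and later replaced `_parent` with the `join` field type). It also assumes each tag document is indexed as a child of exactly one post, so a tag shared by several posts would need one child document per post. The helper names are hypothetical.

```python
import json
from urllib.parse import urlencode

# Mapping for the child type: tags declare posts as their parent type.
child_mapping = {"tags": {"_parent": {"type": "posts"}}}


def child_index_path(index, child_type, child_id, parent_id):
    """Build the path for indexing a child; the parent id sets the routing."""
    return f"/{index}/{child_type}/{child_id}?" + urlencode({"parent": parent_id})


def posts_with_tag(tag_name):
    """Build a has_child query returning posts that have a matching child tag."""
    return {
        "query": {
            "has_child": {
                "type": "tags",
                "query": {"match": {"name": tag_name}},
            }
        }
    }


path = child_index_path("blog", "tags", "1", "1")
query_body = posts_with_tag("game")
print(path)
print(json.dumps(query_body, indent=2))
```

Updating a tag here touches only the child document, at the cost of the join work done at query time.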

Denormalization

  • You get to manage all the relations yourself!
  • Most flexible, most administrative overhead
  • May be more or less performant depending on your setup
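
Denormalization moves the relationship into application code. The sketch below uses in-memory dictionaries in place of the two types. The helper names and the second tag ("sport") are assumptions, and in a real setup the resulting documents would still have to be written back to Elasticsearch.

```python
def denormalize(post, tags_by_id, tag_ids):
    """Return a copy of the post with the chosen tags copied into it."""
    enriched = dict(post)
    enriched["tags"] = [dict(tags_by_id[tag_id]) for tag_id in tag_ids]
    return enriched


def posts_needing_update(posts, tag_id):
    """List the posts holding a copy of the tag, e.g. after the tag is renamed."""
    return [
        post_id
        for post_id, post in posts.items()
        if any(tag["tagId"] == tag_id for tag in post.get("tags", []))
    ]


tags_by_id = {
    "1": {"tagId": "1", "name": "game"},
    "2": {"tagId": "2", "name": "sport"},
}
post = {"catalogId": "1", "name": "cricket", "url": "http://www.wikipedia/cricket"}

# Copy the tag data into the post document before indexing it.
posts = {"1": denormalize(post, tags_by_id, ["1", "2"])}

# When tag "1" changes, every post that copied it has to be reindexed.
stale = posts_needing_update(posts, "1")
print(posts["1"]["tags"], stale)
```

The second helper is where the administrative overhead shows up: every change to a tag has to be propagated to each post that holds a copy.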


Source: https://stackoverflow.com/questions/23403149/elasticsearch-relationship-mappings-one-to-one-and-one-to-many
