Extract record from multiple arrays based on a filter

荒凉一梦 提交于 2019-11-29 12:28:58

My best approach: go nested with Nested Datatype

Except for easier querying, it easier to read and understand the connections between those objects that are, currently, scattered in different arrays.

Yes, if you'll decide this approach you will have to edit your mapping and re-index your entire data.

How would the mapping is going to look like? something like this:

{
  "mappings": {
    "properties": {
      "last_updated": {
        "type": "date"
      },
      "country": {
        "type": "string"
      },
      "records": {
        "type": "nested",
        "properties": {
          "price": {
            "type": "string"
          },
          "max_occupancy": {
            "type": "long"
          },
          "type": {
            "type": "string"
          },
          "availability": {
            "type": "long"
          },
          "size": {
            "type": "string"
          }
        }
      }
    }
  }
}

EDIT: New document structure (containing nested documents) -

{
  "last_updated": "2017-10-25T18:33:51.434706",
  "country": "Italia",
  "records": [
    {
      "price": "€ 139",
      "max_occupancy": 2,
      "type": "Type 1",
      "availability": 10,
      "size": "26 m²"
    },
    {
      "price": "€ 125",
      "max_occupancy": 2,
      "type": "Type 1 - (Tag)",
      "availability": 10,
      "size": "35 m²"
    },
    {
      "price": "€ 120",
      "max_occupancy": 1,
      "type": "Type 2",
      "availability": 10,
      "size": "47 m²"
    },
    {
      "price": "€ 108",
      "max_occupancy": 1,
      "type": "Type 2 (Tag)",
      "availability": 10,
      "size": "31 m²"
    }
  ]
}

Now, its more easy to query for any specific condition with Nested Query and Inner Hits. for example:

{
  "_source": [
    "last_updated",
    "country"
  ],
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "country": "Italia"
          }
        },
        {
          "nested": {
            "path": "records",
            "query": {
              "bool": {
                "must": [
                  {
                    "range": {
                      "records.max_occupancy": {
                        "gte": 2
                      }
                    }
                  }
                ]
              }
            },
            "inner_hits": {
              "sort": {
                "records.price": "asc"
              },
              "size": 1
            }
          }
        }
      ]
    }
  }
}

Conditions are: Italia AND max_occupancy > 2.

Inner hits: sort by price ascending order and get the first result.

Hope you'll find it useful

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!