Extract record from multiple arrays based on a filter

长情又很酷 2020-12-20 05:52

I have documents in Elasticsearch with the following structure:

"_source": {
          "last_updated": "2017-10-25T18:33:51.434706",
          "coun


        
1 answer
  • 2020-12-20 06:12

    My suggested approach: model the data with the nested datatype.

    Besides making queries easier, a nested structure is easier to read, and it makes the connections between objects that are currently scattered across different arrays explicit.

    Yes, if you choose this approach you will have to change your mapping and re-index all of your data.
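
    As a rough sketch of the re-indexing step, the snippet below builds a newline-delimited payload in the shape the Bulk API expects: one action line followed by one source line per document, with a trailing newline. The helper name build_bulk_payload and the index name hotels_nested are hypothetical, and sending the payload to the cluster is left out.

```python
import json


def build_bulk_payload(docs, index_name):
    """Return an NDJSON string: an "index" action line, then the document, for each doc."""
    lines = []
    for doc in docs:
        # Action line: tells the Bulk API to index the next line into index_name.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the document itself, kept on a single line.
        lines.append(json.dumps(doc, ensure_ascii=False))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


# "hotels_nested" is a made-up index name used only for illustration.
payload = build_bulk_payload(
    [{"country": "Italia", "records": [{"price": "€ 139", "max_occupancy": 2}]}],
    "hotels_nested",
)
```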

    What would the mapping look like? Something like this (the old "string" type was removed in Elasticsearch 6.0, so "keyword" is used here for exact-match fields such as country):

    {
      "mappings": {
        "properties": {
          "last_updated": {
            "type": "date"
          },
          "country": {
            "type": "keyword"
          },
          "records": {
            "type": "nested",
            "properties": {
              "price": {
                "type": "keyword"
              },
              "max_occupancy": {
                "type": "long"
              },
              "type": {
                "type": "keyword"
              },
              "availability": {
                "type": "long"
              },
              "size": {
                "type": "keyword"
              }
            }
          }
        }
      }
    }
    

    EDIT: The new document structure (with nested documents):

    {
      "last_updated": "2017-10-25T18:33:51.434706",
      "country": "Italia",
      "records": [
        {
          "price": "€ 139",
          "max_occupancy": 2,
          "type": "Type 1",
          "availability": 10,
          "size": "26 m²"
        },
        {
          "price": "€ 125",
          "max_occupancy": 2,
          "type": "Type 1 - (Tag)",
          "availability": 10,
          "size": "35 m²"
        },
        {
          "price": "€ 120",
          "max_occupancy": 1,
          "type": "Type 2",
          "availability": 10,
          "size": "47 m²"
        },
        {
          "price": "€ 108",
          "max_occupancy": 1,
          "type": "Type 2 (Tag)",
          "availability": 10,
          "size": "31 m²"
        }
      ]
    }
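
    To get from the old layout to this one, each document has to be reshaped before re-indexing. The sketch below assumes the old documents keep one array per attribute (price, max_occupancy, type, availability, size), aligned by position; the original structure is cut off in the question, so that layout and the helper name to_nested_records are assumptions.

```python
def to_nested_records(doc, array_fields):
    """Zip the parallel arrays named in array_fields into a list of record objects."""
    columns = [doc[name] for name in array_fields]
    # Parallel arrays only make sense if they all have the same length.
    if len({len(col) for col in columns}) > 1:
        raise ValueError("parallel arrays have different lengths")
    records = [dict(zip(array_fields, values)) for values in zip(*columns)]
    # Copy every other field (e.g. last_updated, country) unchanged.
    new_doc = {key: value for key, value in doc.items() if key not in array_fields}
    new_doc["records"] = records
    return new_doc


# Assumed old layout: one array per attribute, aligned by index.
old_doc = {
    "last_updated": "2017-10-25T18:33:51.434706",
    "country": "Italia",
    "price": ["€ 139", "€ 125"],
    "max_occupancy": [2, 2],
    "type": ["Type 1", "Type 1 - (Tag)"],
    "availability": [10, 10],
    "size": ["26 m²", "35 m²"],
}
fields = ["price", "max_occupancy", "type", "availability", "size"]
new_doc = to_nested_records(old_doc, fields)
```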
    

    Now it is easier to query for any specific condition with a nested query and inner hits. For example:

    {
      "_source": [
        "last_updated",
        "country"
      ],
      "query": {
        "bool": {
          "must": [
            {
              "term": {
                "country": "Italia"
              }
            },
            {
              "nested": {
                "path": "records",
                "query": {
                  "bool": {
                    "must": [
                      {
                        "range": {
                          "records.max_occupancy": {
                            "gte": 2
                          }
                        }
                      }
                    ]
                  }
                },
                "inner_hits": {
                  "sort": {
                    "records.price": "asc"
                  },
                  "size": 1
                }
              }
            }
          ]
        }
      }
    }
    

    Conditions are: country is Italia AND max_occupancy >= 2.

    Inner hits: sort the matching nested records by price in ascending order and return only the first one.
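
    To make the semantics concrete, here is a small local approximation of what that query selects from a single document. It does not talk to Elasticsearch; the helper name first_matching_record is hypothetical, and it only mirrors the country filter, the range condition, and the sort plus size of 1 on inner hits.

```python
def first_matching_record(doc, country, min_occupancy):
    """Return the cheapest-by-string record with max_occupancy >= min_occupancy, or None."""
    # Top-level condition: the document must be for the requested country.
    if doc["country"] != country:
        return None
    # Nested condition: keep only the records that satisfy the range filter.
    matches = [r for r in doc["records"] if r["max_occupancy"] >= min_occupancy]
    if not matches:
        return None
    # Ascending sort with size 1 is the minimum; price is compared as a string.
    return min(matches, key=lambda r: r["price"])


doc = {
    "country": "Italia",
    "records": [
        {"price": "€ 139", "max_occupancy": 2, "type": "Type 1"},
        {"price": "€ 125", "max_occupancy": 2, "type": "Type 1 - (Tag)"},
        {"price": "€ 120", "max_occupancy": 1, "type": "Type 2"},
        {"price": "€ 108", "max_occupancy": 1, "type": "Type 2 (Tag)"},
    ],
}
best = first_matching_record(doc, "Italia", 2)
```

    Here best is the "€ 125" record, since the two cheaper ones have max_occupancy 1. Note that price is stored as a string, so the ordering is lexicographic: "€ 99" would sort after "€ 108". If true numeric ordering matters, storing the amount in a separate numeric field is one option.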

    Hope you'll find it useful
