How to insert a scripted field using ingestion pipeline

Submitted by 我只是一个虾纸丫 on 2021-01-28 19:15:10

Question


So I have two fields in my docs

{
    "emails": ["", "", ""],
    "name": ""
}

Once the docs are indexed, I want each of them to have a new field called uid that contains the concatenation of all the emails and the name.

I can get a scripted field like that with this GET request on my index's _search endpoint:

{
  "script_fields": {
    "combined": {
      "script": {
        "lang": "painless",
        "source": "def result=''; for (String email: doc['emails.keyword']) { result = result + email;} return doc['name'].value + result;"
      }
    }
  }
}

What should my ingest pipeline PUT request body look like if I want the same scripted field indexed with my docs?


Answer 1:


Let's say I have the below sample index and sample document.

Sample Source Index

For the sake of understanding, I've created the below mapping.

PUT my_source_index
{
  "mappings": {
    "properties": {
      "email":{
        "type":"text"
      },
      "name":{
        "type": "text"
      }
    }
  }
}

Sample Document:

POST my_source_index/_doc/1
{
  "email": ["john@gmail.com","doe@outlook.com"],
  "name": "johndoe"
}

Just follow the below steps

Step 1: Create Ingest Pipeline

PUT _ingest/pipeline/my-pipeline-concat
{
  "description" : "describe pipeline",
  "processors" : [
    {
      "join": {
        "field": "email",
        "target_field": "temp_uuid",
        "separator": "-"
      }
    },
    {
      "set": {
        "field": "uuid",
        "value": "{{name}}-{{temp_uuid}}"
      }
    },
    {
      "remove":{
        "field": "temp_uuid"
      }
    }
  ]
}

Notice that the pipeline above uses three processors from the Ingest API, which are executed in sequence:

  • The first is a Join Processor, which concatenates all the email ids into temp_uuid.

  • The second is a Set Processor, which combines name with temp_uuid to create uuid.

  • The third is a Remove Processor, which removes the temporary temp_uuid field.
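
Before reindexing, the pipeline can be dry-run with the simulate endpoint, which applies the processors to inline documents without indexing anything. A minimal sketch, assuming the pipeline was created as my-pipeline-concat and reusing the sample document from above:

POST _ingest/pipeline/my-pipeline-concat/_simulate
{
  "docs": [
    {
      "_source": {
        "email": ["john@gmail.com", "doe@outlook.com"],
        "name": "johndoe"
      }
    }
  ]
}

If the pipeline behaves as described, the returned document should contain uuid set to johndoe-john@gmail.com-doe@outlook.com and no temp_uuid field.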

Note that I am using - as the delimiter between all values. Feel free to use anything you want.
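
If you would rather keep the logic in a single Painless script, closer to the script_fields example in the question, a script processor can build the field directly. This is a sketch rather than a tested drop-in: the pipeline name my-pipeline-script is made up, the field names emails, name and uid follow the question rather than the sample index in this answer, and it assumes emails is always an array and that neither field is missing. Note that ingest scripts read and write the document through ctx, not doc:

PUT _ingest/pipeline/my-pipeline-script
{
  "description" : "hypothetical variant: build uid in one script, no delimiter",
  "processors" : [
    {
      "script": {
        "lang": "painless",
        "source": "def result = ctx.name; for (def email : ctx.emails) { result += email; } ctx.uid = result;"
      }
    }
  ]
}

This variant needs no temporary field, so there is nothing to remove afterwards.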

Step 2: Create Destination Index:

PUT my_dest_index
{
  "mappings": {
    "properties": {
      "email":{
        "type":"text"
      },
      "name":{
        "type": "text"
      },
      "uuid":{                  <--- Do not forget to add this
        "type": "text"
      }
    }
  }
}

Step 3: Apply Reindex API:

POST _reindex
{
  "source": {
    "index": "my_source_index"
  },
  "dest": {
    "index": "my_dest_index",
    "pipeline": "my-pipeline-concat"       <--- Make sure you add pipeline here
  } 
}

Note how the pipeline is specified under dest when calling the Reindex API.
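
Reindexing only covers documents that already exist in the source index. For documents indexed later, the pipeline has to be named on each index request or configured as the index default. A sketch of both options, assuming your Elasticsearch version supports the index.default_pipeline setting; the document id 2 and its values are made up for illustration:

PUT my_dest_index/_settings
{
  "index.default_pipeline": "my-pipeline-concat"
}

POST my_dest_index/_doc/2?pipeline=my-pipeline-concat
{
  "email": ["jane@gmail.com"],
  "name": "janedoe"
}

Either option alone is enough: with the default pipeline set, the ?pipeline parameter on the index request can be dropped.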

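Step 4: Verify Destination Index:

Searching my_dest_index (for example with GET my_dest_index/_search) should return a response like this: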

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "my_dest_index",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "johndoe",
          "uuid" : "johndoe-john@gmail.com-doe@outlook.com",   <--- Note this
          "email" : [
            "john@gmail.com",
            "doe@outlook.com"
          ]
        }
      }
    ]
  }
}

Hope this helps!



Source: https://stackoverflow.com/questions/60629839/how-to-insert-a-scripted-field-using-igestion-pipeline
