How to make Elasticsearch add the timestamp field to every document in all indices?

-上瘾入骨i 2020-12-08 00:38

Elasticsearch experts,

I have been unable to find a simple way to just tell ElasticSearch to insert the _timestamp field for all the documents that are added in all

5 answers
  • 2020-12-08 00:58

    An example for ElasticSearch 6.6.2 in Python 3:

    from elasticsearch import Elasticsearch
    
    es = Elasticsearch(hosts=["localhost"])
    
    timestamp_pipeline_setting = {
      "description": "insert timestamp field for all documents",
      "processors": [
        {
          "set": {
            "field": "ingest_timestamp",
            "value": "{{_ingest.timestamp}}"
          }
        }
      ]
    }
    
    # Register the pipeline under the id "timestamp_pipeline"
    es.ingest.put_pipeline("timestamp_pipeline", timestamp_pipeline_setting)
    
    conf = {
        "settings": {
            "number_of_shards": 2,
            "number_of_replicas": 1,
            "default_pipeline": "timestamp_pipeline"
        },
        "mappings": {
            "articles":{
                "dynamic": "false",
                "_source" : {"enabled" : "true" },
                "properties": {
                    "title": {
                        "type": "text",
                    },
                    "content": {
                        "type": "text",
                    },
                }
            }
        }
    }
    
    response = es.indices.create(
        index="articles_index",
        body=conf,
        ignore=400  # ignore the HTTP 400 returned when the index already exists
    )
    
    print('\nresponse:', response)
    
    doc = {
        'title': 'automatically adding a timestamp to documents',
        'content': 'prior to version 5 of Elasticsearch, documents had a metadata field called _timestamp. When enabled, this _timestamp was automatically added to every document. It would tell you the exact time a document had been indexed.',
    }
    res = es.index(index="articles_index", doc_type="articles", id=100001, body=doc)
    print(res)
    
    res = es.get(index="articles_index", doc_type="articles", id=100001)
    print(res)
    

    For ES 7.x, the example should work after removing the doc_type-related parameters, since mapping types are no longer supported. The "articles" level inside "mappings" has to be removed as well.
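
    As a rough sketch of that 7.x adjustment, the mapping-type level can be hoisted out of the index body before calling es.indices.create. The helper name to_typeless is hypothetical and not part of any client library; the body below is a trimmed copy of the conf from the example above.

```python
import copy


def to_typeless(conf, doc_type):
    """Return a copy of a 6.x index body with the mapping-type level removed.

    Hypothetical helper for illustration; not part of elasticsearch-py.
    """
    result = copy.deepcopy(conf)
    mappings = result.get("mappings", {})
    if doc_type in mappings:
        # Hoist the per-type mapping up one level (typeless mapping).
        result["mappings"] = mappings[doc_type]
    return result


# Trimmed version of the 6.x index body used in the answer above.
conf_6x = {
    "settings": {"default_pipeline": "timestamp_pipeline"},
    "mappings": {
        "articles": {
            "dynamic": "false",
            "properties": {
                "title": {"type": "text"},
                "content": {"type": "text"},
            },
        }
    },
}

conf_7x = to_typeless(conf_6x, "articles")
print(conf_7x["mappings"])
```

    Note that with "dynamic": "false" the ingest_timestamp field is kept in _source but is not indexed. To search or sort on it, it would probably need to be declared under "properties" with type "date".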

  • 2020-12-08 01:07

    Here is another way to get the indexing timestamp; hopefully it helps someone.

    An ingest pipeline can be used to add a timestamp when a document is indexed. Here is an example:

    PUT _ingest/pipeline/indexed_at
    {
      "description": "Adds indexed_at timestamp to documents",
      "processors": [
        {
          "set": {
            "field": "_source.indexed_at",
            "value": "{{_ingest.timestamp}}"
          }
        }
      ]
    }
    

    Earlier, Elasticsearch only supported named pipelines, so the 'pipeline' parameter had to be specified on the endpoint used to write/index documents. (Ref: link) This was a bit troublesome, as it required changing the endpoints on the application side.

    With Elasticsearch version >= 6.5, you can now specify a default pipeline for an index using the index.default_pipeline setting. (Refer link for details)

    Here is how to set the default pipeline:

    PUT ms-test/_settings
    {
      "index.default_pipeline": "indexed_at"
    }
    

    I haven't tried this out yet, as I haven't upgraded to ES 6.5, but the above command should work.
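
    The two console requests above could be issued from Python roughly as follows. This is a minimal sketch, assuming an elasticsearch-py 6.x/7.x client whose ingest.put_pipeline and indices.put_settings methods accept a body argument; the function names build_timestamp_pipeline and install_default_pipeline are hypothetical.

```python
def build_timestamp_pipeline(field="indexed_at"):
    """Build the pipeline body: one `set` processor that copies the
    ingest timestamp into `field`."""
    return {
        "description": "Adds %s timestamp to documents" % field,
        "processors": [
            {"set": {"field": field, "value": "{{_ingest.timestamp}}"}}
        ],
    }


def install_default_pipeline(es, index, pipeline_id="indexed_at", field="indexed_at"):
    """Register the pipeline, then make it the default for `index`.

    `es` is assumed to be an elasticsearch-py 6.x/7.x client (or any
    object exposing the same two methods).
    """
    es.ingest.put_pipeline(id=pipeline_id, body=build_timestamp_pipeline(field))
    es.indices.put_settings(
        index=index,
        body={"index.default_pipeline": pipeline_id},
    )
```

    With a real client this would be called as install_default_pipeline(Elasticsearch(hosts=["localhost"]), "ms-test"). Like the console commands, it needs a cluster running ES 6.5 or later.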

  • 2020-12-08 01:17

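    On older Elasticsearch versions that still support the _timestamp field (it was deprecated in 2.0 and later removed), you can enable it in the mapping when creating your index.

    $ curl -XPOST localhost:9200/test -d '{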
    "settings" : {
        "number_of_shards" : 1
    },
    "mappings" : {
        "_default_":{
            "_timestamp" : {
                "enabled" : true,
                "store" : true
            }
        }
      }
    }'
    

    That will then automatically create a _timestamp for every document you put in the index. After indexing something, the _timestamp field is returned when you request it.

  • 2020-12-08 01:17

    First, create the index and define its properties, such as the fields and their data types; then insert the data using the REST API.

    Below is how to create the index with the field properties. Execute the following in the Kibana console:

    PUT /vfq-jenkins
    {
      "mappings": {
        "properties": {
          "BUILD_NUMBER": { "type": "double" },
          "BUILD_ID": { "type": "double" },
          "JOB_NAME": { "type": "text" },
          "JOB_STATUS": { "type": "keyword" },
          "time": { "type": "date" }
        }
      }
    }
    

    the next step is to insert the data into that index:

    curl -u elastic:changeme -X POST "http://elasticsearch:9200/vfq-jenkins/_doc/?pretty" \
      -H "Content-Type: application/json" -d '{
      "BUILD_NUMBER": 83, "BUILD_ID": 83, "JOB_NAME": "OMS_LOG_ANA", "JOB_STATUS": "SUCCESS",
      "time": "2019-09-08T12:39:00" }'
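
    The value sent in "time" must be in a format the date mapping accepts; the default mapping accepts ISO 8601 strings such as 2019-09-08T12:39:00. Below is a small sketch of building such a document in Python. The helper name build_event is hypothetical, and the field names are taken from the mapping above.

```python
from datetime import datetime


def build_event(build_number, build_id, job_name, job_status, when):
    """Build a document matching the vfq-jenkins mapping.

    `when` is a datetime; it is serialized as an ISO 8601 string,
    which the default `date` mapping accepts.
    """
    return {
        "BUILD_NUMBER": float(build_number),
        "BUILD_ID": float(build_id),
        "JOB_NAME": job_name,
        "JOB_STATUS": job_status,
        # Drop microseconds so the value has second precision.
        "time": when.replace(microsecond=0).isoformat(),
    }


event = build_event(83, 83, "OMS_LOG_ANA", "SUCCESS", datetime(2019, 9, 8, 12, 39, 0))
print(event["time"])  # 2019-09-08T12:39:00
```

    The resulting dict can be passed as the body of an index request, or serialized with json.dumps for use with curl.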
    
  • 2020-12-08 01:18

    Elasticsearch used to support automatically adding timestamps to documents being indexed, but deprecated this feature in 2.0.0.

    From the version 5.5 documentation:

    The _timestamp and _ttl fields were deprecated and are now removed. As a replacement for _timestamp, you should populate a regular date field with the current timestamp on application side.
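
    A minimal sketch of that application-side approach in Python: stamp each document with the current UTC time just before indexing it. The helper name with_timestamp and the field name indexed_at are assumptions chosen for illustration, and the target field would normally be mapped as a date.

```python
from datetime import datetime, timezone


def with_timestamp(doc, field="indexed_at", now=None):
    """Return a copy of `doc` with an ISO 8601 UTC timestamp in `field`.

    `now` can be injected for testing; by default the current UTC time
    is used. The original dict is left unmodified.
    """
    stamped = dict(doc)
    moment = now if now is not None else datetime.now(timezone.utc)
    # Millisecond precision, e.g. 2020-12-08T01:18:00.000+00:00
    stamped[field] = moment.isoformat(timespec="milliseconds")
    return stamped


doc = {"title": "automatically adding a timestamp to documents"}
fixed = datetime(2020, 12, 8, 1, 18, 0, tzinfo=timezone.utc)
print(with_timestamp(doc, now=fixed))
```

    The stamped copy is what gets sent to the index call. Unlike the ingest pipeline approach, the time recorded here comes from the application host's clock rather than from the Elasticsearch node.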
