How to deduplicate documents while indexing into Elasticsearch from Logstash

别跟我提以往 2020-12-31 16:52

I'm using Logstash 1.4.1 together with ES1.01 and would like to replace already indexed documents based on a calculated checksum. I'm currently using the "fingerprint" f

2 answers
  •  半阙折子戏
    2020-12-31 17:32

    I would use the document_id parameter in the elasticsearch output section of your Logstash configuration:

    document_id

    Value type is string
    Default value is nil
    

    The document ID for the index. Useful for overwriting existing entries in Elasticsearch with the same ID.

    https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html#plugins-outputs-elasticsearch-document_id

    I believe the entry should be something like this:

    document_id => "%{fingerprint}"
    

    The %{fingerprint} reference uses Logstash's sprintf format, which replaces a field reference inside a string with the contents of that field on the current event:

    https://www.elastic.co/guide/en/logstash/current/event-dependent-configuration.html#sprintf
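
    Putting the pieces together, a minimal sketch of the relevant configuration might look like the following. It assumes the checksum is computed by the fingerprint filter over the message field and stored in a field named fingerprint; the key value is a placeholder, and the connection settings of the elasticsearch output are left out because they differ between Logstash versions.

    ```
    filter {
      fingerprint {
        source => ["message"]      # field to hash (assumed for this sketch)
        target => "fingerprint"    # field that receives the checksum
        method => "SHA1"
        key    => "example-key"    # placeholder value, choose your own
      }
    }

    output {
      elasticsearch {
        # connection settings omitted; they depend on your Logstash version
        document_id => "%{fingerprint}"   # use the checksum as the document ID
      }
    }
    ```

    With this setup, two events with the same message content get the same document ID, so the later event overwrites the earlier document in the index instead of creating a duplicate.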
