Elasticsearch: delete documents using Logstash and CSV

Submitted by ☆樱花仙子☆ on 2020-01-02 18:04:03

Question


Is there any way to delete documents from Elasticsearch using Logstash and a CSV file? I read the Logstash documentation and found nothing. I tried a few configs using action "delete", but nothing happened:

output {
    elasticsearch{
        action => "delete"
        host => "localhost"
        index => "index_name"
        document_id => "%{id}"
    }
} 

Has anyone tried this? Is there anything special that I should add to the input and filter sections of the config? I used the file plugin for input and the csv plugin for the filter.


Answer 1:


It is definitely possible to do what you suggest, but if you're using Logstash 1.5, you need to use the transport protocol, because there is a bug in Logstash 1.5 when doing deletes over the HTTP protocol (see issue #195).

So if your delete.csv CSV file is formatted like this:

id
12345
12346
12347

And your delete.conf Logstash config looks like this:

input {
    file {
        path => "/path/to/your/delete.csv"
        start_position => "beginning"
        sincedb_path => "/dev/null"
    }
}
filter {
    csv {
        columns => ["id"]
    }
}
output {
    elasticsearch{
        action => "delete"
        host => "localhost"
        port => 9300                         # make sure you have this
        protocol => "transport"              # make sure you have this
        index => "your_index"                # replace this
        document_type => "your_doc_type"     # replace this
        document_id => "%{id}"
    }
}

Then when running bin/logstash -f delete.conf you'll be able to delete all the documents whose id is specified in your CSV file.
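
The host, port and protocol options above belong to the Logstash 1.5-era elasticsearch output. As a rough sketch only, assuming Logstash 2.0 or later (where, as far as I know, the output talks HTTP and takes a hosts list instead) and assuming the HTTP delete bug mentioned above does not affect your version, the same output might look like the block below. The address localhost:9200 and the index name are placeholders, and document_type is left out on the assumption that your Elasticsearch version no longer uses mapping types; add it back if yours does.

```
output {
    elasticsearch {
        action => "delete"
        hosts => ["localhost:9200"]    # assumption: HTTP endpoint of your node
        index => "your_index"          # replace this
        document_id => "%{id}"         # id column parsed from delete.csv
    }
}
```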




Answer 2:


In addition to Val's answer, I would add that if you have a single input that has a mix of deleted and upserted rows, you can do both if you have a flag that identifies the ones to delete. The output > elasticsearch > action parameter can be a "field reference," meaning that you can reference a per-row field. Even better, you can change that field to a metadata field so that it can be used in a field reference without being indexed.

For example, in your filter section:

filter {
    # [deleted] is the name of your field
    if [deleted] {
        mutate {
            add_field => {
                "[@metadata][elasticsearch_action]" => "delete"
            }
        }
    } else {
        mutate {
            add_field => {
                "[@metadata][elasticsearch_action]" => "index"
            }
        }
    }
    # the flag is not needed once the action is set, so keep it out of the index
    mutate {
        remove_field => [ "deleted" ]
    }
}

Then, in your output section, reference the metadata field:

output {
    elasticsearch {
        hosts => "localhost:9200"
        index => "myindex"
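        action => "%{[@metadata][elasticsearch_action]}"
        document_type => "mytype"
        # a delete needs a document id; "id" is assumed to be the name of your id field
        document_id => "%{id}"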
    }
}
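
One caveat if the flag comes from a CSV file, as in the first answer: the csv filter produces string values, so a bare if [deleted] would, as far as I know, be true for any non-empty value, including "false". A minimal sketch of the filter for that case follows. The column layout (id, deleted) and the literal "true" are assumptions for illustration, not something taken from the question.

```
filter {
    csv {
        columns => ["id", "deleted"]    # hypothetical column layout
    }
    # compare against the string explicitly, since CSV values are strings
    if [deleted] == "true" {
        mutate { add_field => { "[@metadata][elasticsearch_action]" => "delete" } }
    } else {
        mutate { add_field => { "[@metadata][elasticsearch_action]" => "index" } }
    }
    # drop the flag so it is not indexed with the document
    mutate { remove_field => [ "deleted" ] }
}
```

Because the action lives under [@metadata], it is available to the output's action setting but is not written to Elasticsearch with the event.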


Source: https://stackoverflow.com/questions/32890374/elasticsearch-delete-documents-using-logstash-and-csv
