Copy data from one index to another in Elasticsearch


Question


How can I copy data from one index to another in Elasticsearch (where the two indices may be on the same host or on different hosts)?

Output from my Logstash attempt:

Reading config file {:file=>"logstash/agent.rb", :level=>:debug, :line=>"309", :method=>"local_config"}
Compiled pipeline code:
        @inputs = []
        @filters = []
        @outputs = []
        @periodic_flushers = []
        @shutdown_flushers = []

          @input_elasticsearch_1 = plugin("input", "elasticsearch", LogStash::Util.hash_merge_many({ "hosts" => ("input hostname") }, { "port" => ("9200") }, { "index" => (".kibana") }, { "size" => 500 }, { "scroll" => ("5m") }, { "docinfo" => ("true") }))

          @inputs << @input_elasticsearch_1

          @output_elasticsearch_2 = plugin("output", "elasticsearch", LogStash::Util.hash_merge_many({ "host" => ("output hostname") }, { "port" => 9200 }, { "protocol" => ("http") }, { "manage_template" => ("false") }, { "index" => ("order-logs-sample") }, { "document_type" => ("logs") }, { "document_id" => ("%{id}") }, { "workers" => 1 }))

          @outputs << @output_elasticsearch_2

  def filter_func(event)
    events = [event]
    @logger.debug? && @logger.debug("filter received", :event => event.to_hash)
    events
  end
  def output_func(event)
    @logger.debug? && @logger.debug("output received", :event => event.to_hash)
    @output_elasticsearch_2.handle(event)

  end {:level=>:debug, :file=>"logstash/pipeline.rb", :line=>"29", :method=>"initialize"}
Plugin not defined in namespace, checking for plugin file {:type=>"input", :name=>"elasticsearch", :path=>"logstash/inputs/elasticsearch", :level=>:debug, :file=>"logstash/plugin.rb", :line=>"133", :method=>"lookup"}
Plugin not defined in namespace, checking for plugin file {:type=>"codec", :name=>"json", :path=>"logstash/codecs/json", :level=>:debug, :file=>"logstash/plugin.rb", :line=>"133", :method=>"lookup"}
config LogStash::Codecs::JSON/@charset = "UTF-8" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@hosts = ["ccwlog-stg1-01"] {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@port = 9200 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@index = ".kibana" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@size = 500 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@scroll = "5m" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@docinfo = true {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@debug = false {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@codec = <LogStash::Codecs::JSON charset=>"UTF-8"> {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@add_field = {} {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@query = "{\"query\": { \"match_all\": {} } }" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@scan = true {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@docinfo_target = "@metadata" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@docinfo_fields = ["_index", "_type", "_id"] {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Inputs::Elasticsearch/@ssl = false {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
Plugin not defined in namespace, checking for plugin file {:type=>"output", :name=>"elasticsearch", :path=>"logstash/outputs/elasticsearch", :level=>:debug, :file=>"logstash/plugin.rb", :line=>"133", :method=>"lookup"}
'[DEPRECATED] use `require 'concurrent'` instead of `require 'concurrent_ruby'`
[2016-01-22 03:49:34.451]  WARN -- Concurrent: [DEPRECATED] Java 7 is deprecated, please use Java 8.
Java 7 support is only best effort, it may not work. It will be removed in next release (1.0).
Plugin not defined in namespace, checking for plugin file {:type=>"codec", :name=>"plain", :path=>"logstash/codecs/plain", :level=>:debug, :file=>"logstash/plugin.rb", :line=>"133", :method=>"lookup"}
config LogStash::Codecs::Plain/@charset = "UTF-8" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@host = ["ccwlog-stg1-01"] {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@port = 9200 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@protocol = "http" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@manage_template = false {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@index = "order-logs-sample" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@document_type = "logs" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@document_id = "%{id}" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@workers = 1 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@type = "" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@tags = [] {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@exclude_tags = [] {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@codec = <LogStash::Codecs::Plain charset=>"UTF-8"> {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@template_name = "logstash" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@template_overwrite = false {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@embedded = false {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@embedded_http_port = "9200-9300" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@max_inflight_requests = 50 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@flush_size = 5000 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@idle_flush_time = 1 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@action = "index" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@path = "/" {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@ssl = false {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@ssl_certificate_verification = true {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@sniffing = false {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@max_retries = 3 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@retry_max_items = 5000 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
config LogStash::Outputs::ElasticSearch/@retry_max_interval = 5 {:level=>:debug, :file=>"logstash/config/mixin.rb", :line=>"112", :method=>"config_init"}
Normalizing http path {:path=>"/", :normalized=>"/", :level=>:debug, :file=>"logstash/outputs/elasticsearch.rb", :line=>"342", :method=>"register"}
Create client to elasticsearch server on ccwlog-stg1-01: {:level=>:info, :file=>"logstash/outputs/elasticsearch.rb", :line=>"422", :method=>"register"}
Plugin is finished {:plugin=><LogStash::Inputs::Elasticsearch hosts=>["ccwlog-stg1-01"], port=>9200, index=>".kibana", size=>500, scroll=>"5m", docinfo=>true, debug=>false, codec=><LogStash::Codecs::JSON charset=>"UTF-8">, query=>"{\"query\": { \"match_all\": {} } }", scan=>true, docinfo_target=>"@metadata", docinfo_fields=>["_index", "_type", "_id"], ssl=>false>, :level=>:info, :file=>"logstash/plugin.rb", :line=>"61", :method=>"finished"}
New Elasticsearch output {:cluster=>nil, :host=>["ccwlog-stg1-01"], :port=>9200, :embedded=>false, :protocol=>"http", :level=>:info, :file=>"logstash/outputs/elasticsearch.rb", :line=>"439", :method=>"register"}
Pipeline started {:level=>:info, :file=>"logstash/pipeline.rb", :line=>"87", :method=>"run"}
Logstash startup completed
output received {:event=>{"title"=>"logindex", "timeFieldName"=>"@timestamp", "fields"=>"[{\"name\":\"caller\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false},{\"name\":\"_source\",\"type\":\"_source\",\"count\":0,\"scripted\":false,\"indexed\":false,\"analyzed\":false,\"doc_values\":false},{\"name\":\"exception\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false},{\"name\":\"type\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false},{\"name\":\"@version\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false},{\"name\":\"serviceName\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false},{\"name\":\"_type\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":false,\"doc_values\":false},{\"name\":\"_id\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":false,\"analyzed\":false,\"doc_values\":false},{\"name\":\"userId\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false},{\"name\":\"path\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false},{\"name\":\"orderId\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false},{\"name\":\"dc\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false},{\"name\":\"tags\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false},{\"name\":\"host\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false},{\"name\":\"_index\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":false,\"analyzed\":false,\"doc_values\":false},{\"name\":\"elapsedTime\",\"type\":\"number\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":false,\"doc_values\":false},{\"name\":\"message\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false},{\"name\":\"@timestamp\",\"type\":\"date\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":false,\"doc_values\":false},{\"name\":\"performanceRequest\",\"type\":\"string\",\"count\":0,\"scripted\":false,\"indexed\":true,\"analyzed\":true,\"doc_values\":false}]", "@version"=>"1", "@timestamp"=>"2016-01-22T11:49:35.268Z"}, :level=>:debug, :file=>"(eval)", :line=>"21", :method=>"output_func"}

Answer 1:


You should look at the scan and scroll documentation for this functionality.

You retrieve data from the old index with a given query and size parameter, and then bulk index it into the new index. Client libraries in different languages provide wrappers to make reindexing easy.

e.g. I use Python, and its client library has a reindex helper that uses the scan and scroll approach.
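For reference, a minimal sketch with that helper (this assumes a pre-8.x elasticsearch-py install; the host names and index names are placeholders, not values from the question):

from elasticsearch import Elasticsearch, helpers

# Placeholder hosts and index names -- substitute your own.
source_client = Elasticsearch(["source-host:9200"])
target_client = Elasticsearch(["target-host:9200"])

# helpers.reindex scans/scrolls over the source index and bulk-indexes
# the hits into the target index, so you don't write that loop yourself.
helpers.reindex(
    client=source_client,
    source_index="source_index",
    target_index="target_index",
    target_client=target_client,  # omit when copying within one cluster
    chunk_size=500,               # documents per bulk request
    scroll="5m",                  # how long each scroll context stays alive
)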




Answer 2:


A simple way to do this is to use Logstash with an elasticsearch input plugin and an elasticsearch output plugin.

The benefit of this solution is that you don't have to rewrite boilerplate code to scan/scroll and bulk re-index, which is exactly what Logstash already provides.

After installing Logstash, you can create a configuration file copy.conf that looks like this:

input {
  elasticsearch {
    hosts => ["localhost:9200"]                    # source ES host
    index => "source_index"
  }
}
filter {
  mutate {
    remove_field => [ "@version", "@timestamp" ]   # remove fields added by Logstash
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]                    # target ES host
    manage_template => false
    index => "target_index"
    document_id => "%{id}"                         # name of your ID field
    workers => 1
  }
}

Then, after setting the correct values (source/target host and source/target index), you can run it with bin/logstash -f copy.conf
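Note that document_id => "%{id}" assumes each event carries an id field. If you instead want to preserve the original Elasticsearch document IDs, one common approach is to set docinfo => true on the elasticsearch input (which, as the debug output above shows, stores _index, _type, and _id under @metadata) and use document_id => "%{[@metadata][_id]}" in the output.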




Answer 3:


Elasticsearch provides a Reindex API, which helps copy data from one index to another. But be aware that _reindex does not attempt to set up the destination index, and it does not copy the settings of the source index. You should set up the destination index prior to running a _reindex action, including configuring mappings, shard counts, replicas, etc.

For more information about the Reindex API, see https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html

Example:

POST _reindex
{
  "source": { "index": "old index name" },
  "dest": { "index": "new index name" }
}
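Putting that together in Python, a minimal sketch (assuming a pre-8.x elasticsearch-py client, which exposes indices.create() and reindex(); the host, index names, and settings body are placeholders):

from elasticsearch import Elasticsearch

es = Elasticsearch(["localhost:9200"])  # placeholder host

# Set up the destination index first, since _reindex will not copy
# the source index's settings or mappings for you.
es.indices.create(
    index="new-index-name",  # placeholder
    body={
        "settings": {"number_of_shards": 1, "number_of_replicas": 1}
    },
)

# Then ask Elasticsearch to copy the documents server-side.
es.reindex(
    body={
        "source": {"index": "old-index-name"},  # placeholder
        "dest": {"index": "new-index-name"}
    }
)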



Source: https://stackoverflow.com/questions/34162468/copy-data-from-one-index-to-another-in-elastic-search
