_update_by_query for re-indexing Elasticsearch documents

走远了吗. Posted on 2019-12-11 14:59:03

Question


We are using Elasticsearch 5.5. We made some changes to the mapping:

  1. Added some new fields.
  2. Added an analyzer to some fields.
  3. Removed some fields.
  4. Excluded some existing fields from _all.
  5. Removed the same analyzer setting from some existing fields.

Here is what I understand: for cases 2 and 4, we need to re-index for the changes to take effect.

One approach I have come across for re-indexing is to use _update_by_query with conflicts=proceed.
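
To make that concrete, here is a minimal sketch of the request, assuming a cluster at http://localhost:9200 and an index called my-index (both hypothetical, as is the helper name). It only builds the request with Python's standard library and does not send it. With no query and no script in the body, _update_by_query re-indexes every document from its existing _source.

```python
import urllib.parse
import urllib.request


def build_update_by_query_request(base_url, index, conflicts="proceed"):
    """Build (but do not send) a POST request for <index>/_update_by_query.

    base_url and index are placeholders; substitute your own cluster and index.
    """
    query = urllib.parse.urlencode({"conflicts": conflicts})
    url = f"{base_url.rstrip('/')}/{index}/_update_by_query?{query}"
    # No body: without a query or script, every document is picked up
    # and re-indexed unchanged from its stored _source.
    return urllib.request.Request(url, method="POST")


# Hypothetical host and index name, for illustration only.
request = build_update_by_query_request("http://localhost:9200", "my-index")
print(request.get_method(), request.full_url)

# To actually send it against a reachable cluster:
#     with urllib.request.urlopen(request) as response:
#         print(response.read().decode("utf-8"))
```

The JSON response reports counters such as updated and version_conflicts, which is one way to see how many documents were skipped because they changed while the request was running.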

The only other ways I am aware of are the Reindex API and a few approaches that avoid downtime. But for case 2, where we add an analyzer to existing fields, how good or bad an idea is it to use _update_by_query to re-index? We may have tens of indices with 100 thousand docs in them.
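
For comparison, the Reindex API route could be sketched as below. The index names and the -v2 suffix are hypothetical, and the sketch only prints the request bodies instead of sending them. It assumes each destination index has already been created with the new mapping, since _reindex does not copy mappings or settings from the source index.

```python
import json


def build_reindex_body(source_index, dest_index):
    """Return the JSON body for POST _reindex copying source_index into dest_index."""
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
    }


# Hypothetical index names; in practice this list would hold the tens of indices.
indices = ["assets-a", "assets-b"]

for name in indices:
    # Assumption: "<name>-v2" already exists with the updated mapping.
    body = build_reindex_body(name, f"{name}-v2")
    print("POST _reindex", json.dumps(body))
```

Unlike _update_by_query, this writes into a separate index, so the old index stays searchable until clients (or an alias) are switched over to the new one.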

Also, does _update_by_query work on a snapshot? That is, once it starts, does it process only the assets that were in the index at the time it started? What happens if new documents are added while the request is in progress?

Would it also make case 5 take effect? Would it remove the previously analyzed data?

Source: https://stackoverflow.com/questions/58510637/updated-by-query-for-re-indexing-elasticsearch-documents
