How to retrieve all documents (size greater than 10000) in an Elasticsearch index

好久不见. Submitted on 2020-01-16 09:03:44

Question


I am trying to get all documents in an index. I tried the following:

1) Getting the total number of records first and then setting the /_search?size= parameter. This doesn't work, as the size parameter is restricted to 10000.

2) Paginating with multiple calls using the parameters '?size=1000&from=9000'. This worked until 'from' exceeded 9000, after which I got the size restriction error again:

"Result window is too large, from + size must be less than or equal to: [10000] but was [100000]. See the scroll api for a more efficient way to request large data sets. This limit can be set by changing the [index.max_result_window] index level setting"

So how can I retrieve all documents in the index? I read some answers suggesting the scroll API, and even the documentation states:

"While a search request returns a single “page” of results, the scroll API can be used to retrieve large numbers of results (or even all results) from a single search request, in much the same way as you would use a cursor on a traditional database."

But I couldn't find any sample query to get all records in a single request.

I have a total of 388794 documents in the index. Also note that this is a one-time call, so I am not worried about performance.


Answer 1:


Figured out the solution. The scroll API is the proper way to do it. Here's how it works:

In the first call to fetch the documents, provide a size (say 1000) and a scroll parameter specifying how long the search context is kept alive before it times out (1m means one minute).

POST /index/type/_search?scroll=1m
{
    "size": 1000,
    "query": {....
    }
}

For all subsequent calls, use the scroll_id returned in the response to the first call to get the next chunk of records.

POST /_search/scroll 
{
    "scroll" : "1m", 
    "scroll_id" : "DnF1ZXJ5VGhIOLSJJKSVNNZZND344D123RRRBNMBBNNN===" 
}
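
The two requests above can be chained into a loop that keeps scrolling until a page comes back empty. Below is a minimal Python sketch of that idea using only the standard library. The names scroll_all and http_post_json are hypothetical helpers made up for this illustration, and the base URL is an assumption. The sketch also assumes the response carries the scroll id in the "_scroll_id" field and the documents in "hits.hits", and it uses the path /index/_search without a type, so adjust the path if your cluster version expects /index/type/_search as in the request above.

```python
import json
from urllib import request


def http_post_json(url, body):
    """POST a JSON body to `url` and return the decoded JSON response."""
    req = request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.load(resp)


def scroll_all(base_url, index, query, page_size=1000, keep_alive="1m",
               post=http_post_json):
    """Yield every hit matching `query`, fetching one scroll page at a time.

    `post` is injectable so the loop can be exercised without a live cluster.
    """
    # First call: open the scroll context and fetch the first page.
    page = post(
        f"{base_url}/{index}/_search?scroll={keep_alive}",
        {"size": page_size, "query": query},
    )
    scroll_id = page.get("_scroll_id")

    # Keep requesting pages until one comes back with no hits.
    while page["hits"]["hits"]:
        yield from page["hits"]["hits"]
        page = post(
            f"{base_url}/_search/scroll",
            {"scroll": keep_alive, "scroll_id": scroll_id},
        )
        # Use the most recent scroll id for the next request.
        scroll_id = page.get("_scroll_id", scroll_id)


# Example usage (assumes a cluster at localhost:9200 and an index named "index"):
# docs = list(scroll_all("http://localhost:9200", "index", {"match_all": {}}))
# print(len(docs))
```

Because scroll_all is a generator, documents can be processed as they arrive instead of holding all of them in memory at once.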


Source: https://stackoverflow.com/questions/58713268/how-to-retrieve-all-documentssize-greater-than-10000-in-an-elasticsearch-index
