Elasticsearch Scroll

前端未结

关注

 4  1081

日久生厌 2020-12-30 06:09

I am little bit confused over Elasticsearch by its scroll functionality. In elasticsearch is it possible to call search API everytime whenever the user scrolls on the result

4条回答

梦谈多话 (楼主)

2020-12-30 06:30
What you are looking for is pagination.

You can achieve your objective by querying for a fixed size and setting the from parameter. Since you want to set display in batches of 250 results, you can set size = 250 and with each consecutive query, increment the value of from by 250.
```
GET /_search?size=250                     ---- return first 250 results
GET /_search?size=250&from=250            ---- next 250 results 
GET /_search?size=250&from=500            ---- next 250 results
```
On the contrary, Scan & scroll lets you retrieve a large set of results with a single search and is ideally meant for operations like re-indexing data into a new index. Using it for displaying search results in real-time is not recommended.

To explain Scan & scroll briefly, what it essentially does is that it scans the index for the query provided with the scan request and returns a scroll_id. This scroll_id can be passed to the next scroll request to return the next batch of results.

Consider the following example-
```
# Initialize the scroll
page = es.search(
  index = 'yourIndex',
  doc_type = 'yourType',
  scroll = '2m',
  search_type = 'scan',
  size = 1000,
  body = {
    # Your query's body
    }
)
sid = page['_scroll_id']
scroll_size = page['hits']['total']

# Start scrolling
while (scroll_size > 0):
  print "Scrolling..."
  page = es.scroll(scroll_id = sid, scroll = '2m')
  # Update the scroll ID
  sid = page['_scroll_id']
  # Get the number of results that we returned in the last scroll
  scroll_size = len(page['hits']['hits'])
  print "scroll size: " + str(scroll_size)
  # Do something with the obtained page
```
In above example, following events happen-
- Scroller is initialized. This returns the first batch of results along with the scroll_id
- For each subsequent scroll request, the updated scroll_id (received in the previous scroll request) is sent and next batch of results is returned.
- Scroll time is basically the time for which the search context is kept alive. If the next scroll request is not sent within the set timeframe, the search context is lost and results will not be returned. This is why it should not be used for real-time results display for indexes with a huge number of docs.
0 讨论(0)

查看其它4个回答
发布评论:

提交评论
- 加载中...