Question
How can I retrieve 100,000 records from Elasticsearch in Python? A match_all query only returns 10,000.
Answer 1:
As has already been pointed out, I would use the scan API for this.
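# Import helpers explicitly; "import elasticsearch" alone does not load the submodule
from elasticsearch import Elasticsearch, helpers

ES_HOST = {
    "host": "localhost",
    "port": 9200
}
ES_INDEX = "index_name"
ES_TYPE = "type_name"  # mapping types are deprecated in newer Elasticsearch versions

es = Elasticsearch(hosts=[ES_HOST])

# scan() wraps the scroll API and returns a generator over every matching hit
results_gen = helpers.scan(
    es,
    query={"query": {"match_all": {}}},
    index=ES_INDEX,
    doc_type=ES_TYPE
)

# Materialize the generator; this holds all hits in memory at once
results = list(results_gen)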
It is also worth reading the documentation for the scan helper in the elasticsearch-py client (Helpers): http://elasticsearch-py.readthedocs.io/en/master/helpers.html#scan
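Because `list(results_gen)` pulls every hit into memory at once, it can help to consume the generator in fixed-size batches instead. The following is a minimal sketch and not part of the original answer: `iter_batches` and its `batch_size` parameter are hypothetical names chosen for illustration, and the function works on any iterable, including the generator returned by the scan helper.

```python
from itertools import islice


def iter_batches(iterable, batch_size=1000):
    """Yield lists of up to batch_size items from any iterable (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(iterable)
    while True:
        # islice pulls at most batch_size items without exhausting the source
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

With the generator from the answer, this could be used as `for batch in iter_batches(results_gen, 5000): ...`, so that only one batch of hits is held in memory at a time.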
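Answer 2:
Elasticsearch rejects a search request when the sum of "from" (the offset) and "size" exceeds 10,000, which is the default value of the index.max_result_window setting.
To retrieve more than that, you need to use the scan (scroll) API. There is a handy helper for it in elasticsearch-py: http://elasticsearch-py.readthedocs.io/en/master/helpers.html#scan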
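To make the mechanism more concrete: the scan helper is an abstraction over the scroll API, which hands back results page by page rather than through "from"/"size" offsets. Below is a rough sketch of such a loop, not the helper's actual implementation. The names `scroll_all`, `page_size`, and `keep_alive` are hypothetical, and `client` is assumed to be an elasticsearch-py `Elasticsearch` instance of the older generation the answers target (newer client versions may prefer different keyword arguments than `body`).

```python
def scroll_all(client, index, query, page_size=1000, keep_alive="2m"):
    """Yield every hit for `query` by paging with the scroll API.

    Assumption: `client` exposes search/scroll/clear_scroll methods
    compatible with the elasticsearch-py client.
    """
    response = client.search(
        index=index,
        body=query,
        scroll=keep_alive,  # keep the search context alive between pages
        size=page_size,     # hits per page, not the total
    )
    scroll_id = response.get("_scroll_id")
    try:
        # An empty page means every matching document has been returned
        while response["hits"]["hits"]:
            for hit in response["hits"]["hits"]:
                yield hit
            response = client.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = response.get("_scroll_id", scroll_id)
    finally:
        # Release the server-side search context once we are done
        if scroll_id is not None:
            client.clear_scroll(scroll_id=scroll_id)
```

In practice the helper linked above is likely the better choice, since it already handles this bookkeeping; the sketch is only meant to show why the 10,000 limit does not apply to this style of retrieval.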
Source: https://stackoverflow.com/questions/41961245/how-to-retrieve-1m-documents-with-elasticsearch-in-python