from datetime import datetime
from elasticsearch import Elasticsearch

es = Elasticsearch()
doc = {
    'author': 'kimchy',
    'text': 'Elasticsearch: cool. bonsai cool.',
    'timestamp': datetime.now(),
}
res = es.index(index="test-index", doc_type="tweet", id=1, body=doc)
Two options that help:
Setting a timeout solved this problem for me. Note that newer versions need a unit, e.g. timeout="60s":
es.index(index=index_name, doc_type="domains", id=domain.id, body=body, timeout="60s")
Without a unit, for example by setting timeout=60, you'll get:
elasticsearch.exceptions.RequestError: RequestError(400, 'illegal_argument_exception', 'failed to parse setting [timeout] with value [60] as a time value: unit is missing or unrecognized')
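Alternatively, if what you are hitting is the client-side read timeout rather than the server-side one, elasticsearch-py also takes a per-request request_timeout given in plain seconds. A minimal sketch, assuming the same index_name, domain, and body as above:

# Hedged sketch: request_timeout is the client-side socket timeout in plain
# seconds (no unit string), unlike the server-side timeout parameter above.
es.index(index=index_name, doc_type="domains", id=domain.id, body=body, request_timeout=60)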
It also helps to reduce the text length, e.g. by cutting off long texts, so Elasticsearch can store the text faster, which avoids timeouts, too:
es.index(index=index_name, doc_type="domains", id=domain.id, body={"text": text[:5000]}, timeout="60s")
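To apply the same cutoff to every field, a small helper can do it in one place; this is just a sketch, and MAX_LEN and truncate_doc are hypothetical names, not part of elasticsearch-py:

# Hypothetical helper: cut every long string field down to MAX_LEN characters
# before indexing, so oversized documents don't stall the index call.
MAX_LEN = 5000
def truncate_doc(doc):
    return {k: (v[:MAX_LEN] if isinstance(v, str) else v) for k, v in doc.items()}

es.index(index=index_name, doc_type="domains", id=domain.id, body=truncate_doc(body), timeout="60s")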
Note that one of the common reasons for timeouts when doing es.search (or es.index) is large query size. For example, in my case of a pretty large ES index (> 3M documents), a search for a query with 30 words took around 2 seconds, while a search for a query with 400 words took over 18 seconds. So for a sufficiently large query even timeout=30 won't save you. An easy solution is to crop the query to a size that can be answered below the timeout, as in the sketch below.
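A minimal sketch of such cropping, assuming a match query on a text field; MAX_QUERY_WORDS and the field name "text" are assumptions to adapt:

# Hedged sketch: keep only the first MAX_QUERY_WORDS words of the query so the
# search can finish below the timeout.
MAX_QUERY_WORDS = 30
def cropped_search(es, index_name, query_text):
    cropped = " ".join(query_text.split()[:MAX_QUERY_WORDS])
    return es.search(index=index_name, body={"query": {"match": {"text": cropped}}})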
Increasing the timeout or retrying on timeout will help you if the cause was traffic; otherwise, query size might be your culprit.
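For the traffic case, elasticsearch-py lets you configure the client itself with a longer timeout and automatic retries; a sketch, where the numbers are placeholders to tune, not recommendations:

# Hedged sketch: client-level timeout (in seconds) plus automatic retries
# whenever a request times out.
es = Elasticsearch(timeout=60, max_retries=5, retry_on_timeout=True)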
My personal problem was solved with timeout=10000, which was practically never reached, because the server held only 7,000 entries; it had heavy traffic, though, and its resources were being hogged, which was why the connection was dropping.