I have an ES cluster with 4 nodes:
number_of_replicas: 1
search01 - master: false, data: false
search02 - master: true, data: true
search03 - master: false,
OK, I've solved this with some help from ES support. Issue the following command to the API on all nodes (or the nodes you believe to be the cause of the problem):
curl -XPUT 'localhost:9200//_settings' \
-d '{"index.routing.allocation.disable_allocation": false}'
where is the index you believe to be the culprit. If you have no idea, just run this on all nodes:
curl -XPUT 'localhost:9200/_settings' \
-d '{"index.routing.allocation.disable_allocation": false}'
I also added this line to my yaml config and since then, any restarts of the server/service have been problem free. The shards re-allocated back immediately.
FWIW, to answer an oft sought after question, set MAX_HEAP_SIZE to 30G unless your machine has less than 60G RAM, in which case set it to half the available memory.