I've upgraded my Elasticsearch cluster from 1.1 to 1.2 and I get errors when indexing a somewhat big string.
```
{
  "error": "IllegalArgumentException[Docu
```
There is a better option than the one John posted: with that solution you can no longer search on the value.
Back to the problem:
The problem is that, by default, the field value is indexed as a single term (the complete string). If that term is longer than 32766 bytes, Lucene cannot store it.
Older versions of Lucene only logged a warning when a term was too long (and ignored the value). Newer versions throw an exception. See the bug fix: https://issues.apache.org/jira/browse/LUCENE-5472
Solution:
The best option is to define a (custom) analyzer on the field with the long string value. The analyzer can split the long string into smaller terms, which fixes the problem of terms that are too long.
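For example, a mapping along these lines applies a custom analyzer to the field (Elasticsearch 1.x syntax; the index, type, field, and analyzer names here are made up for illustration):

```json
PUT /my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "long_text_analyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "my_type": {
      "properties": {
        "big_field": {
          "type": "string",
          "analyzer": "long_text_analyzer"
        }
      }
    }
  }
}
```

Because the standard tokenizer splits on word boundaries, each indexed term stays far below the 32766-byte limit (unless a single unbroken "word" is itself that long).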
Don't forget to also add an analyzer to the "_all" field if you are using that functionality.
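A sketch of what that could look like in the mapping (again, the type and analyzer names are illustrative):

```json
"mappings": {
  "my_type": {
    "_all": {
      "analyzer": "long_text_analyzer"
    }
  }
}
```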
Analyzers can be tested with the REST API: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-analyze.html
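For instance, a request like this (assuming a custom analyzer named `long_text_analyzer` exists on the index) shows which terms the analyzer produces for a given input:

```
GET /my_index/_analyze?analyzer=long_text_analyzer&text=a somewhat big string
```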