UTF8 encoding is longer than the max length 32766

Front-end | Unresolved | 10 replies | 963 views
鱼传尺愫 2020-11-29 01:39

I've upgraded my Elasticsearch cluster from 1.1 to 1.2, and now I get errors when indexing a somewhat big string.

{
  "error": "IllegalArgumentException[Docu
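The 32766 in the error is Lucene's maximum size for a single indexed term, and it is measured in UTF-8 bytes rather than characters, so multi-byte text hits the limit with far fewer characters than ASCII does. A quick Ruby check illustrates the difference:

```ruby
ascii     = "a" * 32_000   # 1 byte per character in UTF-8
multibyte = "愫" * 32_000  # 3 bytes per character in UTF-8

ascii.length       # 32000 characters
ascii.bytesize     # 32000 bytes, just under the 32766 limit
multibyte.length   # also 32000 characters
multibyte.bytesize # 96000 bytes, nearly three times the limit
```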


        
10 replies
  •  眼角桃花
    2020-11-29 02:34

    When using Logstash to index those long messages, I use this filter to truncate the long string:

        filter {
            ruby {
                code => "event.set('message_size', event.get('message').bytesize) if event.get('message')"
            }
            ruby {
                code => "
                    if event.get('message_size')
                        event.set('message', event.get('message')[0..9999]) if event.get('message_size') > 32000
                        event.tag('long message') if event.get('message_size') > 32000
                    end
                "
            }
        }
    

    It adds a message_size field so that I can sort the longest messages by size.

    It also adds the long message tag to those that are over 32000 bytes so I can select them easily.

    It doesn't solve the problem if you intend to index those long messages in full, but if, like me, you don't want them in Elasticsearch in the first place and just want to track them down and fix them, it's a working solution.
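    One caveat with the filter above: `message[0..9999]` slices by characters, and 10000 characters of multi-byte UTF-8 (up to 4 bytes each) can still exceed the 32766-byte term limit. Truncating by bytes avoids that; here is a minimal Ruby sketch (`truncate_bytes` is a hypothetical helper, not part of Logstash):

```ruby
LIMIT = 32_000 # stay comfortably under Lucene's 32766-byte term limit

# Truncate a string to at most `limit` bytes without producing invalid
# UTF-8: byteslice may cut a multi-byte character in half, and
# scrub("") drops the broken trailing bytes.
def truncate_bytes(message, limit = LIMIT)
  return message if message.bytesize <= limit

  message.byteslice(0, limit).scrub("")
end
```

    Inside the Logstash ruby filter, the same idea would be `event.set('message', event.get('message').byteslice(0, 32000).scrub(''))`.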
