Elasticsearch unassigned shards CircuitBreakingException[[parent] Data too large

霸气de小男生 提交于 2021-01-01 13:58:38

问题


I got alert stating elasticsearch has 2 unassigned shards. I made below api calls to gather more details.

    curl -s http://localhost:9200/_cluster/allocation/explain | python -m json.tool

Output below

    "allocate_explanation": "cannot allocate because allocation is not permitted to any of the nodes",
    "can_allocate": "no",
    "current_state": "unassigned",
    "index": "docs_0_1603929645264",
    "node_allocation_decisions": [
        {
            "deciders": [
                {
                    "decider": "max_retry",
                    "decision": "NO",
                    "explanation": "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2020-10-30T06:10:16.305Z], failed_attempts[5], delayed=false, details[failed shard on node [o_9jyrmOSca9T12J4bY0Nw]: failed recovery, failure RecoveryFailedException[[docs_0_1603929645264][0]: Recovery failed from {elasticsearch-data-1}{fIaSuZsNTwODgZnt90f7kQ}{Qxl9iPacQVS-tN_t4YJqrw}{IP1}{IP:9300} into {elasticsearch-data-0}{o_9jyrmOSca9T12J4bY0Nw}{1w5mgwy0RYqBQ9c-qA_6Hw}{IP}{IP:9300}]; nested: RemoteTransportException[[elasticsearch-data-1][IP:9300][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[Phase[1] phase1 failed]; nested: RecoverFilesRecoveryException[Failed to transfer [129] files with total size of [4.4gb]]; nested: RemoteTransportException[[elasticsearch-data-0][IP2:9300][internal:index/shard/recovery/file_chunk]]; nested: 
CircuitBreakingException[[parent] Data too large, data for [<transport_request>] would be [1972835086/1.8gb], which is larger than the limit of [1972122419/1.8gb], real usage: [1972833976/1.8gb], new bytes reserved: [1110/1kb]]; ], allocation_status[no_attempt]]]"
                }
            ],
            "node_decision": "no",
            "node_id": "1XEXS92jTK-asdfasdfasdf",
            "node_name": "elasticsearch-data-2",
            "transport_address": "IP1:9300"
        },
        {
            "deciders": [
                {
                    "decider": "max_retry",
                    "decision": "NO",
                    "explanation": "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2020-10-30T06:10:16.305Z], failed_attempts[5], delayed=false, details[failed shard on node [o_9jyrmOSca9T12J4bY0Nw]: failed recovery, failure RecoveryFailedException[[docs_0_1603929645264][0]: Recovery failed from {elasticsearch-data-1}{fIaSuZsNTwODgZnt90f7kQ}{Qxl9iPacQVS-tN_t4YJqrw}{IP1}{IP1:9300} into {elasticsearch-data-0}{o_9jyrmOSca9T12J4bY0Nw}{1w5mgwy0RYqBQ9c-qA_6Hw}{IP2}{IP2:9300}]; nested: RemoteTransportException[[elasticsearch-data-1][IP1:9300][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[Phase[1] phase1 failed]; nested: RecoverFilesRecoveryException[Failed to transfer [129] files with total size of [4.4gb]]; nested: RemoteTransportException[[elasticsearch-data-0][IP2:9300][internal:index/shard/recovery/file_chunk]]; nested: 
CircuitBreakingException[[parent] Data too large, data for [<transport_request>] would be [1972835086/1.8gb], which is larger than the limit of [1972122419/1.8gb], real usage: [1972833976/1.8gb], new bytes reserved: [1110/1kb]]; ], allocation_status[no_attempt]]]"
                },
                {
                    "decider": "same_shard",
                    "decision": "NO",
                    "explanation": "the shard cannot be allocated to the same node on which a copy of the shard already exists [[docs_0_1603929645264][0], node[fIaSuZsNTwODgZnt90f7kQ], [P], s[STARTED], a[id=stHnyqjLQ7OwFbaqs5vWqA]]"
                }
            ],
            "node_decision": "no",
            "node_id": "fIaSuZsNTwODgZnt90f7kQ",
            "node_name": "elasticsearch-data-1",
            "transport_address": "IP1:9300"
        },
        {
            "deciders": [
                {
                    "decider": "max_retry",
                    "decision": "NO",
                    "explanation": "shard has exceeded the maximum number of retries [5] on failed allocation attempts - manually call [/_cluster/reroute?retry_failed=true] to retry, [unassigned_info[[reason=ALLOCATION_FAILED], at[2020-10-30T06:10:16.305Z], failed_attempts[5], delayed=false, details[failed shard on node [o_9jyrmOSca9T12J4bY0Nw]: failed recovery, failure RecoveryFailedException[[docs_0_1603929645264][0]: Recovery failed from {elasticsearch-data-1}{fIaSuZsNTwODgZnt90f7kQ}{Qxl9iPacQVS-tN_t4YJqrw}{IP1}{IP1:9300} into {elasticsearch-data-0}{o_9jyrmOSca9T12J4bY0Nw}{1w5mgwy0RYqBQ9c-qA_6Hw}{Ip2}{IP2:9300}]; nested: RemoteTransportException[[elasticsearch-data-1][IP1:9300][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[Phase[1] phase1 failed]; nested: RecoverFilesRecoveryException[Failed to transfer [129] files with total size of [4.4gb]]; nested: RemoteTransportException[[elasticsearch-data-0][IP2:9300][internal:index/shard/recovery/file_chunk]]; nested: 
CircuitBreakingException[[parent] Data too large, data for [<transport_request>] would be [1972835086/1.8gb], which is larger than the limit of [1972122419/1.8gb], real usage: [1972833976/1.8gb], new bytes reserved: [1110/1kb]]; ], allocation_status[no_attempt]]]"
                }
            ],
            "node_decision": "no",
            "node_id": "o_9jyrmOSca9T12J4bY0Nw",
            "node_name": "elasticsearch-data-0",
            "transport_address": "IP2:9300"
        }
    ],
    "primary": false,
    "shard": 0,
    "unassigned_info": {
        "at": "2020-10-30T06:10:16.305Z",
        "details": "failed shard on node [o_9jyrmOSca9T12J4bY0Nw]: failed recovery, failure RecoveryFailedException[[docs_0_1603929645264][0]: Recovery failed from {elasticsearch-data-1}{fIaSuZsNTwODgZnt90f7kQ}{Qxl9iPacQVS-tN_t4YJqrw}{IP1}{IP1:9300} into {elasticsearch-data-0}{o_9jyrmOSca9T12J4bY0Nw}{1w5mgwy0RYqBQ9c-qA_6Hw}{IP2}{IP2:9300}]; nested: RemoteTransportException[[elasticsearch-data-1][IP1:9300][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[Phase[1] phase1 failed]; nested: RecoverFilesRecoveryException[Failed to transfer [129] files with total size of [4.4gb]]; nested: RemoteTransportException[[elasticsearch-data-0][IP2:9300][internal:index/shard/recovery/file_chunk]]; nested: 
CircuitBreakingException[[parent] Data too large, data for [<transport_request>] would be [1972835086/1.8gb], which is larger than the limit of [1972122419/1.8gb], real usage: [1972833976/1.8gb], new bytes reserved: [1110/1kb]]; ",
        "failed_allocation_attempts": 5,
        "last_allocation_status": "no_attempt",
        "reason": "ALLOCATION_FAILED"
    }
}

I queried for the breaker config

    curl -X GET "localhost:9200/_nodes/stats/breaker?pretty

And can see that the parent limit_size_in_byes of 3 nodes (elasticsearch-data-0, elasticsearch-data-1 and elasticsearch-data-2) is as below.

"parent" : {
          "limit_size_in_bytes" : 1972122419,
          "limit_size" : "1.8gb",
          "estimated_size_in_bytes" : 1648057776,
          "estimated_size" : "1.5gb",
          "overhead" : 1.0,
          "tripped" : 139
        }

I referred to this answer https://stackoverflow.com/a/61954408 and planning to increase either the memory percentage of the circuit breaker or overall JVM heap.

This is a k8s environment and the elasticsearch-data is deployed as a statefulset with 3 replicas. When I did a describe of the statefulset, I can see the below ENV variable defined

Containers:
   elasticsearch:
    Image:      custom/elasticsearch-oss-s3:7.0.0
    Port:       9300/TCP
    Host Port:  0/TCP
    Limits:
      cpu:     10500m
      memory:  21Gi
    Requests:
      cpu:      10
      memory:   20Gi
    Environment:
      DISCOVERY_SERVICE:     elasticsearch-discovery
      NODE_MASTER:           false
      PROCESSORS:            11 (limits.cpu)
      ES_JAVA_OPTS:          -Djava.net.preferIPv4Stack=true -Xms2048m -Xmx2048m

As per this, the heap size seems to be 2048m

I logged into a elasticsearch-data pod and under the elastic search config directory I see below files

elasticsearch.keystore  elasticsearch.yml  jvm.options  log4j2.properties  repository-s3

elasticsearch.yml doesn't have any heap config or as such. It just had the master nodes' name etc..

Below is the jvm.options file


## JVM configuration

# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space

-Xms1g
-Xmx1g


## GC configuration
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly


## DNS cache policy
-Des.networkaddress.cache.ttl=60
-Des.networkaddress.cache.negative.ttl=10


# pre-touch memory pages used by the JVM during initialization
-XX:+AlwaysPreTouch

## basic

# explicitly set the stack size
-Xss1m

# set to headless, just in case
-Djava.awt.headless=true

# ensure UTF-8 encoding by default (e.g. filenames)
-Dfile.encoding=UTF-8

# use our provided JNA always versus the system one
-Djna.nosys=true

# turn off a JDK optimization that throws away stack traces for common
# exceptions because stack traces are important for debugging
-XX:-OmitStackTraceInFastThrow


# flags to configure Netty
-Dio.netty.noUnsafe=true
-Dio.netty.noKeySetOptimization=true
-Dio.netty.recycler.maxCapacityPerThread=0

# log4j 2
-Dlog4j.shutdownHookEnabled=false
-Dlog4j2.disable.jmx=true

-Djava.io.tmpdir=${ES_TMPDIR}

## heap dumps

# generate a heap dump when an allocation from the Java heap fails
# heap dumps are created in the working directory of the JVM
-XX:+HeapDumpOnOutOfMemoryError

# specify an alternative path for heap dumps; ensure the directory exists and
# has sufficient space
-XX:HeapDumpPath=data

# specify an alternative path for JVM fatal error logs
-XX:ErrorFile=logs/hs_err_pid%p.log

## JDK 8 GC logging

8:-XX:+PrintGCDetails
8:-XX:+PrintGCDateStamps
8:-XX:+PrintTenuringDistribution
8:-XX:+PrintGCApplicationStoppedTime
8:-Xloggc:logs/gc.log
8:-XX:+UseGCLogFileRotation
8:-XX:NumberOfGCLogFiles=32
8:-XX:GCLogFileSize=64m

# JDK 9+ GC logging
9-:-Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m
# due to internationalization enhancements in JDK 9 Elasticsearch need to set the provider to COMPAT otherwise
# time/date parsing will break in an incompatible way for some date patterns and locals
9-:-Djava.locale.providers=COMPAT

From above, it seems the total heap size is 1g.

But from the env variable defined in the stateful set of this pod, it seems to be 2048m.

Which one is right?

Now, from the below link

Circuit breaker settings | Elasticsearch

The parent-level breaker can be configured with the following settings:

indices.breaker.total.use_real_memory (Static) Determines whether the parent breaker should take real memory usage into account (true) or only consider the amount that is reserved by child circuit breakers (false). Defaults to true.

indices.breaker.total.limit (Dynamic) Starting limit for overall parent breaker. Defaults to 70% of JVM heap if indices.breaker.total.use_real_memory is false. If indices.breaker.total.use_real_memory is true, defaults to 95% of the JVM heap.

But the limit value in the error and in the breaker stats I queried is this - 1972122419 bytes (1.8G). This doesn't seem to be 95% of 2048M or 1g.

Now, how can I increase the heap or the memory limit of breaker parent so that I can get rid of this error?


回答1:


two things here, shard allocation exception and circuit breaker exception(nested exception as it looks).

Please use the below command in your cluster to re-trigger the allocation as previous all retry was failed and the same is suggested in your exception message if you carefully notice. more info on below command is on this related Github issue comment.

curl -XPOST ':9200/_cluster/reroute?retry_failed

If still, it doesn't work, then you have to fix the parent circuit breaker exception, you should use the http://localhost:9200/_nodes/stats API to know the exact heap of your ES nodes, and accordingly increase it.



来源:https://stackoverflow.com/questions/64606301/elasticsearch-unassigned-shards-circuitbreakingexceptionparent-data-too-large

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!