Ignite issue: an unstable node is unable to join the cluster and hangs indefinitely

我们两清 submitted on 2019-12-03 20:50:57

Partition map exchange is the process by which nodes exchange information about where each piece of data is stored. It happens every time the topology changes.

Every node sends a GridDhtPartitionsSingleMessage to the coordinator. Once the coordinator has collected all such messages, it sends a GridDhtPartitionsFullMessage back to the other nodes. These messages are sent over the communication SPI.

If any non-coordinator node doesn't send its SingleMessage to the coordinator, or if the coordinator doesn't send the FullMessage, the "Failed to wait for partition map exchange" error occurs.
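The handshake described above can be modelled in a few lines of plain Java. This is only a toy sketch of the coordinator's bookkeeping, not Ignite's actual implementation: the class ExchangeSketch and the node IDs "node-a" and "node-b" are hypothetical, and only ba6aba6c comes from the log discussed here. It shows why one silent node is enough to block the exchange.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

/** Toy model of the coordinator side of an exchange; NOT Ignite's real code. */
public class ExchangeSketch {
    // Nodes whose single message has not arrived yet.
    private final Set<String> pending;

    public ExchangeSketch(Set<String> nonCoordinatorNodeIds) {
        this.pending = new LinkedHashSet<>(nonCoordinatorNodeIds);
    }

    /** Records a single message from a node; returns true once all have arrived. */
    public boolean onSingleMessage(String nodeId) {
        pending.remove(nodeId);
        return pending.isEmpty();
    }

    /** The full message can only go out when nothing is pending. */
    public boolean canSendFullMessage() {
        return pending.isEmpty();
    }

    /** Returns a copy of the IDs the coordinator is still waiting for. */
    public Set<String> waitingFor() {
        return new LinkedHashSet<>(pending);
    }

    public static void main(String[] args) {
        // "node-a" and "node-b" are made-up IDs for illustration.
        ExchangeSketch ex = new ExchangeSketch(
            new LinkedHashSet<>(Arrays.asList("node-a", "node-b", "ba6aba6c")));

        ex.onSingleMessage("node-a");
        ex.onSingleMessage("node-b");

        // ba6aba6c never reports, so the exchange cannot complete.
        System.out.println("Can send full message: " + ex.canSendFullMessage());
        System.out.println("Still waiting for: " + ex.waitingFor());
    }
}
```

In this model the coordinator stays blocked until the last pending ID is removed, which corresponds to the situation where the other nodes keep waiting for the FullMessage.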

Judging by the log excerpt you provided, the node with ID=ba6aba6c didn't send its SingleMessage to the coordinator. This may mean that the communication SPI isn't working properly on that node. Make sure the ports required by the communication SPI are available; usually these are 47100..47200.
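One rough way to check those ports from another machine in the cluster is a plain TCP connect probe. The sketch below uses only the JDK; the class name CommPortProbe, the default host 127.0.0.1, and the 200 ms timeout are assumptions for illustration, and the range 47100..47200 is taken from the text above. Adjust it if your communication SPI is configured differently.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: probes a TCP port range on a host. */
public class CommPortProbe {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    public static boolean isOpen(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolvable host: treat as not reachable.
            return false;
        }
    }

    /** Returns every port in [from, to] that accepted a connection. */
    public static List<Integer> openPorts(String host, int from, int to, int timeoutMs) {
        List<Integer> open = new ArrayList<>();
        for (int p = from; p <= to; p++) {
            if (isOpen(host, p, timeoutMs)) {
                open.add(p);
            }
        }
        return open;
    }

    public static void main(String[] args) {
        // Pass the address of the suspect node as the first argument.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        System.out.println("Open ports on " + host + ": "
            + openPorts(host, 47100, 47200, 200));
    }
}
```

A node normally listens on only one port in the range, so a short list is expected. An empty list for a host that should be running a node suggests that a firewall or a binding problem is blocking communication. A successful connect only shows that something is listening on the port, not that it is an Ignite node.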

The joining node may also be stuck on something else. Look at its log to figure out what is happening there.
