Spark cluster full of heartbeat timeouts, executors exiting on their own
可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试): 问题: My Apache Spark cluster is running an application that is giving me lots of executor timeouts: 10:23:30,761 ERROR ~ Lost executor 5 on slave2.cluster: Executor heartbeat timed out after 177005 ms 10:23:30,806 ERROR ~ Lost executor 1 on slave4.cluster: Executor heartbeat timed out after 176991 ms 10:23:30,812 ERROR ~ Lost executor 4 on slave6.cluster: Executor heartbeat timed out after 176981 ms 10:23:30,816 ERROR ~ Lost executor 6 on slave3.cluster: Executor heartbeat timed out after 176984 ms 10:23:30,820 ERROR ~ Lost executor 0 on slave5