I am generating a hierarchy for a table, determining the parent-child relationships.
Below is the configuration used; even with it, I still get the "Too large frame" error.
I was experiencing the same issue while working on a ~700 GB dataset. Decreasing spark.maxRemoteBlockSizeFetchToMem didn't help in my case, and I wasn't able to increase the number of partitions either.
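For reference, this is roughly how those first attempts look when building the session; the values here are illustrative assumptions, not a recommendation:

```scala
import org.apache.spark.sql.SparkSession

// What I tried first (illustrative values) -- it did NOT fix "Too large frame" for me,
// but this is where these settings would go:
val spark = SparkSession.builder()
  .appName("hierarchy-build")
  // stream remote shuffle blocks bigger than this to disk instead of memory
  .config("spark.maxRemoteBlockSizeFetchToMem", "200m")
  // more shuffle partitions -> smaller individual shuffle blocks
  .config("spark.sql.shuffle.partitions", "2000")
  .getOrCreate()
```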
Doing the following worked for me (see the sketch after this list for how to set them):
- Setting spark.network.timeout=600s (the default is 120s in Spark 2.3). This timeout also drives the following settings, which default to its value:
  - spark.core.connection.ack.wait.timeout
  - spark.storage.blockManagerSlaveTimeoutMs
  - spark.shuffle.io.connectionTimeout
  - spark.rpc.askTimeout
  - spark.rpc.lookupTimeout
- Setting spark.io.compression.lz4.blockSize=512k (the default is 32k in Spark 2.3)
- Setting spark.shuffle.file.buffer=1024k (the default is 32k in Spark 2.3)
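For completeness, a minimal sketch of how these three settings can be applied when the session is created (assuming you are not already pinning them via spark-submit --conf):

```scala
import org.apache.spark.sql.SparkSession

// Settings that worked for me on Spark 2.3; tune the values to your cluster
val spark = SparkSession.builder()
  .appName("hierarchy-build")
  // longer network timeout (also raises the dependent timeouts listed above)
  .config("spark.network.timeout", "600s")
  // larger LZ4 block size used when compressing shuffle data
  .config("spark.io.compression.lz4.blockSize", "512k")
  // larger per-writer buffer for shuffle output files
  .config("spark.shuffle.file.buffer", "1024k")
  .getOrCreate()
```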