How to prevent a Spark executor from getting lost and the YARN container from killing it due to memory limits?

慢半拍i 2020-12-13 20:50

I have the following code, which fires hiveContext.sql() most of the time. My task is to create a few tables and insert values into them after processing for all

2 Answers
  •  [愿得一人]
    2020-12-13 21:19

    My assumption is that you have a limited number of executors on your cluster and that the job is running in a shared environment.

    As you said, your file size is small, so you can set a smaller number of executors and increase the cores per executor; setting the memoryOverhead property is important here.

    1. Set number of executors = 5
    2. Set number of executor cores = 4
    3. Set memory overhead = 2G
    4. Set shuffle partitions = 20 (to use the maximum parallelism available: 5 executors x 4 cores)

    With the above properties, I am sure you will avoid executor out-of-memory issues without compromising performance.
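    As a rough illustration only, here is a minimal PySpark sketch of how these settings could be applied when building the session; the application name, the base executor memory value, and the placeholder query are assumptions, and the overhead property shown is the older YARN name (`spark.yarn.executor.memoryOverhead`, in MB), which later Spark releases renamed to `spark.executor.memoryOverhead`.

    ```python
    from pyspark import SparkConf
    from pyspark.sql import SparkSession

    # Settings mirroring the answer: 5 executors, 4 cores each,
    # 2 GB memory overhead, 20 shuffle partitions.
    conf = (
        SparkConf()
        .setAppName("hive-insert-job")                       # hypothetical app name
        .set("spark.executor.instances", "5")
        .set("spark.executor.cores", "4")
        .set("spark.executor.memory", "4g")                  # assumed base heap size
        .set("spark.yarn.executor.memoryOverhead", "2048")   # 2 GB off-heap overhead, in MB
        .set("spark.sql.shuffle.partitions", "20")           # 5 executors x 4 cores
    )

    spark = (
        SparkSession.builder
        .config(conf=conf)
        .enableHiveSupport()    # replaces the old HiveContext for Hive table access
        .getOrCreate()
    )

    # Placeholder for the actual processing; the question's real queries go here.
    spark.sql("SHOW TABLES").show()
    ```

    The same values can equally be passed as `--conf` options (or `--num-executors`, `--executor-cores`) on the spark-submit command line, which is often preferable in a shared cluster so the resource request is visible to YARN up front.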
