Question
Whenever I add more than 10 executors, my jobs become a lot slower; with more than 15 executors, they start to crash. I generally use 4 cores per executor but have tried 2 to 5. I am using YARN and PySpark 2.1. The errors I receive are:
ERROR TransportRequestHandler: Error sending result RpcResponse
WARN NettyRpcEndpointRef: Error sending message
Future timed out after [10 seconds]
I have read that most people get this error because of OOM errors, but there is no sign of that anywhere in my stderr logs. I have tried changing spark.executor.heartbeatInterval to 30s, and that makes the "Future timed out" warning less frequent, but the results are the same.
I have tried to get better results using different numbers of partitions, varying from 30 to 1000. I have tried increasing my executor memory to 10g, even though I don't think that is the problem. I have tried everything from small datasets of only a few megabytes up to larger datasets of 50 GB. The only time I can get a lot of executors to work is when I am doing a very simple job, like reading in files and writing them somewhere else. In that situation the executors don't have to exchange (shuffle) data, so I'm wondering whether that is somehow the problem. Every other job where I do any aggregation, or collecting, or basically anything else, gives me the same errors, or at least extremely slow execution. I am just hoping there is some other suggestion I can try.
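For reference, here is a minimal sketch of how settings like these can be applied (on YARN they are usually passed to spark-submit instead; the values and path below are illustrative, not my exact ones):

```python
from pyspark.sql import SparkSession

# Illustrative settings only -- the actual values varied between runs.
spark = (
    SparkSession.builder
    .appName("slow-with-many-executors")                # hypothetical app name
    .config("spark.executor.instances", "15")           # tried roughly 10-20 executors
    .config("spark.executor.cores", "4")                # tried 2-5 cores per executor
    .config("spark.executor.memory", "10g")             # bumped up while debugging
    .config("spark.executor.heartbeatInterval", "30s")  # makes the timeout warning less frequent
    .getOrCreate()
)

# Partition counts tried ranged from 30 to 1000.
df = spark.read.parquet("hdfs:///path/to/input")        # hypothetical input path
df = df.repartition(200)
```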
Answer 1:
When allocating resources, you mainly have to look at the hardware settings of your cluster. Optimal provisioning is quite a tricky thing.
- Number of Nodes
- VCores
- Memory in each node
Based on these three, you have to decide the following:
- num-executors
- executor-cores
- executor-memory
Most of the time, setting --executor-cores to more than 5 gives degraded performance, so set it to 5.
Set num-executors = [{Number of Nodes * (VCores - 1)} / executor-cores] - 1
The simple rule behind this: set aside one core per node for the YARN/Hadoop daemons, and one whole executor for the ApplicationMaster.
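For example, on a hypothetical cluster of 10 nodes with 16 VCores each: num-executors = [{10 * (16 - 1)} / 5] - 1 = 30 - 1 = 29.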
Set executor-memory = [(M - 1) / {(VCores - 1) / executor-cores}] * (1 - 0.07)
Here, M is the memory (in GB) on each node, {(VCores - 1) / executor-cores} is the number of executors per node, and 0.07 is the fraction you have to set aside for off-heap memory overhead.
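Continuing the same hypothetical cluster with M = 64 GB per node: executors per node = (16 - 1) / 5 = 3, so executor-memory = [(64 - 1) / 3] * (1 - 0.07) ≈ 19 GB.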
Again, these formulas are nowhere etched in stone, so adjust them according to your use case. They are nothing but some generic rules that I follow.
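Putting the two formulas together, here is a minimal sketch in Python for a hypothetical cluster of 10 nodes with 16 VCores and 64 GB of memory each (adjust the numbers to your own hardware; passing the results to spark-submit works just as well):

```python
from pyspark.sql import SparkSession

# Hypothetical cluster -- replace with your own hardware settings.
nodes = 10          # Number of Nodes
vcores = 16         # VCores per node
mem_gb = 64         # Memory (GB) per node

executor_cores = 5  # more than 5 cores per executor tends to degrade performance

# One core per node is set aside for the YARN/Hadoop daemons,
# and one whole executor is set aside for the ApplicationMaster.
executors_per_node = (vcores - 1) // executor_cores           # 3
num_executors = nodes * executors_per_node - 1                # 29

# Split the usable node memory across executors and leave ~7% for off-heap overhead.
executor_memory_gb = int((mem_gb - 1) / executors_per_node * (1 - 0.07))  # 19

spark = (
    SparkSession.builder
    .appName("tuned-job")  # hypothetical app name
    .config("spark.executor.instances", str(num_executors))
    .config("spark.executor.cores", str(executor_cores))
    .config("spark.executor.memory", "{}g".format(executor_memory_gb))
    .getOrCreate()
)
```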
Hope this helps.
Source: https://stackoverflow.com/questions/48236440/pyspark-adding-executors-makes-app-slower