Hi, I am a beginner in Spark. I am trying to run a job on S
When I run it on this folder, it throws an ExecutorLostFailure every time.
This error occurs because a task failed more than four times (the default for spark.task.maxFailures). Try increasing the parallelism in your cluster using the following parameter:
--conf "spark.default.parallelism=100"
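If you prefer to set this in code rather than on the spark-submit command line, here is a minimal sketch; the app name "example-job" and the value 100 are placeholders, not anything from your setup:

import org.apache.spark.{SparkConf, SparkContext}

object ParallelismExample {
  def main(args: Array[String]): Unit = {
    // Same setting as the --conf flag above, applied programmatically.
    val conf = new SparkConf()
      .setAppName("example-job")
      .set("spark.default.parallelism", "100")
    val sc = new SparkContext(conf)

    // ... your job here ...

    sc.stop()
  }
}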
Set the parallelism value to 2 to 3 times the number of cores available on your cluster. If that doesn't work, try increasing the parallelism exponentially, i.e. if your current parallelism doesn't work, multiply it by two, and so on. Also, I have observed that it helps if your level of parallelism is a prime number, especially if you are using groupByKey.
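For what it's worth, you can also pass a partition count directly to groupByKey, which overrides spark.default.parallelism for that shuffle. A small sketch, assuming `sc` is an existing SparkContext; the data and the count 101 (a prime) are placeholders for illustration:

// Hypothetical pair RDD just to show the call shape.
val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))

// Explicit partition count for this shuffle only; 101 is an example prime.
val grouped = pairs.groupByKey(numPartitions = 101)

grouped.collect().foreach { case (k, vs) => println(s"$k -> ${vs.mkString(",")}") }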