Spark Exception: Task failed while writing rows

Submitted by 风流意气都作罢 on 2019-12-03 14:48:08

Another possible reason is that you're hitting S3 request rate limits. If you look closely at your logs, you may see something like this:

AmazonS3Exception: Please reduce your request rate.

The Spark UI, meanwhile, will only say:

Task failed while writing rows

I doubt it's the cause of your particular issue, but it is a possible one if you're running a highly write-intensive job, so I've included it for completeness.
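If throttling really is the culprit, one mitigation is to cut the number of tasks writing to the same S3 prefix at once. Here's a minimal sketch; the `spark` session, `df`, and the bucket paths are placeholders, not from the original question:

// Reducing concurrent writers eases the request rate against a single
// S3 prefix. `spark` is assumed to be an existing SparkSession.
val df = spark.read.parquet("s3a://my-bucket/input")   // hypothetical input

df.coalesce(32)   // fewer simultaneous PUT requests per prefix
  .write
  .mode("overwrite")
  .parquet("s3a://my-bucket/output")                   // hypothetical output

Recent versions of the S3A connector can also retry on throttle responses (e.g. via `fs.s3a.retry.limit`), though the exact option names depend on your Hadoop version, so check its documentation before relying on them.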

I found that disabling speculation prevents this error from happening. I'm not entirely sure why, but it seems that speculative and non-speculative tasks conflict when writing Parquet rows.

sparkConf.set("spark.speculation", "false")
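The same setting can be applied when building the session, before any write jobs run; a minimal sketch, with a hypothetical application name:

import org.apache.spark.sql.SparkSession

// Disable speculative execution so duplicate task attempts never
// race each other on the same output files.
val spark = SparkSession.builder()
  .appName("parquet-writer")             // hypothetical app name
  .config("spark.speculation", "false")
  .getOrCreate()

It can equally be passed on the command line with `--conf spark.speculation=false` if you'd rather not change code.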

This is where having all the source to hand helps: paste the stack trace into an IDE that can jump from a stack trace to lines of code, and see what it says. It's probably just some init/config problem.

If it is still relevant: my experience with this issue was that I had not started Hadoop. If you run Spark on top of it, it might be worth starting Hadoop and checking again.
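A quick way to confirm HDFS is actually up before blaming the write itself is to probe it with the same configuration Spark uses; a minimal sketch, assuming an existing `spark` session:

import org.apache.hadoop.fs.{FileSystem, Path}

// Probe HDFS with the Hadoop configuration Spark is using; this fails
// fast with a connection error if the NameNode isn't running.
val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
println(fs.exists(new Path("/")))   // prints true when HDFS is reachable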

In my case, I saw this error when I tried to overwrite an HDFS directory that belonged to a different user. Deleting the directory and letting my process write it from scratch solved it, so I'd guess more digging in the direction of HDFS user permissions is appropriate.
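To check whether ownership is the problem before rerunning a long job, you can inspect the target directory's owner and permissions through the Hadoop FileSystem API; a sketch, where the output path is hypothetical:

import org.apache.hadoop.fs.{FileSystem, Path}

// Print who owns the output directory and what its permissions are,
// using the same configuration the Spark job will write with.
val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
val out = new Path("/user/someone-else/output")   // hypothetical path
if (fs.exists(out)) {
  val status = fs.getFileStatus(out)
  println(s"owner=${status.getOwner} perms=${status.getPermission}")
}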
