Spark Exception: Task failed while writing rows

南旧 2021-01-02 17:13

I am reading text files and converting them to parquet files using Spark. But when I try to run the code I get the following exception:

org.apache.spark.SparkException: Task failed while writing rows
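
For reference, a minimal sketch of the kind of conversion described, with placeholder paths and no claim to match the original code:

    import org.apache.spark.sql.SparkSession

    object TextToParquet {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("TextToParquet")
          .getOrCreate()

        // Each line of the input text files becomes a row with a single "value" column.
        val lines = spark.read.text("hdfs:///input/textfiles")  // placeholder path

        // The write step is where "Task failed while writing rows" surfaces on failure.
        lines.write.parquet("hdfs:///output/parquet")           // placeholder path

        spark.stop()
      }
    }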
5 Answers
  • 2021-01-02 17:40

    Another possible reason is that you're hitting S3 request rate limits. If you look closely at your logs, you may see something like this:

    AmazonS3Exception: Please reduce your request rate.

    While the Spark UI will say

    Task failed while writing rows

    I doubt it's the reason you're seeing this, but it's a possible cause if you're running a highly intensive job, so I've included it for completeness.
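
    If you do hit throttling, here is a rough sketch of easing the pressure, assuming the S3A connector from hadoop-aws (the bucket names are placeholders and the values are illustrative, not tuned recommendations):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("s3-throttle-example").getOrCreate()
    val hadoopConf = spark.sparkContext.hadoopConfiguration

    // Allow more retries and a longer back-off when S3 answers "slow down".
    hadoopConf.set("fs.s3a.attempts.maximum", "20")
    hadoopConf.set("fs.s3a.retry.throttle.interval", "1000ms")

    // Fewer output partitions means fewer simultaneous S3 requests.
    val df = spark.read.text("s3a://my-bucket/input")       // placeholder bucket
    df.coalesce(32).write.parquet("s3a://my-bucket/output") // placeholder bucket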

  • 2021-01-02 17:43

    In my case, I saw this error when I tried to overwrite an HDFS directory that belonged to a different user. Deleting the directory and letting my process write it from scratch solved it, so I guess more digging in the direction of user permissions on HDFS is appropriate.
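
    A minimal sketch of inspecting ownership from a spark-shell (where spark is predefined) before overwriting, using the Hadoop FileSystem API with a placeholder path:

    import org.apache.hadoop.fs.{FileSystem, Path}

    val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
    val target = new Path("/user/other_user/output")  // placeholder path

    if (fs.exists(target)) {
      val status = fs.getFileStatus(target)
      println(s"owner=${status.getOwner}, permissions=${status.getPermission}")
      // Deleting and rewriting from scratch, as described above, only works
      // if the current user has sufficient rights:
      // fs.delete(target, true)
    }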

  • 2021-01-02 17:45

    If it is still relevant: the experience I had with this issue was that I had not started Hadoop. If you run Spark on top of it, it might be worth starting Hadoop and checking again.
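
    A quick way to confirm from a spark-shell (where spark is predefined) that the default filesystem is actually reachable, just a sanity-check sketch:

    import org.apache.hadoop.fs.{FileSystem, Path}

    // Throws (e.g. java.net.ConnectException) if the namenode is not running.
    val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
    println(fs.exists(new Path("/")))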

  • 2021-01-02 17:56

    I found that disabling speculation prevents this error from happening. I'm not sure why, but it seems that speculative and non-speculative tasks conflict when writing parquet rows.

    sparkConf.set("spark.speculation", "false")
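
    With SparkSession-based code the equivalent would be (a sketch, assuming Spark 2.x or later; the app name is a placeholder):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("no-speculation-example")
      .config("spark.speculation", "false")  // same setting, SparkSession style
      .getOrCreate()

    The same setting can also be passed at submit time with --conf spark.speculation=false.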
    
  • 2021-01-02 18:05

    This is where having all the source to hand helps: paste the stack trace into an IDE that can go from stack trace to lines of code, and see what it says. It's probably just some init/config problem.
