I am reading text files and converting them to Parquet files using Spark code. But when I try to run the code I get the following exception:
org.ap
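For reference, a minimal sketch of the kind of conversion job described, with placeholder paths and app name (not taken from the original post):

import org.apache.spark.sql.SparkSession

object TextToParquet {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("text-to-parquet") // hypothetical app name
      .getOrCreate()

    // Read plain text files into a DataFrame with a single "value" column,
    // then write it back out in Parquet format. Paths are placeholders.
    val df = spark.read.text("hdfs:///input/textfiles")
    df.write.mode("overwrite").parquet("hdfs:///output/parquet")

    spark.stop()
  }
}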
Another possible reason is that you're hitting S3 request rate limits. If you look closely at your logs, you may see something like this:
AmazonS3Exception: Please reduce your request rate.
While the Spark UI will say
Task failed while writing rows
I doubt this is the reason you're seeing the issue, but it's a possible cause if you're running a highly intensive job, so I'm including it for completeness.
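If throttling does turn out to be the cause, a rough sketch of the usual mitigations is below: write fewer, larger files and allow more retries. The S3A keys and values shown are assumptions that depend on your Hadoop / hadoop-aws version, and the bucket paths and partition count are placeholders.

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("parquet-writer")
  // Assumed S3A retry settings; check the exact keys against your hadoop-aws version.
  .config("spark.hadoop.fs.s3a.attempts.maximum", "30")
  .config("spark.hadoop.fs.s3a.retry.throttle.interval", "1000ms")
  .getOrCreate()

val df = spark.read.text("s3a://my-bucket/input/") // placeholder bucket

// Coalescing to fewer partitions means fewer output files and fewer S3 PUT requests.
df.coalesce(64) // partition count is illustrative
  .write.mode("overwrite")
  .parquet("s3a://my-bucket/output/")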
In my case, I saw this error when I tried to overwrite an HDFS directory that belonged to a different user. Deleting the directory and letting my process write it from scratch solved it. So I guess more digging in the direction of user permissions on HDFS is appropriate.
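As a sketch of that workaround, the output directory can be removed through the Hadoop FileSystem API before writing; the paths here are hypothetical, and the delete only succeeds if your user has permission on the parent directory.

import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("parquet-writer").getOrCreate()

// Hypothetical output location; in the failing case it was owned by another user.
val outputPath = new Path("hdfs:///data/output/parquet")

val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
if (fs.exists(outputPath)) {
  // Recursive delete; requires write permission on the parent directory.
  fs.delete(outputPath, true)
}

spark.read.text("hdfs:///data/input") // placeholder input path
  .write.parquet(outputPath.toString)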
If it is still relevant: the experience I had with this issue was that I had not started Hadoop. If you run Spark on top of it, it might be worth starting Hadoop and checking again.
I found that disabling speculation prevents this error from happening. I'm not entirely sure why, but it seems that speculative and non-speculative task attempts conflict when writing Parquet rows.
sparkConf.set("spark.speculation","false")
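A minimal sketch of where that setting fits, assuming you build the configuration yourself (the app name is a placeholder); the same flag can also be passed to spark-submit as --conf spark.speculation=false:

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Disable speculative execution so only one attempt of each task writes its output.
val sparkConf = new SparkConf()
  .setAppName("parquet-writer") // placeholder app name
  .set("spark.speculation", "false")

val spark = SparkSession.builder().config(sparkConf).getOrCreate()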
This is where having all the source to hand helps: paste the stack trace into an IDE that can jump from a stack trace to the corresponding lines of code, and see what it says. It's probably just some init/config problem.