I am running a Kinesis plus Spark application: https://spark.apache.org/docs/1.2.0/streaming-kinesis-integration.html
I am running it with the command below on an EC2 instance
I had a similar problem.
As the other answers here indicate, it's a resource availability issue.
In my case, I was running an ETL process where the old data from the previous run was trashed each time. However, the newly trashed data was being stored in the controlling user's /user/myuser/.Trash folder. Looking at the Ambari dashboard, I could see that the overall HDFS disk usage was near capacity, which was causing the resource issues.
So in this case, I used the -skipTrash option to hadoop fs -rm ... when deleting the old data files. Otherwise they take up space in the trash roughly equal to the size of all the data stored in the ETL storage dir, effectively doubling the total space used by the application and causing the resource problems.
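For reference, here is a rough sketch of what that cleanup step could look like. The function name purge_old_data, the DRY_RUN switch, and the example path are made up for illustration, and I am assuming the old data lives in a directory (hence the -r flag); adjust it to whatever your job actually deletes.

```shell
# Sketch only: purge the previous run's output without sending it to the HDFS trash.
# purge_old_data and DRY_RUN are hypothetical names, not part of Hadoop.
purge_old_data() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Dry run: print the command instead of executing it.
    echo "hadoop fs -rm -r -skipTrash $1"
  else
    # -r recurses into the directory; -skipTrash deletes the files immediately
    # instead of moving them under the user's .Trash folder.
    hadoop fs -rm -r -skipTrash "$1"
  fi
}

# Example usage (hypothetical path), left commented out on purpose:
# purge_old_data /data/etl/output

# To check how much space the trash currently holds:
# hadoop fs -du -s -h /user/myuser/.Trash
```

Running it once with DRY_RUN=1 is a cheap way to confirm the path before anything is deleted, since files removed with -skipTrash cannot be restored from the trash.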