Getting “No space left on device” for approx. 10 GB of data on EMR m1.large instances

Submitted by 一曲冷凌霜 on 2019-12-04 06:47:30

With the help of @slayedbylucifer I was able to identify the problem: by default, all of the disk space on the cluster is made available to HDFS. As a result, only the default 10 GB mounted on / is left for local use by the machine.

There is an option called --mfs-percentage (available when using the MapR distribution of Hadoop) that specifies how the disk space is split between the local filesystem and HDFS. The local filesystem's share is mounted at /var/tmp. Make sure that mapred.local.dir is set to a directory inside /var/tmp, because that is where all the logs of the tasktracker task attempts go, and they can be huge for big jobs. In my case, the logging was what caused the disk space error. I set --mfs-percentage to 60 and was able to run the job successfully after that.
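To make the split concrete, here is a minimal Python sketch of the arithmetic, plus a quick way to check how much room a local directory actually has before starting a big job. It assumes --mfs-percentage is the share of each disk given to the cluster filesystem, with the remainder going to the local filesystem. The helper names split_disk and free_gb are made up for illustration, and the 400 GB disk size is an example figure, not the real size of an m1.large volume.

```python
import shutil
from typing import Tuple


def split_disk(total_gb: float, mfs_percentage: int) -> Tuple[float, float]:
    """Return (cluster_fs_gb, local_gb) for one disk.

    Assumption: mfs_percentage is the share handed to the cluster
    filesystem; whatever is left goes to the local filesystem.
    """
    if not 0 <= mfs_percentage <= 100:
        raise ValueError("mfs_percentage must be between 0 and 100")
    cluster_fs_gb = total_gb * mfs_percentage / 100
    return cluster_fs_gb, total_gb - cluster_fs_gb


def free_gb(path: str) -> float:
    """Free space, in GiB, on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / 1024 ** 3


if __name__ == "__main__":
    # Illustrative disk size only.
    cluster_fs, local = split_disk(400, 60)
    print(f"cluster filesystem: {cluster_fs} GB, local: {local} GB")
    # On a cluster node you would point this at the directory that
    # mapred.local.dir is set to (somewhere under /var/tmp).
    print(f"free space on /: {free_gb('/'):.1f} GiB")
```

Running free_gb against the mapred.local.dir location on a node is a cheap sanity check that the task attempt logs will land on the large local partition rather than on the 10 GB root mount.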
