Deleting file/folder from Hadoop

Submitted by 时间秒杀一切 on 2019-11-30 08:18:26
greedybuddha

When you say delete from Hadoop, you really mean delete from HDFS.

To delete something from HDFS, do one of the following:

From the command line:

  • deprecated way:

hadoop dfs -rmr hdfs://path/to/file

  • new way (as of Hadoop 2.4.1):

hdfs dfs -rm -r hdfs://path/to/file
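The same delete can also be issued over HTTP through the WebHDFS REST API, which is an alternative to the shell commands above. A minimal Python sketch that builds the DELETE request URL; the host name, port (9870 is the NameNode HTTP default in Hadoop 3.x; older versions use 50070), and user name are illustrative assumptions, not values from the original answers:

```python
from urllib.parse import quote

def webhdfs_delete_url(host, port, path, recursive=True, user="hdfs"):
    """Build the WebHDFS URL for deleting an HDFS path.

    Sending an HTTP DELETE to this URL is the REST equivalent of
    `hdfs dfs -rm -r <path>`. The host/port/user are placeholders.
    """
    return (
        f"http://{host}:{port}/webhdfs/v1{quote(path)}"
        f"?op=DELETE&recursive={'true' if recursive else 'false'}"
        f"&user.name={user}"
    )

# Example: the URL you would send an HTTP DELETE to
url = webhdfs_delete_url("namenode.example.com", 9870, "/path/to/file")
print(url)
# http://namenode.example.com:9870/webhdfs/v1/path/to/file?op=DELETE&recursive=true&user.name=hdfs
```

This only builds the URL; actually performing the request (e.g. with `curl -X DELETE "<url>"`) requires a reachable NameNode.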

Or from Java:

FileSystem fs = FileSystem.get(getConf()); // getConf() assumes this runs inside a class extending Configured (e.g. a Tool)
fs.delete(new Path("path/to/file"), true); // second argument "true" enables recursive delete

To delete a file from HDFS, you can use the following command:

hadoop fs -rm -r -skipTrash /path_to_file/file_name

To delete a folder from HDFS, you can use the following command:

hadoop fs -rm -r -skipTrash /folder_name

The -skipTrash option deletes the path permanently, bypassing the trash; without it, the command may report an error in some configurations.
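To see why -skipTrash matters: when trash is enabled, `hadoop fs -rm` normally moves the path into the user's trash directory rather than removing it. A small Python sketch of the conventional trash layout; the user name is illustrative, and the exact location and retention depend on the cluster's `fs.trash.interval` configuration:

```python
import posixpath

def trash_destination(user, path):
    """Compute the conventional HDFS trash location for a deleted path.

    Without -skipTrash, `hadoop fs -rm` typically moves the path under
    /user/<user>/.Trash/Current, preserving its original layout.
    The user name here is a placeholder.
    """
    return posixpath.join(f"/user/{user}/.Trash/Current", path.lstrip("/"))

print(trash_destination("alice", "/path_to_file/file_name"))
# /user/alice/.Trash/Current/path_to_file/file_name
```

With -skipTrash, no such copy is kept and the data is removed immediately.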

With Scala:

import java.net.URI
import org.apache.hadoop.fs.{FileSystem, Path}

val fs: FileSystem = FileSystem.get(new URI(filePath), sc.hadoopConfiguration)
fs.delete(new Path(filePath), true) // true enables recursive delete

sc is the SparkContext

To delete a file or folder from HDFS, use the command: hadoop fs -rm -r /FolderName

I contacted AWS support, and it turned out the problem was that the log files I was analyzing were very big, which created a memory issue. I added "masterInstanceType" : "m1.xlarge" to the EMRCluster section of my pipeline definition and it worked.
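The fix above amounts to overriding the master instance type in the AWS Data Pipeline definition. A minimal sketch of the relevant EMRCluster object; the "id" and "name" values are illustrative, only "masterInstanceType" comes from the answer:

```json
{
  "id": "EmrClusterForAnalysis",
  "name": "EmrClusterForAnalysis",
  "type": "EmrCluster",
  "masterInstanceType": "m1.xlarge"
}
```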

From the command line:

 hadoop fs -rm -r /folder

I use Hadoop 2.6.0; the command line 'hadoop fs -rm -r fileName.hib' works fine for deleting any .hib file on my HDFS file system.
