Hadoop fs -rm with regular expression

Submitted by 亡魂溺海 on 2019-12-12 10:26:03

Question


I have a table with 9k partitions, and I would like to delete about 1,200 of them (which represent 3 days of data).
I would like to combine hadoop fs -rm with a regular expression for these 3 days, something like pr_load_time=2017070([1-4])(\d+).

The partitions look like this (I want to match only the first two here):

pr_load_time=20170701000317
pr_load_time=20170704133602
pr_load_time=20170705000317
pr_load_time=20170706133602

Is something like this possible? I was thinking about matching the partitions with awk and using xargs, but that seems like a really slow approach for deleting such a large number of files.
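For context, the slow approach I had in mind is roughly this (the table path here is just a placeholder):

hadoop fs -ls /path/to/table | awk '/pr_load_time=2017070[1-4]/ {print $NF}' | xargs hadoop fs -rm -r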


Answer 1:


I guess the above comment would solve your problem; however, in case it doesn't, you could try the following:

/hdfs path/pr_load_time={20170701000317,20170704133602,20170705000317,..}

or something like this:

/hdfs path/pr_load_time=201707{01000317,04133602,05000317,..}

This can combine different patterns in a single command:

/hdfs path/pr_load_time=201707{01*,04*,05*,..}
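Putting it together, a complete delete command along those lines might look like this (the path is a placeholder; -r is needed because Hive partitions are directories):

hadoop fs -rm -r "/hdfs path/pr_load_time=201707{01,04,05}*"

Quoting the pattern lets HDFS's own glob expansion handle the braces and the trailing wildcard instead of the local shell; leaving it unquoted generally works too, but quoting avoids surprises with shell brace expansion.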



Source: https://stackoverflow.com/questions/45536017/hadoop-fs-rm-with-regular-expression
