How to convert a find command to instead use grep to filter and then `exec` commands on output?

Posted by 我是研究僧i on 2019-12-23 04:36:52

Question


I have a scenario where I need to execute a series of commands on each file that's found. This normally would work great, except I have over 100 files and folders to exclude from find's results for execution. This becomes unwieldy and non-executable from the shell directly. It seems like it would be optimal to use an "exclusion file" similar to how tar or grep allows for such files.

Since find does not accept an exclusion file but grep does, I want to know: how can the following command be converted so that grep with an exclusion file (grep -v -f excludefile) replaces find's exclusion (-prune) and -exec handling, excluding the folders and files and then executing the same series of commands on the result as the current command does?

find $IN_PATH -regextype posix-extended \
  -regex "/(excluded1|excluded2|excluded3|...|excludedN)" -prune \
  -o -type f \
  -exec sh -c "( cmd -with_args 1 '{}'; cmd -args2 '{}'; cmd3 '{}') \
    | cmd4 | cmd5 | cmd6; cmd7 '{}'" \; \
  > output

As a side note (not critical), I've read that this process becomes much less efficient if you don't use -exec. It already takes over 100 minutes each time it runs, so I don't want to slow it down any more than necessary.


Answer 1:


The best way I can think of to handle your scenario is to split the one-liner into two lines and use xargs with parallel execution.

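# The explicit -print keeps the pruned directories themselves out of the list.
find "$IN_PATH" -regextype posix-extended \
  -regex "/(excluded1|excluded2|excluded3|...|excludedN)" -prune \
  -o -type f -print > /tmp/full_file_list
# -v drops the lines that match excludefile; tr turns newlines into NULs for xargs -0.
grep -v -f excludefile /tmp/full_file_list | tr '\n' '\0' |
  xargs -0 -n 1 -P <nr_procs> sh -c 'command here' sh > output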

See the Stack Overflow questions "Bash script processing limited number of commands in parallel" and "Doing parallel processing in bash?" to learn more about running commands in parallel in bash.

When finding the files and running commands on them happen in a single one-liner, the two compete for disk I/O. Splitting the one-liner could speed up the process a little.

Hint: remember to add your full_file_list, excludefile, and output files to your exclusion rules, and always debug your command on a smaller directory first to reduce waiting time.
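
As a rough sketch of the same idea without the intermediate file, the paths can be kept NUL-delimited from start to finish so that file names containing spaces or newlines survive. This assumes GNU find, grep, and xargs (for -print0, -z, and -0/-P). The function name run_filtered and the printf stand-in for the real per-file commands are hypothetical, not part of the original command.

```shell
# Sketch only: run_filtered is a hypothetical helper name.
# Assumes GNU find, grep, and xargs.
#
# Usage: run_filtered DIR EXCLUDEFILE [NPROCS]
run_filtered() {
  # -print0 / -z / -0 keep every path NUL-delimited through the pipeline.
  # -F treats each line of the exclude file as a fixed string, not a regex.
  find "$1" -type f -print0 |
    grep -z -v -F -f "$2" |
    xargs -0 -r -n 1 -P "${3:-4}" sh -c '
      # Placeholder for the real per-file commands; "$1" is the file path.
      printf "processed %s\n" "$1"
    ' sh
}
```

It could be called as run_filtered "$IN_PATH" excludefile 4 > output. Note that a blank line in the exclude file is an empty pattern that matches every path, so with -v nothing would be left to process.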




Answer 2:


Why not simply:

find . -type f |
grep -v -f excludefile |
xargs whatever

As for "this process is already consuming over 100 minutes to execute": that is almost certainly a problem with whatever command line you wrote in place of whatever above, and we could probably help you improve it if you post a separate question.
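
To run a series of commands per file with this pipeline, one possible sketch is to let xargs pass many paths to a single sh invocation and loop over them there, which avoids starting a new shell for every file. This assumes GNU find, grep, and xargs. The name run_batched and the printf/sort stand-ins for the real commands are hypothetical.

```shell
# Sketch only: run_batched is a hypothetical helper name.
# Assumes GNU find, grep, and xargs.
#
# Usage: run_batched DIR EXCLUDEFILE
run_batched() {
  find "$1" -type f -print0 |
    grep -z -v -f "$2" |
    xargs -0 -r sh -c '
      # xargs passes a batch of paths; loop over all of them in one shell.
      for f do
        # Stand-ins for a grouped command list piped into a filter ...
        { printf "%s\n" "$f"; printf "%s\n" "$f"; } | sort -u
        # ... followed by a final per-file command.
        printf "done: %s\n" "$f"
      done
    ' sh
}
```

Here each path reaches the commands as "$f" rather than being substituted into the script text, so unusual characters in file names are not reinterpreted by the shell.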



Source: https://stackoverflow.com/questions/57747393/how-to-convert-a-find-command-to-instead-use-grep-to-filter-and-then-exec-comm
