How to run a command (1000 times) that requires two different types of input files

北战南征 posted on 2020-07-03 05:24:39

Question


I have calculated directed modularity with DirectedLouvain (https://github.com/nicolasdugue/DirectedLouvain). I am now trying to test the significance of the resulting values against a null model. To do that, I need to run one of the DirectedLouvain commands 1000 times, once for each of 1000 different input files.

Following @KamilCuk's recommendations, I used the code below, which takes the 1000 *.txt input files and generates 1000 *.bin files and 1000 *.weights files. It worked perfectly:

find -type f -name '*.txt' |
while IFS= read -r file; do
   file_no_extension=${file##*/};
   file_no_extension=${file_no_extension%%.*}
   ./convert -i "$file" -o "$file_no_extension".bin -w "$file_no_extension".weights
done

Now I am trying to use another command that takes these two types of files (*.bin and *.weights) and generates *.tree files. I have tried this with no success:

find ./ -type f \( -iname \*.bin -o -iname \*.weights \) | 
while IFS= read -r file; do
   file_no_extension=${file##*/};
   file_no_extension=${file_no_extension%%.*}
   ./community "$file.bin" -l -1 -w "$file.weights" > "$file_no_extension".tree
done

Any suggestion?


Answer 1:


  1. Find all files with that extension.
  2. For each file
    1. Extract the filename without the extension
    2. Run the command

So:

find -type f -name '*.ext' |
while IFS= read -r file; do
   file_no_extension=${file##*/};
   file_no_extension=${file_no_extension%%.*}
   ./convert -i "$file" -o "$file_no_extension".bin -w "$file_no_extension".weights
done
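
The two parameter expansions in the loop are worth a closer look, because `%%.*` cuts at the first dot rather than the last. A small sketch of what each one yields; the path `./data/net.v2.txt` is a made-up example, not a file from the question:

```shell
# Hypothetical input path, chosen to contain two dots.
file='./data/net.v2.txt'

name=${file##*/}      # drop the longest prefix ending in "/"   -> net.v2.txt
short=${name%%.*}     # drop the longest suffix starting at "." -> net
stem=${name%.*}       # drop the shortest suffix starting at "." -> net.v2

printf '%s\n' "$name" "$short" "$stem"
```

If the input names contain only one dot, as in `input1.txt`, both forms give the same result; with extra dots, `%.*` is probably the safer choice.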

# with find -exec:
find -type f -name '*.ext' -exec sh -c 'f=$(basename "$1" .ext); ./convert -i "$1" -o "$f".bin -w "$f".weights' _ {} \;

# with xargs:
find -type f -name '*.ext' |
xargs -d '\n' -n1 sh -c 'f=$(basename "$1" .ext); ./convert -i "$1" -o "$f".bin -w "$f".weights' _
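
The same pattern may carry over to the second step in the question. The attempt there lists both *.bin and *.weights files, so each pair is visited twice, and "$file.bin" expands to names such as ./x.bin.bin. A minimal sketch that loops over the *.bin files only and derives the matching *.weights name, assuming ./community takes its arguments exactly as written in the question; `make_trees` and the `COMMUNITY` variable are hypothetical names introduced here:

```shell
# Path to the community binary; the variable is an assumption so that the
# command can be swapped out, it is not part of DirectedLouvain.
COMMUNITY=${COMMUNITY:-./community}

# make_trees DIR: run the command once per *.bin file found under DIR.
make_trees() {
    local dir bin base
    dir=${1:-.}
    find "$dir" -type f -name '*.bin' |
    while IFS= read -r bin; do
        base=${bin%.bin}                      # strip only the trailing .bin
        if [ ! -f "$base.weights" ]; then     # skip files without a partner
            echo "skipping $bin: no $base.weights" >&2
            continue
        fi
        # Same argument order as in the question's attempt.
        "$COMMUNITY" "$bin" -l -1 -w "$base.weights" > "$base.tree"
    done
}
```

Calling `make_trees .` would process the current directory. Each .tree file is written next to its .bin file rather than into the working directory, which differs slightly from the loop in the question.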



Answer 2:


You could use GNU Parallel to run your jobs in parallel across all your CPU cores like this:

parallel ./convert -i {} -o {.}.bin -w {.}.weights ::: input*.txt

Initially, you may like to do a "dry run" that shows what it would do without actually doing anything:

parallel --dry-run ./convert -i {} -o {.}.bin -w {.}.weights ::: input*.txt

If you get errors about the argument list being too long because you have too many files, you can feed their names in on stdin like this instead:

find . -name "input*.txt" -print0 | parallel -0 ./convert -i {} -o {.}.bin -w {.}.weights



Answer 3:


You can use find to list your files and execute a command on all of them:

find -name '*.ext' -exec ./runThisExecutable '{}' \;

If you have a.ext and b.ext in a directory, this will run ./runThisExecutable a.ext and ./runThisExecutable b.ext.

To test whether it identifies the right files, you can run it without -exec so it only prints the filenames:

find -name '*.ext'
./a.ext
./b.ext


Source: https://stackoverflow.com/questions/62596795/how-to-run-a-command-1000-times-that-requires-two-different-types-of-input-fil
