Shell Script for multithreading a process

失恋的感觉 2020-12-22 04:07

I am a bioinformatician and recently got stuck on a problem that requires some scripting to speed up my process. We have a software package called PHASE, and the command that I type in my c

4 Answers
  •  再見小時候
    2020-12-22 05:00

    If you have GNU xargs, consider something like:

    printf '%s\0' *.inp | xargs -0 -P 4 -n 1 \
      sh -c 'for f; do ./PHASE "$f" "${f%.inp}.out"; done' _
    

    The -P 4 is the important part: it sets the maximum number of processes to run in parallel.

    If you have a very large number of inputs and they're fast to process, consider replacing -n 1 with a larger number, so that each shell instance iterates over more inputs. This reduces shell startup costs, but it also reduces granularity and, potentially, the level of parallelism.
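
    To make the effect of a larger -n concrete, here is a minimal, self-contained sketch. It assumes a hypothetical stand-in script named phase_stub in place of the real ./PHASE (it only copies its input to its output), and it runs in a throwaway directory with ten generated sample files. With -n 3, each sh instance receives up to three file names, so ten inputs need only four shell startups.

```shell
#!/bin/sh
# Sketch only: phase_stub is a HYPOTHETICAL placeholder for ./PHASE.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for PHASE: usage is "phase_stub INPUT OUTPUT"; it just copies the file.
cat > phase_stub <<'EOF'
#!/bin/sh
cp "$1" "$2"
EOF
chmod +x phase_stub

# Generate ten small sample inputs (assumed names, for illustration only).
for i in 1 2 3 4 5 6 7 8 9 10; do
  echo "sample $i" > "sample_$i.inp"
done

# -P 4: at most four shells run at once.
# -n 3: each shell is handed up to three file names and loops over them.
printf '%s\0' *.inp | xargs -0 -P 4 -n 3 \
  sh -c 'for f; do ./phase_stub "$f" "${f%.inp}.out"; done' _

# Count the outputs that were produced.
out_count=0
for f in *.out; do
  [ -e "$f" ] && out_count=$((out_count + 1))
done
echo "outputs: $out_count"
```

    To use this with the real program, replace ./phase_stub with ./PHASE and drop the setup lines that create the stub and the sample files.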


    That said, if you really want to do batches of four (per your question), letting all four finish before starting the next four (which introduces some inefficiency, but is what you asked for), you could do something like this...

    set -- *.inp                # set $@ to list of files matching *.inp
    while (( $# )); do          # until we exhaust that list...
      for ((i=0; i<4; i++)); do # loop over batches of four...
        # as long as there's a next argument, start a process for it, and take it off the list
        [[ $1 ]] && ./PHASE "$1" "${1%.inp}.out" & shift
      done
      wait                      # ...and wait for running processes to finish before proceeding
    done
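
    The same batching idea can be sketched in plain POSIX sh, with one addition that is not part of the answer above: recording each background PID so that failures can be counted. As before, phase_stub is a hypothetical stand-in for ./PHASE (here it fails on an empty input), and the sample files are generated only for illustration.

```shell
#!/bin/sh
# Sketch only: phase_stub is a HYPOTHETICAL placeholder for ./PHASE.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Stand-in for PHASE: copy INPUT to OUTPUT, but fail if INPUT is empty.
cat > phase_stub <<'EOF'
#!/bin/sh
[ -s "$1" ] && cp "$1" "$2"
EOF
chmod +x phase_stub

# Six valid inputs plus one zero-byte input, so exactly one job fails.
for i in 1 2 3 4 5 6; do echo "sample $i" > "sample_$i.inp"; done
: > empty.inp

set -- *.inp          # positional parameters now hold the input files
batches=0
failed=0
while [ "$#" -gt 0 ]; do
  pids=""
  n=0
  # Start up to four background jobs, remembering each PID.
  while [ "$n" -lt 4 ] && [ "$#" -gt 0 ]; do
    ./phase_stub "$1" "${1%.inp}.out" &
    pids="$pids $!"
    shift
    n=$((n + 1))
  done
  # Wait for every job in this batch and count non-zero exit statuses.
  for pid in $pids; do
    wait "$pid" || failed=$((failed + 1))
  done
  batches=$((batches + 1))
done
echo "batches=$batches failed=$failed"   # prints: batches=2 failed=1
```

    Seven inputs in batches of four give two batches; the second batch holds only three jobs, which illustrates the idle capacity that batching can leave compared with the xargs -P approach.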
    
