Bash: wait with timeout

再見小時候 2020-12-02 16:39

In a Bash script, I would like to do something like:

app1 &
pidApp1=$!
app2 &
pidApp2=$!

timeout 60 wait $pidApp1 $pidApp2
kill -9 $pidApp1 $pidApp2         


        
6 Answers
  • 2020-12-02 17:07

    Both your example and the accepted answer are overly complicated. Why not simply use timeout, since this is exactly its use case? The timeout command even has a built-in option (-k) that sends SIGKILL if the command is still running a given time after the initial signal (SIGTERM by default) was sent (see man timeout).

    If the script does not need to wait and resume control flow afterwards, it is simply a matter of

    timeout -k 60s 60s app1 &
    timeout -k 60s 60s app2 &
    # [...]
    

    If it does, however, that's just as easy by saving the timeout PIDs instead:

    pids=()
    timeout -k 60s 60s app1 &
    pids+=($!)
    timeout -k 60s 60s app2 &
    pids+=($!)
    wait "${pids[@]}"
    # [...]
    

    E.g.

    $ cat t.sh
    #!/bin/bash
    
    echo "$(date +%H:%M:%S): start"
    pids=()
    timeout 10 bash -c 'sleep 5; echo "$(date +%H:%M:%S): job 1 terminated successfully"' &
    pids+=($!)
    timeout 2 bash -c 'sleep 5; echo "$(date +%H:%M:%S): job 2 terminated successfully"' &
    pids+=($!)
    wait "${pids[@]}"
    echo "$(date +%H:%M:%S): done waiting. both jobs terminated on their own or via timeout; resuming script"
    

    Running it shows that job 2 was killed by its 2-second timeout and never printed its message:

    $ ./t.sh
    08:59:42: start
    08:59:47: job 1 terminated successfully
    08:59:47: done waiting. both jobs terminated on their own or via timeout; resuming script
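
    If the script also needs to know which job hit its limit, the status returned by wait can be inspected. The following is a minimal sketch, assuming GNU timeout (which exits with 124 when the time limit is reached) and using sleep as a hypothetical stand-in for app1 and app2:

```bash
#!/bin/bash
# Stand-ins for app1/app2 (assumption): one finishes in time, one does not.
timeout -k 1s 3s sleep 0.2 &
pid1=$!
timeout -k 1s 1s sleep 5 &
pid2=$!

# wait PID returns the exit status of that job.
# GNU timeout exits with 124 when it had to stop the command.
status1=0; wait "$pid1" || status1=$?
status2=0; wait "$pid2" || status2=$?

for s in "$status1" "$status2"; do
  if (( s == 124 )); then
    echo "timed out"
  else
    echo "exited with status $s"
  fi
done
```

    Waiting on each PID separately is what makes the individual statuses available; a single wait with several PIDs only reports the status of the last one.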
    
  • 2020-12-02 17:17

    Write the PIDs to files and start the apps like this:

    pidFile=...
    ( app ; rm $pidFile ; ) &
    pid=$!
    echo $pid > $pidFile
    ( sleep 60 ; if [[ -e $pidFile ]]; then killChildrenOf $pid ; fi ; ) &
    killerPid=$!
    
    wait $pid
    kill $killerPid
    

    That would create another process that sleeps for the timeout and kills the process if it hasn't completed so far.

    If the process completes faster, the PID file is deleted and the killer process is terminated.

    killChildrenOf is a script that fetches all processes and kills all children of a certain PID. See the answers of this question for different ways to implement this functionality: Best way to kill all child processes

    If you want to step outside of BASH, you could write PIDs and timeouts into a directory and watch that directory. Every minute or so, read the entries and check which processes are still around and whether they have timed out.

    EDIT If you want to know whether the process is still alive, you can use kill -0 $pid, which sends no signal and only checks that the process exists.

    EDIT2 Or you can try process groups. kevinarpe said: To get PGID for a PID(146322):

    ps -fjww -p 146322 | tail -n 1 | awk '{ print $4 }'
    

    In my case: 145974. Then PGID can be used with a special option of kill to terminate all processes in a group: kill -- -145974
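
    The process-group idea can be tried without looking up the PGID through ps. This is only a sketch, assuming Bash with job control switched on via set -m, in which case each background job is placed in its own process group whose PGID equals the PID in $!. The bash -c command is a hypothetical stand-in for an app that starts a child of its own:

```bash
#!/bin/bash
set -m   # enable job control so each background job gets its own process group

# Hypothetical stand-in for an app that spawns a child process.
bash -c 'sleep 30 & sleep 30' &
pid=$!   # with job control on, this PID is also the PGID of the job

sleep 0.5            # give the job a moment to start its child
kill -- "-$pid"      # negative PID: signal every process in that group

status=0
wait "$pid" || status=$?
echo "job exited with status $status"
```

    Without set -m, a script's background jobs stay in the script's own process group, so kill with a negative PID would not single out the job.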

  • 2020-12-02 17:18

    To put in my 2c, we can boil down Teixeira's solution to:

    try_wait() {
        # Usage: [PID]...
        # Polls once per argument; succeeds as soon as none of the PIDs exists.
        local i
        for ((i = 0; i < $#; i += 1)); do
            kill -0 "$@" && sleep 0.001 || return 0
        done
        return 1 # timeout or no PIDs
    } &>/dev/null
    

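    sleep (at least the GNU version) accepts fractional seconds, and polling every 0.001 s (1 ms, i.e. at 1 kHz) is plenty. However, UNIX has no loopholes when it comes to files and processes. try_wait accomplishes very little: it polls only once per argument, so it gives up after a few milliseconds.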

    $ cat &
    [1] 16574
    $ try_wait %1 && echo 'exited' || echo 'timeout'
    timeout
    $ kill %1
    $ try_wait %1 && echo 'exited' || echo 'timeout'
    exited
    

    We have to answer some hard questions to get further.

    Why does wait have no timeout parameter? Maybe because the timeout, kill -0, wait and wait -n commands can tell the machine more precisely what we want.

    Why is wait a Bash builtin in the first place, so that timeout wait PID does not work? Maybe only so Bash can implement proper signal handling.
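
    One way to see why an external command cannot do the waiting is to ask a different process to wait for our job. A small sketch, using sleep as a hypothetical stand-in for the app and a second bash process as the outsider:

```bash
#!/bin/bash
sleep 2 &
pid=$!

# wait is implemented inside the shell itself.
type -t wait

# A separate bash process is not the parent of $pid,
# so its wait gives up immediately with an error.
foreign_status=0
bash -c 'wait "$1"' _ "$pid" 2>/dev/null || foreign_status=$?

# The shell that started the job can wait for it normally.
own_status=0
wait "$pid" || own_status=$?

echo "foreign wait: $foreign_status, own wait: $own_status"
```

    A process can only wait for its own children, which is why the command started by timeout has no way to wait for the PIDs of the calling script.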

    Consider:

    $ timeout 30s cat &
    [1] 6680
    $ jobs
    [1]+    Running   timeout 30s cat &
    $ kill -0 %1 && echo 'running'
    running
    $ # now meditate a bit and then...
    $ kill -0 %1 && echo 'running' || echo 'vanished'
    bash: kill: (NNN) - No such process
    vanished
    

    Whether in the material world or in machines, as we require some ground on which to run, we require some ground on which to wait too.

    • When kill fails you hardly know why. Unless you wrote the process, or its manual names the circumstances, there is no way to determine a reasonable timeout value.

    • When you have written the process, you can implement a proper TERM handler or even respond to "Auf Wiedersehen!" sent to it through a named pipe. Then you have some ground even for a spell like try_wait :-)

  • 2020-12-02 17:27
    app1 &
    app2 &
    sleep 60 &
    
    wait -n
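
    Note that wait -n (available since Bash 4.3) returns as soon as any one of the background jobs finishes, so the snippet above stops waiting after the first app or the timer, whichever ends first. A possible extension that waits for both apps or the timer is sketched below; the sleep commands are hypothetical stand-ins for app1 and app2, and the variable names are made up for illustration:

```bash
#!/bin/bash
# Stand-ins for app1 and app2 (assumption), plus a sleep acting as the timer.
sleep 0.3 & app1=$!
sleep 0.5 & app2=$!
sleep 2 &   timer=$!

timed_out=no
remaining=2
while (( remaining > 0 )); do
  wait -n || true            # returns when any one background job finishes
  if ! kill -0 "$timer" 2>/dev/null; then
    timed_out=yes            # the timer is gone, so the limit was reached
    break
  fi
  remaining=$(( remaining - 1 ))
done

# Clean up whatever is still running (timer or overdue apps).
kill "$timer" "$app1" "$app2" 2>/dev/null || true
wait 2>/dev/null || true
echo "timed out: $timed_out"
```

    The loop counts finished apps and checks after each wake-up whether the timer still exists, which is how it tells an app finishing apart from the timeout.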
    
  • 2020-12-02 17:28

    Here's a simplified version of Aaron Digulla's answer, which uses the kill -0 trick that Aaron Digulla leaves in a comment:

    app &
    pidApp=$!
    ( sleep 60 ; echo 'timeout'; kill $pidApp ) &
    killerPid=$!
    
    wait $pidApp
    kill -0 $killerPid && kill $killerPid
    

    In my case, I wanted to be both set -e -x safe and return the status code, so I used:

    set -e -x
    app &
    pidApp=$!
    ( sleep 45 ; echo 'timeout'; kill $pidApp ) &
    killerPid=$!
    
    wait $pidApp
    status=$?
    (kill -0 $killerPid && kill $killerPid) || true
    
    exit $status
    

    An exit status of 143 (128 + 15) indicates that the app was terminated by SIGTERM, almost certainly sent by our timeout.
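
    The same pattern can be wrapped in a function so that it is reusable. This is a rough sketch only; run_with_timeout is a hypothetical name, and sleep stands in for the app:

```bash
#!/bin/bash
# Usage: run_with_timeout SECONDS COMMAND [ARG]...
# Returns the command's exit status (143 if our SIGTERM stopped it).
run_with_timeout() {
  local seconds=$1
  shift
  "$@" &
  local pid=$!
  # Killer: sends SIGTERM to the command once the time is up.
  ( sleep "$seconds"; kill "$pid" 2>/dev/null ) &
  local killer=$!

  local status=0
  wait "$pid" || status=$?
  # The command is done, so the killer is no longer needed.
  kill "$killer" 2>/dev/null || true
  return "$status"
}

slow=0; run_with_timeout 1 sleep 5   || slow=$?
fast=0; run_with_timeout 2 sleep 0.2 || fast=$?
echo "slow: $slow, fast: $fast"
```

    One known limitation of this sketch: killing the killer subshell does not stop the sleep it started, so that sleep lingers harmlessly until its time is up.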

  • 2020-12-02 17:31

    I wrote a bash function that waits until the PIDs have finished or until a timeout expires. It returns non-zero if the timeout was exceeded and prints all the PIDs that have not finished.

    function wait_timeout {
      # Usage: wait_timeout SECONDS PID...
      local limit=$1
      shift
      local pids=("$@")
      local count=0
      while true
      do
        # Keep only the PIDs that are still alive.
        local alive=()
        local pid
        for pid in "${pids[@]}"; do
          if kill -0 "${pid}" &>/dev/null; then
            alive+=("${pid}")
          fi
        done
        pids=("${alive[@]}")
        if (( ${#pids[@]} == 0 )); then
          return 0    # every PID has finished
        fi
        if (( count >= limit )); then
          echo "${pids[@]}"    # PIDs still running after the timeout
          return 1
        fi
        count=$(( count + 1 ))
        sleep 1
      done
    }
    

    To use it, call wait_timeout $timeout $PID1 $PID2 ...
