Restarting Upstart instance processes

Submitted by 霸气de小男生 on 2019-11-29 20:21:58
SpamapS

In worker.conf you just need to change this line:

stop on shutdown

To:

stop on stopping my-workers

And change my-workers.conf to use pre-start instead of script:

pre-start script
  for i in `seq 1 $NUM_WORKERS`
  do
    start worker N=$i
  done
end script

Now my-workers keeps its state: because the work happens in pre-start, my-workers has no main process that could exit, so the job stays running. stop on stopping my-workers stops the workers whenever my-workers is stopped, and starting my-workers again starts the workers again.

(FYI, stop on shutdown does nothing, because shutdown is not a system event; see man upstart-events for all the defined events.) You should therefore also change my-workers to stop on runlevel [06].
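Putting the changes above together, the two job files might look roughly like this. This is only a sketch: the original files are not shown in full, so the env NUM_WORKERS value, the instance $N stanza, and the exec path are assumptions made for illustration.

```
# /etc/init/my-workers.conf (sketch)
# Assumed value; how NUM_WORKERS was originally set is not shown.
env NUM_WORKERS=4
stop on runlevel [06]

# No main process: the job stays "running" after pre-start finishes.
pre-start script
  for i in `seq 1 $NUM_WORKERS`
  do
    start worker N=$i
  done
end script

# /etc/init/worker.conf (sketch)
# Assumed: one instance per value of N.
instance $N
stop on stopping my-workers
# Hypothetical command; replace with the real worker.
exec /usr/local/bin/worker
```

With files like these, sudo restart my-workers should stop every worker instance and then start them all again.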

I tried the example above with SpamapS's answer and received:

init: my-workers pre-start process (22955) terminated with status 127

In /var/log/upstart/my-workers.log I found the problem:

/proc/self/fd/9: 6: /proc/self/fd/9: end: not found

The end of the for loop in my-workers.conf was a syntax error: there was a stray end after done. I replaced

script
  for i in `seq 1 $NUM_WORKERS`
    do
      start worker N=$i
    done
  end
end script

with

script
  for i in `seq 1 $NUM_WORKERS`
  do
    start worker N=$i
  done
end script

and it worked!

Consider adding to the worker.conf one more event:

stop on shutdown or workers-stop

Then you can call from the command line

sudo initctl emit workers-stop

You can add a similar event to start the workers. To restart all workers, create a task that emits workers-stop and then workers-start.
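A restart task along those lines could be sketched as follows. The job file name workers-restart.conf and the instance count of 3 are hypothetical. The sketch also assumes that worker.conf declares instance $N and start on workers-start, so each start event has to carry its own N.

```
# /etc/init/workers-restart.conf (hypothetical job name)
description "Restart all workers via custom events"
task
script
  # Without --no-wait, initctl emit waits until the event has been handled.
  initctl emit workers-stop
  # Assumed count of 3 instances; each event passes N to the instance job.
  for i in `seq 1 3`
  do
    initctl emit workers-start N=$i
  done
end script
```

It would then be run with sudo start workers-restart.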

Essentially, you need a process that runs stop and start for every instance (N=1, N=2, and so on).

A simple way to do this is a couple of shell for loops inside a script stanza. However, if the processes take a while to stop (for example, because they act on SIGTERM only after finishing their current job), this is inefficient: you have to wait for one to stop before signalling the next.
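The sequential approach described above might look roughly like this. It is a minimal sketch: the JOB and COUNT defaults are hypothetical, and the instance variable N follows the earlier answers.

```shell
#!/bin/sh
# Sequentially restart instances 1..COUNT of an Upstart instance job.
# JOB and COUNT are hypothetical defaults; override them as needed.
JOB="${JOB:-worker}"
COUNT="${COUNT:-3}"

restart_all() {
    for i in $(seq 1 "$COUNT"); do
        # stop blocks until this instance has stopped, so the waits add up.
        # "|| true" tolerates instances that are not currently running.
        stop "$JOB" N="$i" || true
    done
    for i in $(seq 1 "$COUNT"); do
        start "$JOB" N="$i"
    done
}
```

Calling restart_all from a script stanza restarts the instances one by one; because each stop call waits for its instance, the total time is the sum of the individual shutdown times.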

Therefore, I built an Upstart script that stops them in parallel at https://github.com/elifesciences/builder-base-formula/blob/master/elife/config/etc-init-multiple-processes-parallel.conf

The script is rendered by Salt from a map of process names to instance counts. Here is a sample result:

description "(Re)starts all instances, in parallel"
# http://upstart.ubuntu.com/cookbook/#start-on
start on (local-filesystems and net-device-up IFACE!=lo)
task
script
    timeout=300
    echo "--------"

    echo "Current status of 5 elife-bot-worker processes"
    echo "Now is" $(date -Iseconds)
    for i in `seq 1 5`
    do
        status elife-bot-worker ID=$i || true
    done
    echo "Stopping asynchronously 5 elife-bot-worker processes"
    echo "Now is" $(date -Iseconds)
    for i in `seq 1 5`
    do
        (stop elife-bot-worker ID=$i &) || true
    done

    for i in `seq 1 5`
    do
        echo "Waiting for elife-bot-worker $i to stop"
        echo "Now is" $(date -Iseconds)
        counter=0
        while true
        do
            if [ "$counter" -gt "$timeout" ]
            then
                echo "It shouldn't take more than $timeout seconds to kill all the elife-bot-worker processes"
                exit 1
            fi
            status elife-bot-worker ID=$i 2>&1 | grep "Unknown instance" && break
            sleep 1
            counter=$((counter + 1))
        done
    done
    echo "Stopped all elife-bot-worker processes"

    echo "Starting 5 elife-bot-worker processes"
    for i in `seq 1 5`
    do
        start elife-bot-worker ID=$i
    done
    echo "Started 5 elife-bot-worker processes"

end script