Question
I have a process that I would like to kill before restarting a service. Someone wrote the following scripts to kill the process:
ps -ef |grep "process_name" | awk '{print "kill -15 " $2}'> /projects/test/kill.sh
#run the kill script
/projects/test/kill.sh
and then again
ps -ef |grep "process_name" | awk '{print "kill -9 " $2}'> /projects/test/kill.sh
#run the kill script
/projects/test/kill.sh
#finally
service restart command here
The problem is that the service sometimes does not restart properly, because it thinks the process is still running.
As I understand it, kill -15 kills the process gracefully. But the script then runs kill -9 right away. If the process was already being killed by the first command, what happens when kill -9 is also run on the same process? And will ps -ef even list that process once it has been marked for kill?
Thanks!
Answer 1:
You are correct that kill -15 is meant to kill a process gracefully. But killing a process that way is not something that happens instantaneously. The script above looks up the pid and attempts to kill it gracefully. If the kill -15 fails, the kill -9 is performed. The way the script knows that kill -15 failed is the second grep command: if kill -15 was successful, that pid should no longer exist, so the following grep should come back empty.
So really, kill -9 only runs if kill -15 failed to gracefully stop the program. The problem with this approach is that gracefully stopping a process can take some time, depending on the program. So IMHO there needs to be a wait period, or a sleep for a few seconds, to give kill -15 a chance to stop the process gracefully. With the approach above, kill -9 is almost always invoked, since the script doesn't allow much time for the process to shut down properly. If the shutdown triggered by kill -15 is still in progress, kill -9 will just override it and stop the process instantly.
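The wait period suggested above could be sketched roughly as follows. This is only an illustration, not the original script's author's method: the function name stop_gracefully and the 10-second default timeout are assumptions, and it relies on kill -0, which sends no signal and only reports whether the PID still exists.

```shell
# Hypothetical helper (not part of the original script):
# send SIGTERM, wait up to <timeout> seconds, then fall back to SIGKILL.
# Usage: stop_gracefully <pid> [timeout_seconds]
stop_gracefully() {
    local pid=$1 timeout=${2:-10} waited=0

    # If the signal cannot be delivered, treat the process as already gone.
    kill -TERM "$pid" 2>/dev/null || return 0

    # kill -0 delivers nothing; it succeeds only while the PID exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            kill -KILL "$pid" 2>/dev/null   # grace period is over
            return 1                        # report that force was needed
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0                                # exited on its own after SIGTERM
}
```

Called as stop_gracefully "$pid" 20 before the service restart command, this gives the process up to 20 seconds to exit before kill -9 is used. The return status tells the caller whether the hard stop was needed.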
Answer 2:
If you have the option to refactor, you can use /proc/$PID as a more efficient way to detect whether a process is running.
stopSvc() {
  local svc=$1 x pid
  # The PID is the second column of ps -f output; App_user and App_baseDIR are assumed to be set elsewhere
  read -r x pid x < <( ps -fu "$App_user" | grep -E " ($App_baseDIR/$1/|)$svc.jar$" ||: )
  [[ -n "$pid" ]] || return 0        # nothing matched, so there is nothing to stop
  local -i starting="$(date +%s)"    # Linux epoch timestamp in seconds
  while [[ -d "/proc/$pid" ]]
  do
    ps -fp "$pid"
    kill -TERM "$pid"
    if (( ( $(date +%s) - starting ) < 20 ))   # been trying for less than 20s
    then
      sleep 2
      date
    else
      echo "$svc is hung - using a hard stop"
      kill -KILL "$pid"
      break
    fi
  done
  sleep 2
  [[ ! -d "/proc/$pid" ]]            # return 0 only if the process is gone
}
Basically, the kill -15 is a term signal, which the process can catch to trigger a graceful shutdown: closing pipes, sockets, and files, cleaning up temp space, and so on. To be useful, it has to be given some time. The -9 is a kill signal and can't be caught. It's the Big Hammer that you use to squish jobs that are misbehaving, and it should be reserved for those cases.
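The difference between the two signals can be seen in a small experiment. This is a sketch under assumptions, not something from the original answer: the worker function and the scratch files are made up for the demonstration. The worker traps TERM to remove its scratch file, while KILL gives it no chance to run that handler.

```shell
# Hypothetical worker: installs a TERM handler that cleans up, then idles.
worker() {
    scratch=$1
    trap 'rm -f "$scratch"; exit 0' TERM
    while :; do sleep 1; done
}

# Case 1: SIGTERM is caught, so the handler runs and the worker exits cleanly.
scratch_term=$(mktemp)
worker "$scratch_term" &
pid=$!
sleep 1                                  # give the worker time to install its trap
kill -TERM "$pid"
term_status=0
wait "$pid" || term_status=$?            # 0: the handler called exit 0
if [ -e "$scratch_term" ]; then term_cleaned=no; else term_cleaned=yes; fi

# Case 2: SIGKILL cannot be caught, so the handler never runs.
scratch_kill=$(mktemp)
worker "$scratch_kill" &
pid=$!
sleep 1
kill -KILL "$pid"
kill_status=0
wait "$pid" || kill_status=$?            # 137 = 128 + 9 (terminated by signal 9)
if [ -e "$scratch_kill" ]; then kill_cleaned=no; else kill_cleaned=yes; fi
rm -f "$scratch_kill"                    # tidy up what the killed worker left behind

echo "TERM: status=$term_status cleaned=$term_cleaned"
echo "KILL: status=$kill_status cleaned=$kill_cleaned"
```

After kill -KILL the scratch file is still there, which is the kind of leftover state (lock files, PID files, open sockets) that can make a service believe the old process is still running.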
You are totally right, this makes little sense. If you're going to use the -9 so soon, you might as well skip the careless attempt at better practice and just remove the -15.
Source: https://stackoverflow.com/questions/55796667/linux-kill-command-9-vs-15