问题
I'm currently reading Programming Erlang! , at the end of Chapter 13, we want to create a keep-alive process, the example likes:
on_exit(Pid, Fun) ->
spawn(fun() ->
Ref = monitor(process, Pid),
receive
{'DOWN', Ref, process, Pid, Info} ->
Fun(Info)
end
end).
keep_alive(Name, Fun) ->
register(Name, Pid = spawn(Fun)),
on_exit(Pid, fun(_Why) -> keep_alive(Name, Fun) end).
but when between register/2 and on_exit/2 the process maybe exit, so the monitor will failed, I changed the keep_alive/2 like this:
keep_alive(Name, Fun) ->
{Pid, Ref} = spawn_monitor(Fun),
register(Name, Pid),
receive
{'DOWN', Ref, process, Pid, _Info} ->
keep_alive(Name, Fun)
end.
There also an bug, between spawn_monitor/2 and register/2, the process maybe exit. How could this come to run successfully? Thanks.
回答1:
I'm not sure that you have a problem that needs solving. Monitor/2 will succeed even if your process exits after register/2. Monitor/2 will send a 'DOWN' message whose Info component will be noproc. Per the documentation:
A 'DOWN' message will be sent to the monitoring process if Item dies, if Item does not exist, or if the connection is lost to the node which Item resides on. (see http://www.erlang.org/doc/man/erlang.html#monitor-2).
So, in your original code
- register assocates Name to the Pid
- Pid dies
- on_exit is called and monitor/2 is executed
- monitor immediately sends a 'DOWN' message which is received by the function spawned by on_exit
- the Fun(Info) of the received statement is executed calling keep_alive/2
I think all is good.
回答2:
So why you did't want to use erlang supervisor behaviour? it's provides useful functions for creating and restarting keep-alive processes.
See here the example: http://www.erlang.org/doc/design_principles/sup_princ.html
回答3:
In your second example, if process exits before registration register will fail with badarg. The easiest way to get around that would be surrounding register with try ... catch and handle error in catch.
You can even leave catch empty, because even if registration failed, the 'DOWN' message, will be sent.
On the other hand, I wouldn't do that in production system. If your worker fails so fast, it is very likely, that the problem is in its initialisation code and I would like to know, that it failed to register and stopped the system. Otherwise, it could fail and be respawned in an endless loop.
来源:https://stackoverflow.com/questions/25617769/how-to-create-a-keep-alive-process-in-erlang