How can a supervisor that reached_max_restart_intensity only delete the offending child?

Posted by 流过昼夜 on 2019-12-24 02:59:13

Question


I have a one_for_one supervisor that handles similar and totally independent children.

When one child has a problem and crashes repeatedly, the supervisor reports:

=SUPERVISOR REPORT==== 30-Mar-2011::13:10:42 ===
     Supervisor: {local,gateway_sup}
     Context:    shutdown
     Reason:     reached_max_restart_intensity
     Offender:   [{pid,<0.76.0>}, ...

The supervisor then shuts itself down, which also terminates all the innocent children that would otherwise keep running fine.

How can I build a supervision tree out of standard Erlang supervisors that stops restarting only the one offending child and leaves the others alone?

I was thinking about having an extra supervisor with just one single child, but this seems too heavyweight to me.

Any other ways to handle this?


Answer 1:


I think the best solution would be to have two layers of supervision.

The top-level supervisor starts a supervisor + process pair for each gen_server you want running. It is configured with the one_for_one strategy and temporary children.

Each supervisor running under the top-level supervisor would have suitably configured MaxR and MaxT values, so that it crashes once its child misbehaves, that is, once the child has been restarted more than MaxR times within MaxT seconds.
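As a minimal sketch of such a lower-level supervisor: the module names gateway_child_sup and gateway_worker, and the values chosen for MaxR and MaxT, are assumptions made for illustration and do not come from the question. gateway_worker stands for whatever gen_server module you actually run and is assumed to export start_link/1.

```erlang
-module(gateway_child_sup).
-behaviour(supervisor).

-export([start_link/1]).
-export([init/1]).

%% Start one small supervisor per worker. No registered name is used,
%% so many instances of this module can run side by side.
start_link(WorkerArgs) ->
    supervisor:start_link(?MODULE, WorkerArgs).

init(WorkerArgs) ->
    MaxR = 3,    % example value: tolerate at most 3 restarts ...
    MaxT = 10,   % ... within any 10-second period
    %% gateway_worker is a hypothetical gen_server module.
    Worker = {gateway_worker,                              % child id
              {gateway_worker, start_link, [WorkerArgs]},  % {M, F, A}
              transient,   % restart only after an abnormal exit
              5000,        % shutdown timeout in milliseconds
              worker,
              [gateway_worker]},
    {ok, {{one_for_one, MaxR, MaxT}, [Worker]}}.
```

With these example values, a fourth restart inside ten seconds makes this supervisor give up with reached_max_restart_intensity, taking only its own single worker down with it.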

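When the lower level supervisor crashes, the top level supervisor "just doesn't care": a temporary child is never restarted, so its exit does not count toward the top level supervisor's own restart intensity.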

A supervisor started with one child has a total heap size of 233 words (total_heap_size is reported in words rather than bytes, which is still under 2 KB on a 64-bit VM), so memory consumption should not be an issue.

The supervision tree should look like:

supervisor_top
    |
    |
    +------------------------+-----    ...
    |                        |
 supervisor_1               supervisor_2
 restart temporary          restart temporary
    |                        |
  gen_server_1              gen_server_2
  restart transient         restart transient
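
The top of this tree could be sketched roughly as follows. The names gateway_top_sup, start_gateway/1 and gateway_child_sup are hypothetical, and gateway_child_sup is assumed to be a one-child supervisor module that exports start_link/1; the intensity values in init/1 are arbitrary examples.

```erlang
-module(gateway_top_sup).
-behaviour(supervisor).

-export([start_link/0, start_gateway/1]).
-export([init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Add one supervisor + gen_server pair at runtime.
%% Id is used both as the child id and as the argument passed down.
start_gateway(Id) ->
    Spec = {Id,
            {gateway_child_sup, start_link, [Id]},  % hypothetical module
            temporary,    % never restarted, whatever the exit reason
            infinity,     % usual shutdown value for supervisor children
            supervisor,
            [gateway_child_sup]},
    supervisor:start_child(?MODULE, Spec).

init([]) ->
    %% No static children. Temporary children are never restarted,
    %% so their exits do not use up this restart intensity.
    {ok, {{one_for_one, 1, 5}, []}}.
```

Calling gateway_top_sup:start_gateway(Id) once per gateway then gives each gen_server its own restart budget. When one lower-level supervisor gives up, its siblings keep running, and the pair can be started again later with another start_gateway/1 call if that is wanted.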


Source: https://stackoverflow.com/questions/5485736/how-can-a-supervisor-that-reached-max-restart-intensity-only-delete-the-offendin
