Handling Signals in an MPI Application / Gracefully exit

久未见 submitted on 2019-12-02 01:16:10

If your goal is to stop all processes at the same point, then there is no way around always synchronizing at the possible termination points. That is, a collective call at the termination points is required.

Of course, you can try to avoid an extra broadcast by relying on the synchronization of another collective call to ensure proper termination, or by piggybacking the termination information on an existing broadcast, but I don't think that's worth it. After all, you only need to synchronize before I/O and at least once every ten minutes. At that frequency, even a dedicated broadcast is not a performance problem.
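To make the idea concrete, here is a minimal sketch of a collective check at the termination points. It is only an illustration under several assumptions: it uses mpi4py rather than the C API, the names `request_stop`, `stop_requested`, `all_ranks_should_stop` and `main` are hypothetical, the choice of SIGUSR2 and the 100-step interval are arbitrary, and it uses an allreduce with a maximum instead of a broadcast so that a signal delivered to any rank is seen by all of them. Whether a given signal reaches your processes at all depends on the MPI implementation.

```python
import signal

# Set by the signal handler; only read at the termination points.
_stop = False


def request_stop(signum, frame):
    """Signal handler: just record the request, do no MPI calls here."""
    global _stop
    _stop = True


def stop_requested():
    """Return this process's local view of the stop request."""
    return _stop


def all_ranks_should_stop(comm, local_flag, max_op):
    """Collective call: True on every rank if any rank saw a stop request.

    comm is expected to provide an mpi4py-style allreduce(value, op=...);
    max_op is the reduction to use (MPI.MAX in a real run).
    """
    return comm.allreduce(int(local_flag), op=max_op) > 0


def main():
    # Assumption: mpi4py is installed; importing it initializes MPI.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    # Assumption: SIGUSR2 is not reserved by the MPI implementation in use.
    signal.signal(signal.SIGUSR2, request_stop)

    for step in range(1000):
        # ... one step of the actual computation would go here ...
        if step % 100 == 0:  # termination point, e.g. right before I/O
            if all_ranks_should_stop(comm, stop_requested(), MPI.MAX):
                break  # every rank leaves the loop at the same step
```

To try it, one would call `main()` at the bottom of the script and launch it with something like `mpiexec -n 4 python script.py`. Because every rank takes part in the same allreduce at the same step, they all agree on whether to stop, which is exactly the synchronization the answer describes.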

Using signals in your MPI application in general is not safe. Some implementations may support it and others may not.

For instance, in MPICH, SIGUSR1 is used by the process manager for internal notification of abnormal failures.

http://lists.mpich.org/pipermail/discuss/2014-October/003242.html

Open MPI, on the other hand, will forward SIGUSR1 and SIGUSR2 from mpiexec to the other processes.

http://www.open-mpi.org/doc/v1.6/man1/mpirun.1.php#sect14

Other implementations will differ. So before you go too far down this route, make sure that the implementation you're using can deal with it.
