How to consume an mpi4py application from a serial Python script

旧城冷巷雨未停 submitted on 2019-12-01 20:26:48

This is indeed possible, and it is described in the mpi4py documentation in the section Dynamic Process Management. What you need is the so-called Spawn functionality. Note that Spawn is not available with MSMPI (in case you are working on Windows); see also Spawn not implemented in MSMPI.

Example

The first file provides a kind of wrapper around your function to hide all the MPI details, which I guess is your intention. Internally, it runs the "actual" script containing your parallel code in 4 newly spawned processes.

Finally, you can open a Python terminal and call:

from my_prog import parallel_fun

parallel_fun()
# Hi from 0/4
# Hi from 3/4
# Hi from 1/4
# Hi from 2/4
# We got the magic number 6
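
The magic number comes from the sum reduction in child.py: each of the 4 spawned processes contributes its own rank. A serial sketch of that arithmetic, with no MPI involved (`n_children`, `ranks`, and `magic_number` are illustrative names, not part of the original files):

```python
# Serial emulation of the MPI.SUM reduction over the child ranks.
n_children = 4                    # matches maxprocs=4 in my_prog.py
ranks = list(range(n_children))   # ranks 0, 1, 2, 3
magic_number = sum(ranks)         # the value the parent receives from Reduce
print(magic_number)               # prints 6
```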

my_prog.py

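import sys
import numpy as np
from mpi4py import MPI


def parallel_fun():
    # Spawn 4 new Python processes that each run child.py
    comm = MPI.COMM_SELF.Spawn(
        sys.executable,
        args=['child.py'],
        maxprocs=4)

    # Receive buffer for the reduced value
    N = np.array(0, dtype='i')

    # On the parent side of the intercommunicator, root=MPI.ROOT marks this
    # process as the receiver; the send buffer is unused here
    comm.Reduce(None, [N, MPI.INT], op=MPI.SUM, root=MPI.ROOT)

    print(f'We got the magic number {N}')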

Here is the child file with the parallel code:

child.py

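from mpi4py import MPI
import numpy as np

# Intercommunicator that connects the spawned children to the parent
comm = MPI.Comm.Get_parent()

print(f'Hi from {comm.Get_rank()}/{comm.Get_size()}')
# Each child contributes its own rank to the sum
N = np.array(comm.Get_rank(), dtype='i')

# root=0 refers to rank 0 of the parent group; children only send
comm.Reduce([N, MPI.INT], None, op=MPI.SUM, root=0)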
Mark

Unfortunately, I don't think this is possible, as you have to run the MPI code specifically with mpirun.

The best you can do is the opposite: write generic chunks of code that can be called either by an MPI process or by a normal Python process.
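
One way to read this suggestion, as a minimal sketch rather than a definitive pattern: keep the numerical work in plain functions that take a rank and a size, and look those two values up from mpi4py only when it is available. The names `mpi_rank_and_size`, `local_chunk`, and `partial_sum` are hypothetical, and the round-robin split is just one possible way to divide the work.

```python
def mpi_rank_and_size():
    """Return (rank, size) under MPI, or (0, 1) for a plain Python process."""
    try:
        from mpi4py import MPI  # imported lazily so serial use does not need mpi4py
    except ImportError:
        return 0, 1
    comm = MPI.COMM_WORLD
    return comm.Get_rank(), comm.Get_size()


def local_chunk(items, rank, size):
    """Round-robin slice of the work that belongs to this rank."""
    return items[rank::size]


def partial_sum(items, rank=0, size=1):
    """Sum only this rank's share; with the defaults this is the full serial sum."""
    return sum(local_chunk(items, rank, size))


data = list(range(10))
print(partial_sum(data))        # serial call: sums the whole list
print(partial_sum(data, 1, 4))  # the share that rank 1 of 4 would compute
```

Under an MPI launcher, each process would call `mpi_rank_and_size()` and pass the result to `partial_sum`; combining the partial results would still need an MPI reduction.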

The only other solution is to wrap the whole MPI part of your code in an external call and invoke it with subprocess from your non-MPI code. However, this will be tied quite heavily to your system configuration and is not really portable.

Subprocess is detailed in the thread Using python with subprocess Popen, which is worth a look. The complexity here is making the correct call in the first place, i.e.

command = "/your/instance/of/mpirun /your/instance/of/python your_script.py -arguments"

and then getting the result back into your single-threaded code. Depending on the size of the result there are many ways to do this, but something like parallel HDF5 would be a good place to look if you have to pass back big array data.
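
A hedged sketch of the subprocess route described above. It assumes the launcher accepts `-n` for the process count (common for mpirun and mpiexec, but check your installation), and the helper names `build_mpi_command` and `run_mpi_script` are hypothetical. Small results come back through standard output here; large arrays would need a file-based exchange instead.

```python
import subprocess
import sys


def build_mpi_command(script, nprocs=4, launcher="mpirun",
                      python=sys.executable, script_args=()):
    """Assemble the launcher call as an argument list (no shell quoting needed)."""
    return [launcher, "-n", str(nprocs), python, script, *script_args]


def run_mpi_script(script, **kwargs):
    """Run the MPI script and return what it printed to standard output.

    Raises subprocess.CalledProcessError if the launcher exits with an error.
    """
    cmd = build_mpi_command(script, **kwargs)
    completed = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return completed.stdout


# Inspect the command without running it; the paths mirror the placeholders above.
print(build_mpi_command("your_script.py", nprocs=2,
                        launcher="/your/instance/of/mpirun",
                        python="/your/instance/of/python"))
```

Passing the command as a list avoids shell parsing, but the launcher and interpreter paths remain specific to each system, which is the portability concern raised above.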

Sorry I can't give you an easy solution.
