Multiprocessing with multiple arguments to function in Python 2.7

Happy的楠姐 2021-01-06 04:08

I'm trying to implement multiprocessing to speed up a replication loop, but cannot get it to work in Python 2.7. This is a very simplified version of my program, based on the

3 Answers
  • 2021-01-06 05:02

    The problem is solved by moving the pool code into a main() function and calling it only under an if __name__ == '__main__' guard:

    import itertools
    from multiprocessing import Pool
    
    def func(g, h, i):
        return g + h + i
    
    def helper(args):
        args2 = args[0] + (args[1],)
        return func(*args2)
    
    def main():
        pool = Pool(processes=4)
        result = pool.map(helper, itertools.izip(itertools.repeat((2, 3)), range(10)))
        print result
    
    if __name__ == '__main__':
        main()
    

    Based on the answer from @ErikAllik I'm thinking that this might be a Windows-specific problem.

    edit: Here is a clear and informative tutorial on multiprocessing in python.
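
The fixed arguments (2, 3) can also be bound up front instead of being packed into a tuple and unpacked in a helper. The following is a minimal sketch of that idea, assuming functools.partial objects can be pickled (as far as I know they can from Python 2.7 on). The name run is a hypothetical wrapper and is not part of the original answer.

```python
from __future__ import print_function  # lets the same file run on 2.7 and 3.x

from functools import partial
from multiprocessing import Pool


def func(g, h, i):
    return g + h + i


def run(values, processes=4):
    """Map func over values with g=2 and h=3 held fixed (hypothetical wrapper)."""
    # Assumption: partial objects are picklable, so they can be sent to workers.
    bound = partial(func, 2, 3)
    pool = Pool(processes=processes)
    try:
        return pool.map(bound, values)
    finally:
        # Pool is not a context manager in Python 2.7, so clean up by hand.
        pool.close()
        pool.join()


if __name__ == '__main__':
    print(run(range(10)))
```

The pool is still created only from inside the guarded block, so the Windows issue described in this thread does not come back.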

  • 2021-01-06 05:10

    On my OS X, with Python 2.7, your code outputs:

    [5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]
    

    I can see your Python paths contain EPD_python27, so maybe try using a vanilla Python distribution, not the Enthought Python Distribution.

    UPDATE: Please see @fileunderwater's answer for a solution; I've run into this once myself, but had totally forgotten about it :)

    Explanation: The problem shows up on Windows because your module contains top-level code. Windows has no fork(), so multiprocessing starts each worker in a fresh interpreter that imports your module; on OS X and Linux, Python 2.7 forks the parent process instead, so the module is not imported again. Any top-level code is executed as soon as the module is imported, which means every worker would try to run the pool-creating code itself. Wrapping that code in main() and calling main() only conditionally (i.e. inside an if __name__ == '__main__' block) prevents this. The guard is also harmless on OS X and Linux, and is generally preferable to putting code directly at module level.
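
The import behaviour described above can be observed without starting any processes. This is only an illustrative sketch: it uses runpy.run_path to execute the same small source once under the name '__main__' and once under another name, which roughly stands in for what a re-import in a worker does. MODULE_SOURCE, run_as and the name worker_copy are made up for the example.

```python
from __future__ import print_function

import os
import runpy
import tempfile

# Hypothetical module source: one unguarded statement and one guarded statement.
MODULE_SOURCE = (
    "events = ['top-level code ran']\n"
    "if __name__ == '__main__':\n"
    "    events.append('guarded code ran')\n"
)


def run_as(run_name):
    """Execute MODULE_SOURCE under the given module name and return its events."""
    fd, path = tempfile.mkstemp(suffix=".py")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(MODULE_SOURCE)
        # run_path returns the module globals after the code has executed.
        return runpy.run_path(path, run_name=run_name)["events"]
    finally:
        os.remove(path)


as_script = run_as("__main__")     # like running the file directly
as_import = run_as("worker_copy")  # stand-in for an import under another name

print(as_script)  # ['top-level code ran', 'guarded code ran']
print(as_import)  # ['top-level code ran']
```

Only the unguarded statement runs in both cases, which is why pool creation belongs behind the guard.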

  • 2021-01-06 05:11

    There's a fork of multiprocessing called pathos (note: use the version on GitHub) that doesn't need starmap, helper functions, or similar workarounds: its map functions mirror the API of Python's built-in map, so map can take multiple arguments. With pathos, you can also generally do multiprocessing in the interpreter, instead of being stuck in the __main__ block. pathos is due for a release after some mild updating, mostly conversion to Python 3.x.

      Python 2.7.5 (default, Sep 30 2013, 20:15:49) 
      [GCC 4.2.1 (Apple Inc. build 5566)] on darwin
      Type "help", "copyright", "credits" or "license" for more information.
      >>> from pathos.multiprocessing import ProcessingPool    
      >>> pool = ProcessingPool(nodes=4)
      >>>
      >>> def func(g,h,i):
      ...   return g+h+i
      ... 
      >>> pool.map(func, [1,2,3],[4,5,6],[7,8,9])
      [12, 15, 18]
      >>>
      >>> # also can pickle stuff like lambdas 
      >>> result = pool.map(lambda x: x**2, range(10))
      >>> result
      [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
      >>>
      >>> # also does asynchronous map
      >>> result = pool.amap(pow, [1,2,3], [4,5,6])
      >>> result.get()
      [1, 32, 729]
      >>>
      >>> # or can return a map iterator
      >>> result = pool.imap(pow, [1,2,3], [4,5,6])
      >>> result
      <processing.pool.IMapIterator object at 0x110c2ffd0>
      >>> list(result)
      [1, 32, 729]
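
For comparison, here is a rough standard-library counterpart of the three-sequence map call above, for readers who do not have pathos installed. It is a sketch only: func_star is a hypothetical helper name, and unlike the pathos session it has to live in a script with a __main__ guard.

```python
from __future__ import print_function

from multiprocessing import Pool


def func(g, h, i):
    return g + h + i


def func_star(args):
    """Unpack one (g, h, i) tuple and call func (hypothetical helper name)."""
    return func(*args)


def main():
    pool = Pool(processes=4)
    try:
        # zip lines up the three sequences the way map(func, a, b, c) would.
        triples = list(zip([1, 2, 3], [4, 5, 6], [7, 8, 9]))
        print(pool.map(func_star, triples))
    finally:
        pool.close()
        pool.join()


if __name__ == '__main__':
    main()
```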
    