How to run functions in parallel?

我在风中等你 2020-11-22 10:32

I researched first and couldn't find an answer to my question. I am trying to run multiple functions in parallel in Python.

I have something like this:



        
6 Answers
  •  时光取名叫无心
    2020-11-22 10:56

    This can be done elegantly with Ray, a system that allows you to easily parallelize and distribute your Python code.

    To parallelize your example, you'd need to define your functions with the @ray.remote decorator, and then invoke them with .remote.

    import ray
    
    ray.init()
    
    dir1 = 'C:\\folder1'
    dir2 = 'C:\\folder2'
    filename = 'test.txt'
    addFiles = [25, 5, 15, 35, 45, 25, 5, 15, 35, 45]
    
    # Define the functions. 
    # Pass everything a function needs as an argument: each remote function
    # runs in a separate worker process, so it does not share variables
    # with the process that calls it.
    @ray.remote
    def func1(filename, addFiles, dir):
        # func1() code here...
        pass
    
    @ray.remote
    def func2(filename, addFiles, dir):
        # func2() code here...
        pass
    
    # Start two tasks in the background and wait for them to finish.
    ray.get([func1.remote(filename, addFiles, dir1), func2.remote(filename, addFiles, dir2)]) 
    

    If you pass the same large argument to both functions, it is more efficient to store it once with ray.put() and pass the returned reference instead. This avoids serializing the argument twice and creating two copies of it in memory:

    largeData_id = ray.put(largeData)
    
    ray.get([func1.remote(largeData_id), func2.remote(largeData_id)])
    

    Important - If func1() and func2() return results, keep the references returned by .remote() and pass them to ray.get() to retrieve the values:

    ret_id1 = func1.remote(filename, addFiles, dir1)
    ret_id2 = func2.remote(filename, addFiles, dir2)
    ret1, ret2 = ray.get([ret_id1, ret_id2])
    

    Ray has a number of advantages over the multiprocessing module. In particular, the same code runs on a single machine as well as on a cluster of machines. For more advantages of Ray see this related post.
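
For comparison, here is a minimal standard-library sketch of the same submit-then-collect pattern, using concurrent.futures rather than Ray. The bodies of func1 and func2 and the run_both helper are hypothetical placeholders, not the code from the question. The sketch uses ThreadPoolExecutor, which runs the calls on threads and so mainly helps with I/O-bound work such as file handling. ProcessPoolExecutor offers the same interface and runs the calls in separate processes, as the multiprocessing module does.

```python
from concurrent.futures import ThreadPoolExecutor


def func1(filename, add_files, directory):
    # Hypothetical stand-in for func1(): count the entries to process.
    return len(add_files)


def func2(filename, add_files, directory):
    # Hypothetical stand-in for func2(): add up the entries.
    return sum(add_files)


def run_both(filename, add_files, dir1, dir2):
    # Hypothetical helper: run func1 and func2 concurrently, return both results.
    with ThreadPoolExecutor(max_workers=2) as pool:
        # submit() schedules the call and returns a Future immediately.
        future1 = pool.submit(func1, filename, add_files, dir1)
        future2 = pool.submit(func2, filename, add_files, dir2)
        # result() blocks until the call has finished, much like ray.get().
        return future1.result(), future2.result()


if __name__ == "__main__":
    count, total = run_both("test.txt", [25, 5, 15], "C:\\folder1", "C:\\folder2")
    print(count, total)
```

Unlike the Ray version, this only uses the cores of one machine. When switching to ProcessPoolExecutor on Windows, the code that starts the pool has to stay under the `if __name__ == "__main__":` guard.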
