I have a code structure that looks like this:
class A:
    def __init__(self):
        processes = []
        for i in range(1000):
            p = Process(target=self.RunProcess, args=i)
            processes.append[p]
There are a couple of syntax issues that I can see in your code:
args in Process expects a tuple, but you pass an integer; please change line 5 to:
p = Process(target=self.RunProcess, args=(i,))
list.append is a method, so its argument should be enclosed in (), not []; please change line 6 to:
processes.append(p)
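Note the trailing comma in args=(i,): (i) is just a parenthesised integer, whereas (i,) is a one-element tuple, which is what Process expects. With both fixes applied, the loop body becomes:

    p = Process(target=self.RunProcess, args=(i,))
    processes.append(p)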
As @qarma points out, it's not good practice to start the processes in the class constructor. I would structure the code as follows (adapting your example):
import multiprocessing as mp
from time import sleep

class A(object):
    def __init__(self, *args, **kwargs):
        # do other stuff
        pass

    def do_something(self, i):
        sleep(0.2)
        print('%s * %s = %s' % (i, i, i*i))

    def run(self):
        processes = []
        for i in range(1000):
            p = mp.Process(target=self.do_something, args=(i,))
            processes.append(p)
        [x.start() for x in processes]

if __name__ == '__main__':
    a = A()
    a.run()
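If the caller should wait for all the workers to finish, the usual addition (not part of the original answer) is to join them at the end of run():

    [x.join() for x in processes]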
A practical work-around is to break down your class, e.g. like this:
class A:
    def __init__(self, ...):
        pass

    def compute(self):
        procs = [Process(self.run, ...) for ... in ...]
        [p.start() for p in procs]
        [p.join() for p in procs]

    def run(self, ...):
        pass

pool = A(...)
pool.compute()
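Filled in with concrete placeholders (njobs and the squaring in run are just illustrative, not from the original), the same pattern might look like:

import multiprocessing as mp

class A:
    def __init__(self, njobs):
        self.njobs = njobs

    def compute(self):
        # Start the workers only when explicitly asked to, not in __init__.
        procs = [mp.Process(target=self.run, args=(i,)) for i in range(self.njobs)]
        [p.start() for p in procs]
        [p.join() for p in procs]

    def run(self, i):
        print(i * i)

if __name__ == '__main__':
    pool = A(10)
    pool.compute()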
When you fork a process inside __init__, the class instance self may not be fully initialised yet, so it's odd to ask a subprocess to execute self.run, although technically, yes, it's possible.
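A contrived sketch of why that is fragile (the attribute name data is purely for illustration):

from multiprocessing import Process

class Fragile:
    def __init__(self):
        p = Process(target=self.run)
        p.start()                     # child gets a snapshot of a half-built self
        self.data = list(range(10))   # only set *after* the child was started
        p.join()

    def run(self):
        # Fails with AttributeError in the child: self.data did not exist yet
        # when the process was forked/pickled.
        print(sum(self.data))

if __name__ == '__main__':
    Fragile()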
If it's not that, then it sounds like an instance of this issue:
http://bugs.python.org/issue11240
It should simplify things for you to use a Pool. As far as speed goes, starting up the processes does take time. However, using a Pool, as opposed to running njobs separate Process instances, should be about as fast as you can get it to run with processes. The default setting for a Pool (as used below) is to use the maximum number of processes available (i.e. the number of CPUs you have) and to keep farming out new jobs to a worker as soon as a job completes. You won't get njobs-way parallelism, but you'll get as much parallelism as your CPUs can handle without oversubscribing your processors.

I'm using pathos, which has a fork of multiprocessing, because it's a bit more robust than standard multiprocessing… and, well, I'm also the author. But you could probably use multiprocessing for this.
>>> from pathos.multiprocessing import ProcessingPool as Pool
>>> class A(object):
...     def __init__(self, njobs=1000):
...         self.map = Pool().map
...         self.njobs = njobs
...         self.start()
...     def start(self):
...         self.result = self.map(self.RunProcess, range(self.njobs))
...         return self.result
...     def RunProcess(self, i):
...         return i*i
...
>>> myA = A()
>>> myA.result[:11]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
>>> myA.njobs = 3
>>> myA.start()
[0, 1, 4]
It's a bit of an odd design to start the Pool inside of __init__. But if you want to do that, you have to get results from something like self.result… and you can use self.start for subsequent calls.
Get pathos here: https://github.com/uqfoundation
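Since the answer notes that standard multiprocessing could probably be used as well, here is a minimal sketch of the same idea with the standard library on Python 3 (where bound methods pickle fine); the names run_process and njobs are just carried over for illustration:

import multiprocessing as mp

class A(object):
    def __init__(self, njobs=1000):
        self.njobs = njobs

    def start(self):
        # Build the pool here (not in __init__) so that `self` stays
        # picklable when the bound method is shipped to the workers.
        with mp.Pool() as pool:          # defaults to os.cpu_count() workers
            self.result = pool.map(self.run_process, range(self.njobs))
        return self.result

    def run_process(self, i):
        return i * i

if __name__ == '__main__':
    a = A(njobs=11)
    print(a.start())    # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]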