Why are multiprocessing.sharedctypes assignments so slow?

Asked by 你的背包, 2020-12-16 03:36

Here's a little benchmarking code to illustrate my question:

import numpy as np
import multiprocessing as mp
# allocate memory
%time temp = mp.RawArray(np.ctypeslib.ctypes.c_uint16, int(1e8))

# assign memory, very slow
%time temp[:] = np.arange(1e8, dtype = np.uint16)
3 Answers
  • 2020-12-16 03:54

    This is slow for the reasons given in your second link, and the solution is actually pretty simple: bypass the (slow) RawArray slice-assignment code. In this case, it inefficiently reads one raw C value at a time from the source array to create a Python object, converts it straight back to raw C for storage in the shared array, discards the temporary Python object, and repeats 1e8 times.

    But you don't need to do it that way. Like most C-level objects, RawArray implements the buffer protocol, which means you can convert it to a memoryview: a view of the underlying raw memory that implements most operations in C-like ways, using raw memory operations where possible. So instead of doing:

    # assign memory, very slow
    %time temp[:] = np.arange(1e8, dtype = np.uint16)
    Wall time: 9.75 s  # Updated to what my machine took, for valid comparison
    

    use memoryview to manipulate it as a raw bytes-like object and assign that way (np.arange already implements the buffer protocol, and memoryview's slice assignment operator seamlessly uses it):

    # C-like memcpy effectively, very fast
    %time memoryview(temp)[:] = np.arange(1e8, dtype = np.uint16)
    Wall time: 74.4 ms  # Takes 0.76% of original time!!!
    

    Note that the time for the latter is in milliseconds, not seconds; copying via a memoryview wrapper that performs raw memory transfers takes less than 1% of the time of the plodding way RawArray does it by default!
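
    Putting it together, here is a self-contained sketch of the pattern (the frombuffer check at the end is just for verification and isn't part of the timings above):

    import numpy as np
    import multiprocessing as mp

    n = int(1e8)
    temp = mp.RawArray(np.ctypeslib.ctypes.c_uint16, n)
    src = np.arange(n, dtype=np.uint16)

    # bulk copy through the buffer protocol: no per-element Python objects
    # (on some builds you may need memoryview(temp).cast('B').cast('H');
    # see the next answer)
    memoryview(temp)[:] = src

    # verification only: view the shared buffer as a numpy array and compare
    assert np.array_equal(np.frombuffer(temp, dtype=np.uint16), src)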

  • 2020-12-16 04:06

    Just wrap a numpy array around the shared array:

    import numpy as np
    import multiprocessing as mp
    
    sh = mp.RawArray('i', int(1e8))     # shared array of 1e8 C ints
    x = np.arange(1e8, dtype=np.int32)  # source data to copy in
    sh_np = np.ctypeslib.as_array(sh)   # numpy view over the shared buffer
    

    then time:

    %time sh[:] = x
    CPU times: user 10.1 s, sys: 132 ms, total: 10.3 s
    Wall time: 10.2 s
    
    %time memoryview(sh).cast('B').cast('i')[:] = x
    CPU times: user 64 ms, sys: 132 ms, total: 196 ms
    Wall time: 196 ms
    
    %time sh_np[:] = x
    CPU times: user 92 ms, sys: 104 ms, total: 196 ms
    Wall time: 196 ms
    

    No need to figure out how to cast the memoryview (as I had to on Python 3 under Ubuntu 16), and no messing with reshaping (if x has more dimensions, cast() flattens it; see the sketch below). And you can use sh_np.dtype.name to double-check data types, just like with any numpy array. :)
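
    For the multi-dimensional case, a minimal sketch (the shape here is made up for illustration):

    import numpy as np
    import multiprocessing as mp

    rows, cols = 10000, 1000            # hypothetical 2-D shape
    sh = mp.RawArray('i', rows * cols)
    x = np.arange(rows * cols, dtype=np.int32).reshape(rows, cols)

    # wrap once, reshape the view; no casting or flattening needed
    sh_np = np.ctypeslib.as_array(sh).reshape(rows, cols)
    sh_np[:] = x                        # fast bulk copy into shared memory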

  • On ms-windows, when you create a Process, a new Python interpreter will be spawned, which then imports your program as a module. (This is why on ms-windows you should only create a Process or Pool from within an if __name__ == "__main__" block; see the sketch below.) This will recreate your array, which should take about the same time as creating it originally did. See the programming guidelines, especially concerning the spawn start method, which has to be used on ms-windows.
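
    A minimal sketch of that guideline (the worker function is illustrative, not from the question):

    import multiprocessing as mp

    def worker():
        # module-level code runs again in each spawned child, so the
        # expensive allocation below must stay under the __main__ guard
        print("child running")

    if __name__ == "__main__":
        # on ms-windows (spawn start method) this block runs only in the parent
        temp = mp.RawArray('H', int(1e8))   # expensive allocation, done once
        p = mp.Process(target=worker)
        p.start()
        p.join()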

    So a better way is probably to create a memory-mapped numpy array using numpy.memmap. Write the array to disk in the parent process (on ms-windows this must be done inside the if __name__ == "__main__" block, so it only runs once). Then, in the target function, use numpy.memmap in read-only mode to read the data.
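
    A sketch of that approach (the file name is a placeholder; dtype and size follow the question):

    import numpy as np
    import multiprocessing as mp

    FNAME = "shared_data.dat"   # hypothetical scratch file (~200 MB here)
    N = int(1e8)

    def worker():
        # child maps the same file read-only; no copy of the data is made
        data = np.memmap(FNAME, dtype=np.uint16, mode="r", shape=(N,))
        print(data[:10])

    if __name__ == "__main__":
        # parent writes the array to disk once
        data = np.memmap(FNAME, dtype=np.uint16, mode="w+", shape=(N,))
        data[:] = np.arange(N, dtype=np.uint16)
        data.flush()

        p = mp.Process(target=worker)
        p.start()
        p.join()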
