Numpy element-wise dot product

前端未结

关注

 2  935

is there an elegant, numpy way to apply the dot product elementwise? Or how can the below code be translated into a nicer version?

m0 # shape (5, 3, 2, 2)
m1


                      
              相关标签:


      
      
        
          2条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  眼角桃花        
                
              
                            
                2020-11-30 08:45
              
            
            
                                                                       
I think numpy.inner() is what you really want?
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  不思量自难忘°        
                
              
                            
                2020-11-30 08:53
              
            
            
                                                                       
Approach #1

Use np.einsum -

np.einsum('ijkl,ilm->ijkm',m0,m1)


Steps involved :


Keep the first axes from the inputs aligned.
Lose the last axis from m0 against second one from m1 in sum-reduction.
Let remaining axes from m0 and m1 spread-out/expand with elementwise multiplications in an outer-product fashion.




Approach #2

If you are looking for performance and with the axis of sum-reduction having a smaller length, you are better off with one-loop and using matrix-multiplication with np.tensordot, like so -

s0,s1,s2,s3 = m0.shape
s4 = m1.shape[-1]
r = np.empty((s0,s1,s2,s4))
for i in range(s0):
    r[i] = np.tensordot(m0[i],m1[i],axes=([2],[0]))




Approach #3

Now, np.dot could be efficiently used on 2D inputs for some further performance boost. So, with it, the modified version, though a bit longer one, but hopefully the most performant one would be -

s0,s1,s2,s3 = m0.shape
s4 = m1.shape[-1]
m0.shape = s0,s1*s2,s3   # Get m0 as 3D for temporary usage
r = np.empty((s0,s1*s2,s4))
for i in range(s0):
    r[i] = m0[i].dot(m1[i])
r.shape = s0,s1,s2,s4
m0.shape = s0,s1,s2,s3  # Put m0 back to 4D


Runtime test

Function definitions -

def original_app(m0, m1):
    s0,s1,s2,s3 = m0.shape
    s4 = m1.shape[-1]
    r = np.empty((s0,s1,s2,s4))
    for i in range(s0):
        for j in range(s1):
            r[i, j] = np.dot(m0[i, j], m1[i])
    return r

def einsum_app(m0, m1):
    return np.einsum('ijkl,ilm->ijkm',m0,m1)

def tensordot_app(m0, m1):
    s0,s1,s2,s3 = m0.shape
    s4 = m1.shape[-1]
    r = np.empty((s0,s1,s2,s4))
    for i in range(s0):
        r[i] = np.tensordot(m0[i],m1[i],axes=([2],[0]))
    return r        

def dot_app(m0, m1):
    s0,s1,s2,s3 = m0.shape
    s4 = m1.shape[-1]
    m0.shape = s0,s1*s2,s3   # Get m0 as 3D for temporary usage
    r = np.empty((s0,s1*s2,s4))
    for i in range(s0):
        r[i] = m0[i].dot(m1[i])
    r.shape = s0,s1,s2,s4
    m0.shape = s0,s1,s2,s3  # Put m0 back to 4D
    return r


Timings and verification -

In [291]: # Inputs
     ...: m0 = np.random.rand(50,30,20,20)
     ...: m1 = np.random.rand(50,20,20)
     ...: 

In [292]: out1 = original_app(m0, m1)
     ...: out2 = einsum_app(m0, m1)
     ...: out3 = tensordot_app(m0, m1)
     ...: out4 = dot_app(m0, m1)
     ...: 
     ...: print np.allclose(out1, out2)
     ...: print np.allclose(out1, out3)
     ...: print np.allclose(out1, out4)
     ...: 
True
True
True

In [293]: %timeit original_app(m0, m1)
     ...: %timeit einsum_app(m0, m1)
     ...: %timeit tensordot_app(m0, m1)
     ...: %timeit dot_app(m0, m1)
     ...: 
100 loops, best of 3: 10.3 ms per loop
10 loops, best of 3: 31.3 ms per loop
100 loops, best of 3: 5.12 ms per loop
100 loops, best of 3: 4.06 ms per loop

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复