Optimized matrix multiplication in C


一整个雨季 2020-11-30 01:44

I'm trying to compare different methods for matrix multiplication. The first one is the normal method:

do
{
    for (j = 0; j < i; j++)
    {
        for (k


      
      
        
13 answers

        
                    
            
            
                         
                
              
              
                
我在风中等你 (original poster) 2020-11-30 02:27
              

            
            
                        
You should not write matrix multiplication yourself; depend on external libraries instead.  In particular, use the GEMM routine from the BLAS library.  GEMM implementations often provide the following optimizations:

Blocking

Efficient matrix multiplication relies on partitioning the matrices into blocks and performing several smaller block multiplies.  Ideally, each block is sized to fit in cache, which greatly improves performance.
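To make the blocking idea concrete, here is a minimal sketch in C. It is only an illustration, not what any particular BLAS does internally: the function name matmul_blocked, the square row-major layout, and the tile edge BLOCK = 32 are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tile edge; the best value depends on the cache size. */
#define BLOCK 32

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* C = A * B for n x n row-major matrices, computed tile by tile. */
void matmul_blocked(size_t n, const double *A, const double *B, double *C)
{
    assert(A != NULL && B != NULL && C != NULL);

    /* Start from zero because the tiles accumulate into C. */
    for (size_t i = 0; i < n * n; i++)
        C[i] = 0.0;

    /* The three outer loops walk over tiles of A, B and C. */
    for (size_t ii = 0; ii < n; ii += BLOCK) {
        size_t i_end = min_sz(ii + BLOCK, n);
        for (size_t kk = 0; kk < n; kk += BLOCK) {
            size_t k_end = min_sz(kk + BLOCK, n);
            for (size_t jj = 0; jj < n; jj += BLOCK) {
                size_t j_end = min_sz(jj + BLOCK, n);

                /* The inner loops multiply one pair of tiles, so the
                   data they touch can stay in cache while it is reused. */
                for (size_t i = ii; i < i_end; i++) {
                    for (size_t k = kk; k < k_end; k++) {
                        double a = A[i * n + k];
                        for (size_t j = jj; j < j_end; j++)
                            C[i * n + j] += a * B[k * n + j];
                    }
                }
            }
        }
    }
}
```

Calling matmul_blocked(n, A, B, C) on three n*n arrays of double fills C with the product. The min_sz calls handle the partial tiles at the edges when n is not a multiple of BLOCK.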

Tuning

The ideal block size depends on the underlying memory hierarchy (how big is the cache?).  As a result, libraries should be tuned and compiled for each specific machine.  The ATLAS implementation of BLAS, among others, does this.
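As a rough illustration of what tuning for a specific machine means, the sketch below times a tiled multiply for a few candidate block sizes and returns the fastest one. It is a toy, not ATLAS's actual procedure, which is far more elaborate; the names matmul_tiled and pick_block_size, the use of clock(), and the single timing run per candidate are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* C = A * B for n x n row-major matrices, using bs x bs tiles. */
static void matmul_tiled(size_t n, size_t bs,
                         const double *A, const double *B, double *C)
{
    for (size_t i = 0; i < n * n; i++)
        C[i] = 0.0;
    for (size_t ii = 0; ii < n; ii += bs)
        for (size_t kk = 0; kk < n; kk += bs)
            for (size_t jj = 0; jj < n; jj += bs)
                for (size_t i = ii; i < ii + bs && i < n; i++)
                    for (size_t k = kk; k < kk + bs && k < n; k++) {
                        double a = A[i * n + k];
                        for (size_t j = jj; j < jj + bs && j < n; j++)
                            C[i * n + j] += a * B[k * n + j];
                    }
}

/* Returns the candidate tile size with the lowest measured CPU time,
   or 0 if memory allocation fails. */
size_t pick_block_size(size_t n, const size_t *candidates, size_t count)
{
    assert(candidates != NULL && count > 0);

    double *A = malloc(n * n * sizeof *A);
    double *B = malloc(n * n * sizeof *B);
    double *C = malloc(n * n * sizeof *C);
    if (!A || !B || !C) {
        free(A); free(B); free(C);
        return 0;
    }
    for (size_t i = 0; i < n * n; i++) { A[i] = 1.0; B[i] = 2.0; }

    size_t best = candidates[0];
    clock_t best_time = 0;
    for (size_t c = 0; c < count; c++) {
        assert(candidates[c] > 0);
        clock_t start = clock();
        matmul_tiled(n, candidates[c], A, B, C);
        clock_t elapsed = clock() - start;

        /* Read one result so the compiler cannot discard the work. */
        volatile double sink = C[0];
        (void)sink;

        if (c == 0 || elapsed < best_time) {
            best_time = elapsed;
            best = candidates[c];
        }
    }
    free(A); free(B); free(C);
    return best;
}
```

For example, pick_block_size(512, sizes, 3) with sizes = {16, 32, 64} returns whichever of the three ran fastest on the current machine. A single run per candidate is noisy, so a serious tuner would repeat each measurement and use matrices large enough to exceed the cache.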

Assembly Level Optimization

Matrix multiplication is so common that developers optimize it by hand in assembly.  GotoBLAS is one library that does this.

Heterogeneous (GPU) Computing

Matrix multiplication is very FLOP/compute intensive, making it an ideal candidate to run on GPUs.  cuBLAS and MAGMA are good choices for this.

In short, dense linear algebra is a well-studied topic.  People devote their lives to improving these algorithms.  You should use their work; it will make them happy.
    
             
                                                        
            
            
              
                