How can I improve performance via a high-level approach when implementing long equations in C++?

孤街浪徒 2021-01-30 19:34
I am developing some engineering simulations. This involves implementing long equations, such as this one for calculating the stress in a rubber-like material:

      
      
        
10 answers

野性不改 (original poster) 2021-01-30 19:52

David Hammen's answer is good, but still far from optimal. Let's continue with his last expression (at the time of writing):

auto l123 = l1 * l2 * l3;
auto cbrt_l123 = cbrt(l123);
T = mu/(3.0*l123)*(  pow(l1/cbrt_l123,a)*(2.0*N1-N2-N3)
                   + pow(l2/cbrt_l123,a)*(2.0*N2-N3-N1)
                   + pow(l3/cbrt_l123,a)*(2.0*N3-N1-N2))
  + K*(l123-1.0)*(N1+N2+N3);


This can be optimised further. In particular, we can avoid the call to cbrt() and one of the three calls to pow() by exploiting some mathematical identities. Let's do this step by step.

// step 1: eliminate cbrt() by folding it into the exponent of pow(), using
// pow(l1/cbrt(l1*l2*l3), a) == pow(l1*l1/(l2*l3), a/3)
auto l123 = l1 * l2 * l3;
auto athird = 0.33333333333333333 * a; // multiply by 1/3 to avoid a division
T = mu/(3.0*l123)*(  (N1+N1-N2-N3)*pow(l1*l1/(l2*l3),athird)
                   + (N2+N2-N3-N1)*pow(l2*l2/(l1*l3),athird)
                   + (N3+N3-N1-N2)*pow(l3*l3/(l1*l2),athird))
  + K*(l123-1.0)*(N1+N2+N3);
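
To check that step 1 really is equivalent to the starting expression, the two forms can be wrapped in functions and compared. This is only a minimal sketch: the function names are hypothetical, and it assumes that all quantities (including N1, N2, N3) are plain doubles and that the stretches l1, l2, l3 are positive.

```cpp
#include <cmath>

// Starting expression: one cbrt() and three pow() calls.
// (Hypothetical wrapper; all arguments assumed to be plain doubles.)
double stress_reference(double l1, double l2, double l3, double a,
                        double mu, double K,
                        double N1, double N2, double N3)
{
    const double l123 = l1 * l2 * l3;
    const double cbrt_l123 = std::cbrt(l123);
    return mu / (3.0 * l123) * (  std::pow(l1 / cbrt_l123, a) * (2.0 * N1 - N2 - N3)
                                + std::pow(l2 / cbrt_l123, a) * (2.0 * N2 - N3 - N1)
                                + std::pow(l3 / cbrt_l123, a) * (2.0 * N3 - N1 - N2))
         + K * (l123 - 1.0) * (N1 + N2 + N3);
}

// Step 1: cbrt() is gone, its effect is folded into the exponent a/3.
double stress_step1(double l1, double l2, double l3, double a,
                    double mu, double K,
                    double N1, double N2, double N3)
{
    const double l123 = l1 * l2 * l3;
    const double athird = a / 3.0;
    return mu / (3.0 * l123) * (  (N1 + N1 - N2 - N3) * std::pow(l1 * l1 / (l2 * l3), athird)
                                + (N2 + N2 - N3 - N1) * std::pow(l2 * l2 / (l1 * l3), athird)
                                + (N3 + N3 - N1 - N2) * std::pow(l3 * l3 / (l1 * l2), athird))
         + K * (l123 - 1.0) * (N1 + N2 + N3);
}
```

Calling both functions with the same arguments should give results that agree up to floating-point rounding; they are not expected to be bit-identical, since the operations are carried out in a different order.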


Note that in step 1 I have also rewritten 2.0*N1 as N1+N1, etc. Next, we can make do with only two calls to pow().

// step 2: eliminate one call to pow()
auto l123 = l1 * l2 * l3;
auto athird = 0.33333333333333333 * a;
auto pow_l1l2_athird = pow(l1/l2,athird);
auto pow_l1l3_athird = pow(l1/l3,athird);
auto pow_l2l3_athird = pow_l1l3_athird/pow_l1l2_athird; // (l2/l3)^athird without calling pow()
T = mu/(3.0*l123)*(  (N1+N1-N2-N3)* pow_l1l2_athird*pow_l1l3_athird
                   + (N2+N2-N3-N1)* pow_l2l3_athird/pow_l1l2_athird
                   + (N3+N3-N1-N2)/(pow_l1l3_athird*pow_l2l3_athird))
  + K*(l123-1.0)*(N1+N2+N3);


Since the calls to pow() are by far the most costly operations here, it is worth reducing them as far as possible (the next most costly operation was the call to cbrt(), which we eliminated in step 1).
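
For reference, here is the two-pow() version as a self-contained function, with each term annotated with the power it stands for. Again this is just a sketch under assumptions: the function name is hypothetical, all inputs are taken to be plain doubles, and the stretches are assumed positive so that the ratios and powers are well defined.

```cpp
#include <cmath>

// Hypothetical wrapper around the step 2 expression.
double stress_two_pow(double l1, double l2, double l3, double a,
                      double mu, double K,
                      double N1, double N2, double N3)
{
    const double l123 = l1 * l2 * l3;
    const double athird = a / 3.0;

    // The only two calls to pow().
    const double p12 = std::pow(l1 / l2, athird);  // (l1/l2)^(a/3)
    const double p13 = std::pow(l1 / l3, athird);  // (l1/l3)^(a/3)
    const double p23 = p13 / p12;                  // (l2/l3)^(a/3), obtained by division

    return mu / (3.0 * l123) * (  (N1 + N1 - N2 - N3) * p12 * p13     // (l1*l1/(l2*l3))^(a/3)
                                + (N2 + N2 - N3 - N1) * p23 / p12     // (l2*l2/(l1*l3))^(a/3)
                                + (N3 + N3 - N1 - N2) / (p13 * p23))  // (l3*l3/(l1*l2))^(a/3)
         + K * (l123 - 1.0) * (N1 + N2 + N3);
}
```

A quick sanity check is the undeformed state l1 = l2 = l3 = 1: both powers are 1, the three bracketed factors sum to zero, and l123 - 1 vanishes, so the result should be zero up to rounding.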

If a happens to be an integer, the calls to pow() can be replaced by calls to cbrt() (plus integer powers); if athird is a half-integer, we can use sqrt() (plus integer powers). Furthermore, if l1==l2, l1==l3, or l2==l3, one or both calls to pow() can be eliminated. So it is worth treating these as special cases if they can realistically occur.
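
The integer and half-integer cases could look roughly like this. The helper names (ipow, pow_athird_integer_a, pow_half_integer) are made up for illustration, and whether this actually beats pow() on a given platform is something to measure rather than assume.

```cpp
#include <cmath>

// base^n for an integer exponent n, by repeated squaring.
double ipow(double base, int n)
{
    // Work with the magnitude of n and invert at the end if n was negative.
    unsigned int e = n < 0 ? 0u - static_cast<unsigned int>(n)
                           : static_cast<unsigned int>(n);
    double result = 1.0;
    while (e != 0u) {
        if (e & 1u) result *= base;
        base *= base;
        e >>= 1;
    }
    return n < 0 ? 1.0 / result : result;
}

// x^(a/3) when a is an integer: cbrt(x)^a, with x > 0 assumed.
double pow_athird_integer_a(double x, int a)
{
    return ipow(std::cbrt(x), a);
}

// x^(k/2) for an integer k, i.e. the case athird == k/2: sqrt(x)^k, with x > 0 assumed.
double pow_half_integer(double x, int k)
{
    return ipow(std::sqrt(x), k);
}
```

In step 2, pow(l1/l2, athird) would then become pow_athird_integer_a(l1/l2, a) when a is known to be an integer, and similarly for the second call. The results agree with pow() only up to rounding, so compare against a tolerance rather than for exact equality.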
    
             
                                                        
            
            
              
                