Fastest Implementation of Exponential Function Using AVX

忘掉有多难 2020-11-29 08:42
I'm looking for an efficient (fast) approximation of the exponential function operating on AVX elements (single-precision floating point). Namely - __m256 _mm256_exp_

      
      
        
4 answers

野趣味 (original poster) 2020-11-29 09:25

You can approximate the exponential yourself with a Taylor series:

exp(z) = 1 + z + pow(z,2)/2 + pow(z,3)/6 + pow(z,4)/24 + ...


For that you only need the AVX addition and multiplication operations. Hard-code coefficients like 1/2, 1/6, 1/24, etc. and multiply by them; that is faster than dividing.

Take as many terms of the series as your precision requires. Note that the error you get is relative: for small z it may be 1e-6 in absolute terms, while for large z the absolute error will exceed 1e-6, yet the relative error abs(E-E1)/abs(E) still stays below 1e-6 (where E is the exact exponential and E1 is the approximation).
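
As a rough illustration of the series above, here is a minimal sketch that evaluates the terms up to pow(z,6)/720 in Horner form, so each step is one AVX multiply and one AVX add. The function names, the choice of seven terms, and the GCC/Clang target attribute are my own assumptions, not part of the original answer, and the result is only reasonable for small |z|.

```c
#include <immintrin.h>

/* Hypothetical helper (name and degree are illustrative): truncated Taylor
   series 1 + z + z^2/2 + ... + z^6/720 in Horner form, using only AVX
   multiply and add. Intended for small |z| only. */
__attribute__((target("avx")))
static inline __m256 exp_taylor6_ps(__m256 z)
{
    /* Hard-coded reciprocals of the factorials, highest power first. */
    const float c[7] = { 1.0f / 720.0f, 1.0f / 120.0f, 1.0f / 24.0f,
                         1.0f / 6.0f, 0.5f, 1.0f, 1.0f };
    __m256 p = _mm256_set1_ps(c[0]);
    for (int i = 1; i < 7; ++i)
        p = _mm256_add_ps(_mm256_mul_ps(p, z), _mm256_set1_ps(c[i]));
    return p;
}

/* Convenience wrapper: reads 8 floats, writes 8 results (unaligned access). */
__attribute__((target("avx")))
void exp_taylor6_store8(const float *in, float *out)
{
    _mm256_storeu_ps(out, exp_taylor6_ps(_mm256_loadu_ps(in)));
}
```

Calling exp_taylor6_store8 on an array of 8 floats fills the output array with the approximations. With inputs in about [-0.5, 0.5] the truncation error is on the order of 1e-6; adding terms extends the usable range at the cost of more multiply-add steps.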

UPDATE: As @Peter Cordes has mentioned in a comment, precision can be improved by separating exponentiation of integer and fractional parts, handling the integer part by manipulating the exponent field of the binary float representation (which is based on 2^x, not e^x).  Then your Taylor series only has to minimize error over a small range.
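
The split described in the update can be sketched as exp(x) = 2^n * exp(r), with n = round(x / ln 2) and r = x - n * ln 2, so the series only has to cover |r| up to about 0.35. The code below is a minimal sketch under my own assumptions: the function names are hypothetical, it relies on AVX2 for the 256-bit integer shift and add, it uses the GCC/Clang target attribute, and it does not handle overflow, underflow, NaN, or infinity.

```c
#include <immintrin.h>

/* Hypothetical sketch: exp(x) = 2^n * exp(r), n = round(x / ln 2),
   r = x - n * ln 2. Assumes AVX2 and inputs roughly within [-80, 80];
   no special-case handling (overflow, underflow, NaN, infinity). */
__attribute__((target("avx2")))
static inline __m256 exp_reduced_ps(__m256 x)
{
    const __m256 log2e = _mm256_set1_ps(1.44269504f);  /* 1 / ln 2 */
    const __m256 ln2   = _mm256_set1_ps(0.69314718f);

    /* n = nearest integer to x / ln 2, still stored as float. */
    __m256 n = _mm256_round_ps(_mm256_mul_ps(x, log2e),
                               _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
    /* r = x - n * ln 2, so |r| <= ln(2)/2. */
    __m256 r = _mm256_sub_ps(x, _mm256_mul_ps(n, ln2));

    /* Taylor series of exp(r) up to r^6/720, Horner form. */
    const float c[7] = { 1.0f / 720.0f, 1.0f / 120.0f, 1.0f / 24.0f,
                         1.0f / 6.0f, 0.5f, 1.0f, 1.0f };
    __m256 p = _mm256_set1_ps(c[0]);
    for (int i = 1; i < 7; ++i)
        p = _mm256_add_ps(_mm256_mul_ps(p, r), _mm256_set1_ps(c[i]));

    /* Multiply by 2^n by adding n to the exponent field (bits 23..30). */
    __m256i e = _mm256_slli_epi32(_mm256_cvtps_epi32(n), 23);
    return _mm256_castsi256_ps(_mm256_add_epi32(_mm256_castps_si256(p), e));
}

/* Convenience wrapper: reads 8 floats, writes 8 results (unaligned access). */
__attribute__((target("avx2")))
void exp_reduced_store8(const float *in, float *out)
{
    _mm256_storeu_ps(out, exp_reduced_ps(_mm256_loadu_ps(in)));
}
```

Because the polynomial only ever sees the small remainder r, the same seven terms stay accurate for inputs such as 10 or -10, where the plain series would need many more terms. A production version would likely clamp the input range and might swap the Taylor coefficients for ones fitted to the reduced interval.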
    
             
                                                        
            
            
              
                