Implementing 64-bit arithmetic on a 32-bit machine


The following code computes the product of x and y and stores the result in memory. Data type ll_t is defined to be equivalent to long long.

typedef long lon


                      
3 answers

难免孤独 (2020-12-29 17:55)

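I've converted it to Intel syntax:

mov esi, y_low ; esi = low 32 bits of y
mov eax, x ; eax = x (this is x_low)
mov edx, eax
sar edx, 31 ; edx = x_high: 0 if x >= 0, 0xFFFFFFFF if x < 0
mov ecx, y_high ; ecx = high 32 bits of y

imul ecx, eax ; ecx = y_high *{signed} x (low 32 bits only)

mov ebx, edx ; ebx = x_high

imul ebx, esi ; ebx = sign_extension(x) *{signed} y_low (low 32 bits only)

add ecx, ebx ; ecx = y_high *{signed} x_low + x_high *{signed} y_low

mul esi ; edx:eax = x_low *{unsigned} y_low

lea edx, [ecx + edx] ; edx = high(x_low *{unsigned} y_low + y_high *{signed} x_low + x_high *{signed} y_low)

mov ecx, dest ; ecx = address where the result is stored
mov [ecx], eax ; store the low 32 bits of the product
mov [ecx + 4], edx ; store the high 32 bits of the product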


The code above multiplies two 64-bit signed integers and keeps the least significant 64 bits of the product.

Where does the other 64-bit multiplicand come from? It's x sign-extended from 32 bits to 64. The sar instruction replicates x's sign bit into every bit of edx. I call this value, which consists only of copies of x's sign bit, x_high. x_low is the value of x actually passed into the routine.
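
As a rough illustration of what the sar step leaves in edx, here is a minimal C sketch. The helper name sign_word is hypothetical (it is not part of the original routine), and it uses a comparison instead of >> because right-shifting a negative signed value is implementation-defined in C.

```c
#include <stdint.h>

/* Hypothetical helper: the value that "sar edx, 31" leaves in edx.
   Every bit is a copy of x's sign bit, so the result is
   0x00000000 for x >= 0 and 0xFFFFFFFF for x < 0. */
static uint32_t sign_word(int32_t x)
{
    return x < 0 ? UINT32_C(0xFFFFFFFF) : UINT32_C(0);
}
```

Read as a signed number, the all-ones word is -1, which is why it works as the upper half of a sign-extended 64-bit x.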

y_low and y_high are the least and most significant parts of y, just like x's x_low and x_high are.

From here it's pretty easy:

product = y *{signed} x =

(y_high * 2³² + y_low) *{signed} (x_high * 2³² + x_low) =

y_high *{signed} x_high * 2⁶⁴ +

y_high *{signed} x_low * 2³² +

y_low *{signed} x_high * 2³² +

y_low *{signed} x_low

y_high *{signed} x_high * 2⁶⁴ isn't calculated because it doesn't contribute to the least significant 64 bits of the product. We'd calculate it if we were interested in the full 128-bit product (full 96-bit product for the picky).
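
The decomposition above can be sketched in C using only 32-bit pieces. This is an illustration under assumptions, not the original source: the names mul_lo32 and mul64_by_parts are made up here, and the final conversion back to int64_t relies on the usual two's complement behaviour of the compiler.

```c
#include <stdint.h>

/* Low 32 bits of a 32x32 product, like the two-operand imul. */
static uint32_t mul_lo32(uint32_t a, uint32_t b)
{
    return (uint32_t)((uint64_t)a * b);
}

/* Hypothetical sketch: multiply y by the sign-extended x and keep
   the low 64 bits, mirroring the assembly step by step. */
static int64_t mul64_by_parts(int32_t x, int64_t y)
{
    uint32_t x_low  = (uint32_t)x;
    uint32_t x_high = x < 0 ? UINT32_C(0xFFFFFFFF) : UINT32_C(0); /* sar edx, 31 */
    uint32_t y_low  = (uint32_t)((uint64_t)y & UINT32_C(0xFFFFFFFF));
    uint32_t y_high = (uint32_t)((uint64_t)y >> 32);

    /* Cross terms: only their low 32 bits can reach the result
       (imul, imul, add). The x_high * y_high term is dropped. */
    uint32_t cross = mul_lo32(y_high, x_low) + mul_lo32(x_high, y_low);

    /* Full 64-bit unsigned product of the low halves (mul esi). */
    uint64_t low = (uint64_t)x_low * y_low;

    /* Fold the cross terms into the high half (lea edx, [ecx + edx]). */
    uint32_t res_high = (uint32_t)(low >> 32) + cross;
    uint32_t res_low  = (uint32_t)low;

    return (int64_t)(((uint64_t)res_high << 32) | res_low);
}
```

For inputs whose product fits in 64 bits, mul64_by_parts(x, y) should agree with the compiler's own (long long)x * y.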

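y_low * x_low is calculated using unsigned multiplication (mul), which yields the full 64-bit product of the two low halves. That is correct because the low halves are plain unsigned bit patterns in this decomposition; the sign information lives only in the high halves. The remaining multiplications can use imul because only their low 32 bits are kept, and 2's complement signed multiplication gives the same least significant bits as unsigned multiplication. Example: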

-1 *{signed} -1 = 1

0xFFFFFFFFFFFFFFFF *{unsigned} 0xFFFFFFFFFFFFFFFF = 0xFFFFFFFFFFFFFFFE0000000000000001 (64 least significant bits are equivalent to 1)
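
The same effect can be checked at 32-bit width, where standard C has a wide enough type to hold the full product. The sketch below is only an illustration; the function names are made up for this example. It shows that the low halves agree between the signed and unsigned interpretations while the high halves do not.

```c
#include <stdint.h>

/* Low 32 bits of the product, operands treated as unsigned. */
static uint32_t low32_unsigned(uint32_t a, uint32_t b)
{
    return (uint32_t)((uint64_t)a * b);
}

/* Low 32 bits of the product, operands treated as signed. */
static uint32_t low32_signed(int32_t a, int32_t b)
{
    return (uint32_t)(uint64_t)((int64_t)a * b);
}

/* High 32 bits of the full 64-bit product, unsigned interpretation. */
static uint32_t high32_unsigned(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* High 32 bits of the full 64-bit product, signed interpretation. */
static uint32_t high32_signed(int32_t a, int32_t b)
{
    return (uint32_t)((uint64_t)((int64_t)a * b) >> 32);
}
```

With the bit pattern 0xFFFFFFFF for both operands, the low halves are both 1, but the high half is 0xFFFFFFFE when unsigned and 0 when signed.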
                                                                        
                                                        
            
            
              
                
悲哀的现实 (2020-12-29 17:56)

Consider the context of lines 8 and 9 (the second imul and the add that follows it).

By this point, ESI contains the lower half of y and EBX contains sgn(x), or more precisely the sign extension of x. So line 8 just computes sgn(x) * (y % 2^32) and stores the low 32 bits in EBX.

Line 9 draws on that result. By the time line 9 executes, ECX contains a partial upper half of the product, namely x * (y >> 32), signed. So EBX + ECX is what we computed in the last step plus the partial upper half we found on an earlier line.

The full algorithm itself is pretty neat ;)

EDIT: In response to a comment below...

Line 4: Consider what SAR EDX, 31 (or if you like, sar $31, %edx) really means. Since EDX is a 32-bit register, you'll end up with one of two values. Which two? Consider what they mean in the context of signed arithmetic.

Line 7: EDX by this point contains something pretty useful for the following operations. I'm just moving it where it needs to go.
                                                                        
                                                        
            
            
              
                
清歌不尽 (2020-12-29 18:02)

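The one-operand form of imul multiplies eax by its operand and saves the lower 32 bits in eax and the higher 32 bits in edx. The two-operand form used in this code instead multiplies the destination register by the source and keeps only the lower 32 bits of the product in the destination.

addl adds its two operands and stores the sum in the destination, which in AT&T syntax is the second operand, so in this case ecx. It also updates the flags. The l suffix stands for long, meaning a 32-bit operand.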
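
To make the instruction semantics concrete, here is a small C sketch of the one-operand and two-operand forms of imul and of a 32-bit add. The function names are invented for illustration, and flags are not modelled.

```c
#include <stdint.h>

/* One-operand form (imul r/m32): edx:eax = eax * operand,
   signed, full 64-bit result. */
static int64_t imul_one_operand(int32_t eax, int32_t operand)
{
    return (int64_t)eax * operand;
}

/* Two-operand form (imul r32, r/m32): the destination receives
   only the low 32 bits of destination * source. */
static uint32_t imul_two_operand(uint32_t dest, uint32_t src)
{
    return (uint32_t)((uint64_t)dest * src);
}

/* add / addl: destination = destination + source, modulo 2^32. */
static uint32_t add32(uint32_t dest, uint32_t src)
{
    return (uint32_t)((uint64_t)dest + src);
}
```

For example, 0x40000000 times 4 is 0x100000000 in the one-operand form, but the two-operand form keeps only the low 32 bits and yields 0.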
                                                                        
                                                        
            
            
              
                