Why is a ternary operator with two constants faster than one with a variable?

清酒与你 2020-12-31 10:56
In Java, I have two different statements that produce the same result using ternary operators:
num < 0 ? 0 : num;
num * (num < 0 ? 0 : 1);

      
      
        
Answer by 梦谈多话 (original poster), 2020-12-31 11:07
              

            
            
                        
First, let's rewrite the benchmark with JMH to avoid common benchmarking pitfalls.
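import java.util.concurrent.ThreadLocalRandom;
import org.openjdk.jmh.annotations.Benchmark;

public class FloatCompare {

    // Ternary whose second result operand is the variable itself.
    @Benchmark
    public float cmp() {
        // Uniform in [-1, 1): about half of the inputs are negative.
        float num = ThreadLocalRandom.current().nextFloat() * 2 - 1;
        return num < 0 ? 0 : num;
    }

    // Ternary over two integer constants, applied as a multiplier.
    @Benchmark
    public float mul() {
        float num = ThreadLocalRandom.current().nextFloat() * 2 - 1;
        return num * (num < 0 ? 0 : 1);
    }
}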

JMH also shows that the multiplication code is much faster:
Benchmark         Mode  Cnt   Score   Error  Units
FloatCompare.cmp  avgt    5  12,940 ± 0,166  ns/op
FloatCompare.mul  avgt    5   6,182 ± 0,101  ns/op

Now it's time to use the perfasm profiler (built into JMH) to see the assembly produced by the JIT compiler. Here are the most important parts of the output (comments are mine):
cmp method:
  5,65%  │││  0x0000000002e717d0: vxorps  xmm1,xmm1,xmm1  ; xmm1 := 0
  0,28%  │││  0x0000000002e717d4: vucomiss xmm1,xmm0      ; compare num < 0 ?
  4,25%  │╰│  0x0000000002e717d8: jbe     2e71720h        ; jump if num >= 0
  9,77%  │ ╰  0x0000000002e717de: jmp     2e71711h        ; jump if num < 0

mul method:
  1,59%  ││  0x000000000321f90c: vxorps  xmm1,xmm1,xmm1    ; xmm1 := 0
  3,80%  ││  0x000000000321f910: mov     r11d,1h           ; r11d := 1
         ││  0x000000000321f916: xor     r8d,r8d           ; r8d := 0
         ││  0x000000000321f919: vucomiss xmm1,xmm0        ; compare num < 0 ?
  2,23%  ││  0x000000000321f91d: cmovnbe r11d,r8d          ; r11d := r8d if num < 0
  5,06%  ││  0x000000000321f921: vcvtsi2ss xmm1,xmm1,r11d  ; xmm1 := (float) r11d
  7,04%  ││  0x000000000321f926: vmulss  xmm0,xmm1,xmm0    ; multiply

The key difference is that there are no jump instructions in the mul method. Instead, the conditional move instruction cmovnbe is used.
cmov works with integer registers. Since the expression (num < 0 ? 0 : 1) has integer constants as both of its result operands, the JIT is smart enough to emit a conditional move instead of a conditional jump.
In this benchmark, the conditional jump is very inefficient, because branch prediction often fails when the sign of the input is random. That's why the branchless code of the mul method appears faster.
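To make the two forms concrete, here is a minimal sketch in plain Java. The class name ClampForms and the helper mulExpanded are made up for illustration and are not part of the benchmark. It writes both expressions as static methods and spells out the integer select, conversion and multiply in the order they appear in the mul listing. Whether a given JVM actually emits cmov for this code depends on the JIT and the CPU; the sketch only shows that the forms agree in value.

```java
final class ClampForms {

    // Branchy form: the second result operand is the float variable itself.
    static float cmp(float num) {
        return num < 0 ? 0 : num;
    }

    // Form used in the mul benchmark: the ternary selects between two int constants.
    static float mul(float num) {
        return num * (num < 0 ? 0 : 1);
    }

    // The same computation, split into the steps seen in the mul assembly listing.
    static float mulExpanded(float num) {
        int flag = num < 0 ? 0 : 1; // integer select (cmovnbe in the listing)
        float factor = flag;        // int-to-float conversion (vcvtsi2ss)
        return num * factor;        // multiply (vmulss)
    }

    public static void main(String[] args) {
        float[] samples = {-1.0f, -0.25f, 0.0f, 0.25f, 1.0f};
        for (float num : samples) {
            System.out.println(num + " -> cmp=" + cmp(num)
                    + " mul=" + mul(num)
                    + " mulExpanded=" + mulExpanded(num));
        }
    }
}
```

One caveat: for negative inputs mul returns -0.0f rather than 0.0f. The two values compare equal with ==, but Float.compare treats them as different, so the forms are not bit-for-bit interchangeable.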
If we modify the benchmark so that one branch prevails over the other, e.g. by replacing
ThreadLocalRandom.current().nextFloat() * 2 - 1

with
ThreadLocalRandom.current().nextFloat() * 2 - 0.1f

then branch prediction will work better, and the cmp method will become as fast as mul:
Benchmark         Mode  Cnt  Score   Error  Units
FloatCompare.cmp  avgt    5  5,793 ± 0,045  ns/op
FloatCompare.mul  avgt    5  5,764 ± 0,048  ns/op
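
The effect of that change on the input mix can be checked without JMH. The sketch below estimates how often num < 0 holds for each formula. BranchMix and negativeFraction are illustrative names, and a seeded java.util.Random stands in for ThreadLocalRandom so that runs are repeatable. With * 2 - 1 the values are spread over [-1, 1), so roughly half of them are negative; with * 2 - 0.1f they fall in [-0.1, 1.9), so only about 5% are, and the branch becomes easy to predict.

```java
import java.util.Random;

final class BranchMix {

    // Fraction of samples for which the "num < 0" branch would be taken,
    // where num = nextFloat() * 2 - offset.
    static double negativeFraction(float offset, int samples, long seed) {
        Random random = new Random(seed); // assumption: seeded stand-in for ThreadLocalRandom
        int negatives = 0;
        for (int i = 0; i < samples; i++) {
            float num = random.nextFloat() * 2 - offset;
            if (num < 0) {
                negatives++;
            }
        }
        return (double) negatives / samples;
    }

    public static void main(String[] args) {
        int samples = 1_000_000;
        System.out.println("* 2 - 1    : " + negativeFraction(1.0f, samples, 42L));
        System.out.println("* 2 - 0.1f : " + negativeFraction(0.1f, samples, 42L));
    }
}
```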

    
             
                                                        
            
            
              
                