How could I implement logical implication with bitwise or other efficient code in C?

前端未结

关注

 5  2569

I want to implement a logical operation that works as efficient as possible. I need this truth table:

p    q    p → q
T    T      T
T    F      F
F    T      T
F


                      
              相关标签:


      
      
        
          5条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  逝去的感伤        
                
              
                            
                2021-02-20 18:32
              
            
            
                                                                       
You can read up on deriving boolean expressions from truth Tables (also see canonical form), on how you can express any truth table as a combination of boolean primitives or functions.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  别跟我提以往        
                
              
                            
                2021-02-20 18:36
              
            
            
                                                                       
~p | q


For visualization:

perl -e'printf "%x\n", (~0x1100 | 0x1010) & 0x1111'
1011


In tight code, this should be faster than "!p || q" because the latter has a branch, which might cause a stall in the CPU due to a branch prediction error.  The bitwise version is deterministic and, as a bonus, can do 32 times as much work in a 32-bit integer than the boolean version!
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  孤城傲影        
                
              
                            
                2021-02-20 18:45
              
            
            
                                                                       
FYI, with gcc-4.3.3:

int foo(int a, int b) { return !a || b; }
int bar(int a, int b) { return ~a | b; }


Gives (from objdump -d):

0000000000000000 <foo>:
   0:   85 ff                   test   %edi,%edi
   2:   0f 94 c2                sete   %dl
   5:   85 f6                   test   %esi,%esi
   7:   0f 95 c0                setne  %al
   a:   09 d0                   or     %edx,%eax
   c:   83 e0 01                and    $0x1,%eax
   f:   c3                      retq   

0000000000000010 <bar>:
  10:   f7 d7                   not    %edi
  12:   09 fe                   or     %edi,%esi
  14:   89 f0                   mov    %esi,%eax
  16:   c3                      retq   


So, no branches, but twice as many instructions.

And even better, with _Bool (thanks @litb):

_Bool baz(_Bool a, _Bool b) { return !a || b; }


0000000000000020 <baz>:
  20:   40 84 ff                test   %dil,%dil
  23:   b8 01 00 00 00          mov    $0x1,%eax
  28:   0f 45 c6                cmovne %esi,%eax
  2b:   c3                      retq   


So, using _Bool instead of int is a good idea.

Since I'm updating today, I've confirmed gcc 8.2.0 produces similar, though not identical, results for _Bool:

0000000000000020 <baz>:
  20:   83 f7 01                xor    $0x1,%edi
  23:   89 f8                   mov    %edi,%eax
  25:   09 f0                   or     %esi,%eax
  27:   c3                      retq   

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  盖世英雄少女心        
                
              
                            
                2021-02-20 18:47
              
            
            
                                                                       
!p || q


is plenty fast.  seriously, don't worry about it.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  既然无缘        
                
              
                            
                2021-02-20 18:48
              
            
            
                                                                       
Another solution for C booleans (a bit dirty, but works):

((unsigned int)(p) <= (unsigned int)(q))

It works since by the C standard, 0 represents false, and any other value true (1 is returned for true by boolean operators, int type).

The "dirtiness" is that I use booleans (p and q) as integers, which contradicts some strong typing policies (such as MISRA), well, this is an optimization question. You may always #define it as a macro to hide the dirty stuff.

For proper boolean p and q (having either 0 or 1 binary representations) it works. Otherwise T->T might fail to produce T if p and q have arbitrary nonzero values for representing true.

If you need to store the result only, since the Pentium II, there is the cmovcc (Conditional Move) instruction (as shown in Derobert's answer). For booleans, however even the 386 had a branchless option, the setcc instruction, which produces 0 or 1 in a result byte location (byte register or memory). You can also see that in Derobert's answer, and this solution also compiles to a result involving a setcc (setbe: Set if below or equal).

Derobert and Chris Dolan's ~p | q variant should be the fastest for processing large quantities of data since it can process the implication on all bits of p and q individually.

Notice that not even the !p || q solution compiles to branching code on the x86: it uses setcc instructions. That's the best solution if p or q may contain arbitrary nonzero values representing true. If you use the _Bool type, it will generate very few instructions.

I got the following figures when compiling for the x86:

__attribute__((fastcall)) int imp1(int a, int b)
{
 return ((unsigned int)(a) <= (unsigned int)(b));
}

__attribute__((fastcall)) int imp2(int a, int b)
{
 return (!a || b);
}

__attribute__((fastcall)) _Bool imp3(_Bool a, _Bool b)
{
 return (!a || b);
}

__attribute__((fastcall)) int imp4(int a, int b)
{
 return (~a | b);
}


Assembly result:

00000000 <imp1>:
   0:   31 c0                   xor    %eax,%eax
   2:   39 d1                   cmp    %edx,%ecx
   4:   0f 96 c0                setbe  %al
   7:   c3                      ret    

00000010 <imp2>:
  10:   85 d2                   test   %edx,%edx
  12:   0f 95 c0                setne  %al
  15:   85 c9                   test   %ecx,%ecx
  17:   0f 94 c2                sete   %dl
  1a:   09 d0                   or     %edx,%eax
  1c:   0f b6 c0                movzbl %al,%eax
  1f:   c3                      ret    

00000020 <imp3>:
  20:   89 c8                   mov    %ecx,%eax
  22:   83 f0 01                xor    $0x1,%eax
  25:   09 d0                   or     %edx,%eax
  27:   c3                      ret    

00000030 <imp4>:
  30:   89 d0                   mov    %edx,%eax
  32:   f7 d1                   not    %ecx
  34:   09 c8                   or     %ecx,%eax
  36:   c3                      ret    


When using the _Bool type, the compiler clearly exploits that it only has two possible values (0 for false and 1 for true), producing a very similar result to the ~a | b solution (the only difference being that the latter performs a complement on all bits instead of just the lowest bit).

Compiling for 64 bits gives just about the same results.

Anyway, it is clear, the method doesn't really matter from the point of avoiding producing conditionals.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复