LLVM has a phi instruction with a rather odd explanation:
The 'phi' instruction is used to implement the φ node in the SSA graph representing the function.
A phi node is an instruction that selects a value depending on which predecessor block control arrived from. (In the LLVM class hierarchy it also inherits from Value, so the phi itself can be used as an operand like any other value.)
Phi nodes are necessary because LLVM IR is in SSA (static single assignment) form. For example, the following C++ function
void m(bool r, bool y){
bool l = y || r ;
}
gets translated into the following IR (created with clang -c -emit-llvm file.c -o out.bc and then viewed with llvm-dis):
define void @_Z1mbb(i1 zeroext %r, i1 zeroext %y) nounwind {
entry:
  %r.addr = alloca i8, align 1
  %y.addr = alloca i8, align 1
  %l = alloca i8, align 1
  %frombool = zext i1 %r to i8
  store i8 %frombool, i8* %r.addr, align 1
  %frombool1 = zext i1 %y to i8
  store i8 %frombool1, i8* %y.addr, align 1
  %0 = load i8* %y.addr, align 1
  %tobool = trunc i8 %0 to i1
  br i1 %tobool, label %lor.end, label %lor.rhs
lor.rhs: ; preds = %entry
  %1 = load i8* %r.addr, align 1
  %tobool2 = trunc i8 %1 to i1
  br label %lor.end
lor.end: ; preds = %lor.rhs, %entry
  %2 = phi i1 [ true, %entry ], [ %tobool2, %lor.rhs ]
  %frombool3 = zext i1 %2 to i8
  store i8 %frombool3, i8* %l, align 1
  ret void
}
So what happens here?
Unlike the C++ code, where the variable bool l can be assigned more than once, in LLVM IR every value has to be defined exactly once. So we check whether %tobool is true, and jump to lor.end if it is or to lor.rhs if it is not.
In lor.end we finally have the value of the || operator. If we arrived from the entry block, it is just true. Otherwise, it is equal to the value of %tobool2, and that is exactly what the following IR line expresses:
%2 = phi i1 [ true, %entry ], [ %tobool2, %lor.rhs ]
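To make the selection rule concrete, here is a minimal C++ sketch that mimics the control flow of the IR above. It is only an illustration, not how LLVM implements phi nodes: the names Block, phi and simulate_m are hypothetical and exist only for this example.

```cpp
#include <cassert>

// Hypothetical labels for the basic blocks of @_Z1mbb (illustration only).
enum class Block { Entry, LorRhs, LorEnd };

// Models: %2 = phi i1 [ true, %entry ], [ %tobool2, %lor.rhs ]
// The result depends only on which predecessor block we arrived from.
bool phi(Block pred, bool tobool2) {
    switch (pred) {
        case Block::Entry:  return true;     // [ true, %entry ]
        case Block::LorRhs: return tobool2;  // [ %tobool2, %lor.rhs ]
        default: break;                      // lor.end has no other predecessors
    }
    assert(false && "unexpected predecessor");
    return false;
}

// Follows the same branches as the IR and returns the value stored into l.
bool simulate_m(bool r, bool y) {
    Block pred = Block::Entry;  // we start in the entry block
    bool tobool2 = false;       // only meaningful if lor.rhs actually runs
    if (!y) {                   // br i1 %tobool, label %lor.end, label %lor.rhs
        tobool2 = r;            // lor.rhs: load r and truncate it to i1
        pred = Block::LorRhs;   // br label %lor.end
    }
    return phi(pred, tobool2);  // lor.end: pick the value by predecessor
}
```

Calling simulate_m(r, y) gives the same result as y || r for all four input combinations, which is what the phi node encodes: one definition of %2 whose value is chosen by the edge taken into lor.end.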