I have many truth-tables of many variables (7 or more) and I use a tool (eg logic friday 1) to simplify the logic formula. I could do that by hand but that is much too error pro
Outside of just leaving it to the compiler, or the hand-wavy suggestions in the 2nd section of my answer, see HJLebbink's self-answer using FPGA logic-optimization tools. (This answer ended up with the bounty because it failed to attract such an answer from anyone else; it's not really bounty-worthy. :/ I wrote it before there was a bounty, but don't have anything else useful to add.)
ICC18 optimizes chained _mm512_and/or/xor_epi32
intrinsics into vpternlogd
instructions, but gcc/clang don't.
On Godbolt for this and a more complicated function using some inputs multiple times:
#include
__m512i logic(__m512i a, __m512i b, __m512i c,
__m512i d, __m512i e, __m512i f, __m512i g) {
// return _mm512_and_epi32(_mm512_and_epi32(a, b), c);
return a & b & c & d & e & f;
}
gcc -O3 -march=skylake-avx512
nightly build
logic:
vpandq zmm4, zmm4, zmm5
vpandq zmm3, zmm2, zmm3
vpandq zmm4, zmm4, zmm3
vpandq zmm0, zmm0, zmm1
vpandq zmm0, zmm4, zmm0
ret
ICC18 -O3 -march=skylake-avx512
logic:
vpternlogd zmm2, zmm0, zmm1, 128 #6.21
vpternlogd zmm4, zmm2, zmm3, 128 #6.29
vpandd zmm0, zmm4, zmm5 #6.33
ret #6.33
IDK how good it is at picking optimal solutions when each variable is used more than once in different subexpressions.
To see if it does a good job, you have to do the optimization yourself. You want to find sets of 3 variables that can be combined together into a single boolean value without still needing those 3 variables anywhere else in the expression.
I think it's possible for a truth table with more than 3 inputs to not simplify down this way, to a smaller truth table where one of the columns is the result of a ternary combination of 3 of the inputs. e.g. I think it's not guaranteed that it's possible to simplify a 4 input function to vpternlog + AND, OR, or XOR.
I'd definitely worry that compilers might pick 3 inputs to combine that didn't result in as much simplification as a different choice of 3.
It might even be optimal for a compiler to start with a binary operation or two on a couple pairs to set up for a ternary operation, especially if that enables better ILP.
You could probably write a brute-force truth-table optimizer that looked for triplets of variables that could be combined to make a smaller table for just the ternary result and the rest of the table. But I'm not sure a greedy approach is guaranteed to give the best results. If there are multiple ways to combine with the same total instruction count, they're probably not all equivalent for ILP (Instruction Level Parallelism).