Why is uint_least16_t faster than uint_fast16_t for multiplication in x86_64?
Question

The C standard is quite unclear about the uint_fast*_t family of types. On a gcc-4.4.4 Linux x86_64 system, the types uint_fast16_t and uint_fast32_t are both 8 bytes in size. However, multiplication of 8-byte numbers seems to be noticeably slower than multiplication of 4-byte numbers. The following piece of code demonstrates that:

    #include <stdio.h>
    #include <stdint.h>
    #include <inttypes.h>

    int main ()
    {
        uint_least16_t p, x;
        int count;

        p = 1;
        for (count = 100000; count != 0; --count)
            for (x = 1;