Why is memcmp(a, b, 4) only sometimes optimized to a uint32 comparison?
Question

Given this code:

    #include <string.h>

    int equal4(const char* a, const char* b)
    {
        return memcmp(a, b, 4) == 0;
    }

    int less4(const char* a, const char* b)
    {
        return memcmp(a, b, 4) < 0;
    }

GCC 7 on x86_64 introduced an optimization for the first case (Clang has done it for a long time):

    mov     eax, DWORD PTR [rsi]
    cmp     DWORD PTR [rdi], eax
    sete    al
    movzx   eax, al

But the second case still calls memcmp():

    sub     rsp, 8
    mov     edx, 4
    call    memcmp
    add     rsp, 8
    shr     eax, 31

Could a similar optimization be applied to