The following code calls the builtin functions for clz/ctz in GCC and, on other systems, has C versions. Obviously, the C versions are a bit suboptimal if the system has a
There are two intrinsics, "_BitScanForward" and "_BitScanReverse", which serve the same purpose for MSVC. Include <intrin.h>. The functions are:
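#ifdef _MSC_VER
#include <intrin.h>
#include <stdint.h>
/* Both results are undefined for x == 0, as with the GCC builtins. */
static uint32_t __inline ctz( uint32_t x )
{
    unsigned long r = 0;        /* the intrinsics write an unsigned long index */
    _BitScanForward(&r, x);     /* index of the lowest set bit == trailing zero count */
    return (uint32_t)r;
}
static uint32_t __inline clz( uint32_t x )
{
    unsigned long r = 0;
    _BitScanReverse(&r, x);     /* index of the highest set bit */
    return 31 - (uint32_t)r;    /* convert the bit index to a leading zero count */
}
#endif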
There are equivalent 64-bit versions, "_BitScanForward64" and "_BitScanReverse64"; they are only available when targeting 64-bit platforms.
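For illustration, here is a minimal sketch of 64-bit wrappers built on those intrinsics. The names ctz64_sketch and clz64_sketch are made up for this example, returning 64 for a zero input is an assumption (the result is otherwise undefined), and the non-MSVC branches are only there so the snippet compiles elsewhere.

```c
#include <stdint.h>

#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_ARM64))
#include <intrin.h>

/* MSVC, 64-bit targets: the intrinsics return 0 when no bit is set. */
static __inline uint32_t ctz64_sketch(uint64_t x)
{
    unsigned long index = 0;
    if (_BitScanForward64(&index, x))
        return (uint32_t)index;          /* lowest set bit index == trailing zeros */
    return 64;                           /* assumed convention for x == 0 */
}

static __inline uint32_t clz64_sketch(uint64_t x)
{
    unsigned long index = 0;
    if (_BitScanReverse64(&index, x))
        return 63u - (uint32_t)index;    /* highest set bit index -> leading zeros */
    return 64;                           /* assumed convention for x == 0 */
}

#elif defined(__GNUC__)

/* GCC/Clang: the builtins are undefined for 0, so guard them. */
static inline uint32_t ctz64_sketch(uint64_t x)
{
    return x ? (uint32_t)__builtin_ctzll(x) : 64;
}

static inline uint32_t clz64_sketch(uint64_t x)
{
    return x ? (uint32_t)__builtin_clzll(x) : 64;
}

#else

/* Plain C fallback: shift one bit at a time. */
static uint32_t ctz64_sketch(uint64_t x)
{
    uint32_t n = 0;
    if (x == 0) return 64;
    while ((x & 1u) == 0) { x >>= 1; ++n; }
    return n;
}

static uint32_t clz64_sketch(uint64_t x)
{
    uint32_t n = 0;
    if (x == 0) return 64;
    while ((x >> 63) == 0) { x <<= 1; ++n; }
    return n;
}

#endif
```

On a 32-bit MSVC target the first branch is skipped, so this sketch falls through to the plain C loop there.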
Read more here:
x86 Intrinsics on MSDN
I found it on a Korean website: https://torbjorn.tistory.com/317
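With the MSVC compiler, you can use __lzcnt(unsigned int) in place of GCC's __builtin_clz(unsigned int). Note that __lzcnt compiles to the LZCNT instruction, so it only gives correct results on CPUs that support it.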
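A minimal sketch of that substitution, assuming an x86/x64 target whose CPU supports LZCNT. The wrapper name clz32_lzcnt is hypothetical. One difference worth keeping in mind: LZCNT returns 32 for a zero input, while __builtin_clz(0) is undefined, so the GCC branch is guarded here to match.

```c
#include <stdint.h>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>

/* MSVC on x86/x64: maps to the LZCNT instruction (CPU support assumed). */
static __inline uint32_t clz32_lzcnt(uint32_t x)
{
    return (uint32_t)__lzcnt(x);
}

#elif defined(__GNUC__)

/* GCC/Clang: guard the zero case so both branches agree on 32. */
static inline uint32_t clz32_lzcnt(uint32_t x)
{
    return x ? (uint32_t)__builtin_clz(x) : 32u;
}

#else
#error "this sketch only covers MSVC (x86/x64) and GCC-compatible compilers"
#endif
```

If the code may run on older CPUs, the _BitScanReverse approach is the safer choice, since it does not depend on LZCNT being present.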
If MSVC has a compiler intrinsic for this, it'll be here:
Compiler Intrinsics on MSDN
Otherwise, you'll have to write it using __asm
Building on sh0dan's code, the implementation should be corrected like this:
#ifdef _MSC_VER
#include <intrin.h>
#include <stdint.h>
uint32_t __inline ctz( uint32_t value )
{
    // unsigned long is the type the intrinsic expects (DWORD would need <windows.h>)
    unsigned long trailing_zero = 0;
    if ( _BitScanForward( &trailing_zero, value ) )
    {
        return trailing_zero;
    }
    else
    {
        // ctz(0) is undefined; 32 is a safer choice than 0
        return 32;
    }
}
uint32_t __inline clz( uint32_t value )
{
    unsigned long leading_zero = 0;
    if ( _BitScanReverse( &leading_zero, value ) )
    {
        // _BitScanReverse gives the index of the highest set bit
        return 31 - leading_zero;
    }
    else
    {
        // Same remark as above
        return 32;
    }
}
#endif
As commented in the code, both ctz and clz are undefined if value is 0. In our abstraction, we fixed __builtin_clz(value) as (value?__builtin_clz(value):32), but that's a choice.
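The same "return 32 for zero" convention can be kept in a plain C fallback for compilers that have neither the GCC builtins nor the MSVC intrinsics. This is only a sketch: the names portable_clz32 and portable_ctz32 are made up, and the binary search over halves is one of several ways to write it.

```c
#include <stdint.h>

/* Leading zeros by binary search: test the top 16, 8, 4, 2, 1 bits in turn. */
static uint32_t portable_clz32(uint32_t value)
{
    uint32_t n = 0;
    if (value == 0) return 32;          /* chosen convention, not a rule */
    if ((value & 0xFFFF0000u) == 0) { n += 16; value <<= 16; }
    if ((value & 0xFF000000u) == 0) { n += 8;  value <<= 8;  }
    if ((value & 0xF0000000u) == 0) { n += 4;  value <<= 4;  }
    if ((value & 0xC0000000u) == 0) { n += 2;  value <<= 2;  }
    if ((value & 0x80000000u) == 0) { n += 1; }
    return n;
}

/* Trailing zeros: the mirror image, testing the low bits instead. */
static uint32_t portable_ctz32(uint32_t value)
{
    uint32_t n = 0;
    if (value == 0) return 32;          /* same convention as above */
    if ((value & 0x0000FFFFu) == 0) { n += 16; value >>= 16; }
    if ((value & 0x000000FFu) == 0) { n += 8;  value >>= 8;  }
    if ((value & 0x0000000Fu) == 0) { n += 4;  value >>= 4;  }
    if ((value & 0x00000003u) == 0) { n += 2;  value >>= 2;  }
    if ((value & 0x00000001u) == 0) { n += 1; }
    return n;
}
```

Whatever value you pick for zero, it helps to pick the same one in every branch of the abstraction so callers see identical behavior on all compilers.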
Tested on Linux and Windows (x86):
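#include <stdint.h>
#ifdef _MSC_VER   /* the shim is MSVC-specific; GCC and Clang already provide __builtin_clz */
#include <intrin.h>
/* Undefined for x == 0, like the GCC builtin it stands in for. */
static uint32_t __inline __builtin_clz(uint32_t x) {
    unsigned long r = 0;
    _BitScanReverse(&r, x);
    return (31-r);
}
#endif
uint32_t clz64(const uint64_t x)
{
    uint32_t u32 = (x >> 32);                         /* upper half first */
    uint32_t result = u32 ? __builtin_clz(u32) : 32;
    if (result == 32) {                               /* upper half is all zeros */
        u32 = x & 0xFFFFFFFFUL;
        result += (u32 ? __builtin_clz(u32) : 32);    /* yields 64 for x == 0 */
    }
    return result;
}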
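The same split into 32-bit halves works for counting trailing zeros, which is handy on 32-bit targets where the 64-bit intrinsics are not available. A sketch, not tested on Windows: ctz32_nonzero is a hypothetical helper, and returning 64 for x == 0 follows the convention clz64 uses above.

```c
#include <stdint.h>

#if defined(_MSC_VER)
#include <intrin.h>
/* Hypothetical helper; the caller must guarantee x != 0. */
static __inline uint32_t ctz32_nonzero(uint32_t x)
{
    unsigned long r = 0;
    _BitScanForward(&r, x);     /* index of the lowest set bit */
    return (uint32_t)r;
}
#else
/* GCC/Clang: __builtin_ctz is undefined for 0, hence the same precondition. */
static inline uint32_t ctz32_nonzero(uint32_t x)
{
    return (uint32_t)__builtin_ctz(x);
}
#endif

uint32_t ctz64(const uint64_t x)
{
    uint32_t lo = (uint32_t)(x & 0xFFFFFFFFu);   /* lower half first */
    uint32_t hi = (uint32_t)(x >> 32);
    if (lo)
        return ctz32_nonzero(lo);
    /* Lower half is all zeros: count its 32 bits, then look at the upper half. */
    return 32 + (hi ? ctz32_nonzero(hi) : 32);
}
```

Note the order is reversed compared with clz64: trailing zeros start from the low half, leading zeros from the high half.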