What is the fastest way to return the positions of all set bits in a 64-bit integer?

北荒 2020-12-13 04:05

I need a fast way to get the position of all one bits in a 64-bit integer. For example, given x = 123703, I'd like to fill an array idx[] = {0, 1, 2, 4,

10 answers
  •  眼角桃花
    2020-12-13 04:44

    As a minimal modification:

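    uint64_t x;                 // input; unsigned so that >>= shifts in zeros
    char idx[K+1];              // output buffer for the bit positions
    char *dst = idx;
    const int BITS = 8;
    *dst = '\0';                // strcat needs a terminated destination
    for (int i = 0; i < 64; i += BITS) {
      int y = (x & ((1 << BITS) - 1));  // low BITS bits index the table
      strcat(dst, tab[y]);              // append 1-based positions of this chunk
      for (; *dst; ++dst)
        *dst += (i - 1);                // convert to absolute 0-based positions
      x >>= BITS;
    }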
    

    The choice of BITS determines the size of the table: 2^BITS entries. 8, 13 and 16 are logical choices. Each entry is a zero-terminated string containing the bit positions, counted from 1 so that position 0 cannot be confused with the terminator. For example, tab[5] is "\x03\x01". The inner loop removes this offset.

    Slightly more efficient: replace the strcat and inner loop by

    char const* ptr = tab[y];
    while (*ptr)
    {
       *dst++ = *ptr++ + (i-1);
    }
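
    To make the table layout concrete, here is a minimal sketch for BITS = 8 that builds the table at run time and uses the pointer loop above. The names build_tab and bit_positions, and the decision to fill the table at startup instead of hard-coding it, are my own assumptions and not part of the original answer. Because each entry lists positions highest first (as in tab[5] being "\x03\x01"), the output is descending within each 8-bit chunk; storing the entries lowest first would give fully ascending output.

```c
#include <stdint.h>
#include <stddef.h>

enum { BITS = 8, ENTRIES = 1 << BITS };

/* tab[y] holds the 1-based positions of the set bits of y, highest first,
   followed by a 0 terminator (so tab[5] is "\x03\x01"). */
static char tab[ENTRIES][BITS + 1];

/* Hypothetical helper: fills tab once, before the first lookup. */
static void build_tab(void) {
    for (int y = 0; y < ENTRIES; ++y) {
        char *p = tab[y];
        for (int b = BITS; b >= 1; --b)
            if (y & (1 << (b - 1)))
                *p++ = (char)b;
        *p = '\0';
    }
}

/* Writes the 0-based positions of the set bits of x into idx and returns
   how many were written. idx must have room for 64 entries. */
static size_t bit_positions(uint64_t x, unsigned char *idx) {
    unsigned char *dst = idx;
    for (int i = 0; i < 64; i += BITS) {
        const char *ptr = tab[x & (ENTRIES - 1)];
        while (*ptr)
            *dst++ = (unsigned char)(*ptr++ + (i - 1)); /* undo the 1-based offset */
        x >>= BITS;
    }
    return (size_t)(dst - idx);
}
```

    Returning the count instead of zero-terminating idx avoids the ambiguity with bit position 0, which is itself a zero byte.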
    

    Loop unrolling can be a bit of a pain if the loop contains branches, because copying those branch statements doesn't help the branch predictor. I'll happily leave that decision to the compiler.

    One thing I'm considering is that tab is an array of pointers to strings, and these strings are highly similar: "\x01" is a suffix of "\x03\x01". In fact, each string which doesn't start with "\x08" is a suffix of a string which does. I'm wondering how many unique strings you need, and to what degree tab[y] is in fact needed. E.g. by the logic above, for x < 128, tab[128+x] == tab[x]-1.

    [edit]

    Never mind, you definitely need the 128 strings starting with "\x08", since they're never the suffix of another string. Still, the tab[128+x] == tab[x]-1 rule means that you can save half the pointer entries, at the cost of two extra instructions: char const* ptr = tab[x & 0x7F] - ((x>>7) & 1), where x is the 8-bit table index. (Set up tab[] to point just after the "\x08".)
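
    A rough sketch of that half-size table, assuming BITS = 8. The names store, half_tab, build_half_tab, lookup and bit_positions_half are hypothetical, and building the strings at run time is my own simplification. Each stored string starts with "\x08"; half_tab[x] points one byte past it, so stepping the pointer back by one yields the entry for 128+x.

```c
#include <stdint.h>
#include <stddef.h>

/* store[x] is "\x08" followed by the 1-based positions of the set bits of x
   (x < 128), highest first, then a 0 terminator. */
static char store[128][9];
static const char *half_tab[128];   /* each points just past the leading \x08 */

/* Hypothetical helper: fills store and half_tab once. */
static void build_half_tab(void) {
    for (int x = 0; x < 128; ++x) {
        char *p = store[x];
        *p++ = 8;
        for (int b = 7; b >= 1; --b)
            if (x & (1 << (b - 1)))
                *p++ = (char)b;
        *p = '\0';
        half_tab[x] = store[x] + 1;
    }
}

/* y is an 8-bit chunk; if its top bit is set, back up one byte to include \x08. */
static const char *lookup(unsigned y) {
    return half_tab[y & 0x7F] - ((y >> 7) & 1);
}

/* Same extraction loop as before, using the 128-entry table. */
static size_t bit_positions_half(uint64_t x, unsigned char *idx) {
    unsigned char *dst = idx;
    for (int i = 0; i < 64; i += 8) {
        const char *ptr = lookup((unsigned)(x & 0xFF));
        while (*ptr)
            *dst++ = (unsigned char)(*ptr++ + (i - 1));
        x >>= 8;
    }
    return (size_t)(dst - idx);
}
```

    Whether the smaller pointer table pays for the extra mask and subtract would need measuring on the target machine.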
