efficiently find the first element matching a bit mask

小蘑菇 2020-12-15 07:57

I have a list of N 64-bit integers whose bits represent small sets. Each integer has at most k bits set to 1. Given a bit mask, I would lik

4 answers
  •  [愿得一人]
    2020-12-15 08:39

    A suffix tree on bits (in effect a binary trie keyed on the bit positions) will do the trick, with each element's original priority stored at the leaf nodes:

    000000 -> 8
         1 -> 5
        10 -> 4
       100 -> 3
      1000 -> 2
        10 -> 1
       100 -> 0
     10000 -> 6
    100000 -> 7
    

    where if the bit is set in the mask, you search both arms, and if not, you search only the 0 arm; your answer is the minimum number you encounter at a leaf node.
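
    The search rule above can be sketched in Python roughly as follows. This is an illustration only, not the answerer's code: the names `build_trie` and `first_match`, the nested-dict node layout, and the 6-bit width are assumptions chosen for the toy example, and bit 0 is taken to be the leftmost bit, as in the diagrams. The element list is read off the tree above; the masks in the usage lines are made up.

```python
WIDTH = 6  # toy width matching the diagram; the real list would use 64

# List position = priority; values read off the example tree (index 0 first).
elements = [0b001100, 0b001010, 0b001000, 0b000100, 0b000010,
            0b000001, 0b010000, 0b100000, 0b000000]


def build_trie(elements, width):
    """Insert each element bit by bit, leftmost bit first.

    Inner nodes are dicts keyed by 0 or 1; a leaf stores the lowest
    list index of the element that ends there under the key "index".
    """
    root = {}
    for index, value in enumerate(elements):
        node = root
        for pos in range(width):
            bit = (value >> (width - 1 - pos)) & 1
            node = node.setdefault(bit, {})
        node.setdefault("index", index)  # duplicates keep the first index
    return root


def first_match(node, mask, width, pos=0):
    """Return the smallest index whose set bits all lie inside mask, or None."""
    if pos == width:
        return node["index"]
    mask_bit = (mask >> (width - 1 - pos)) & 1
    arms = (0, 1) if mask_bit else (0,)  # mask bit clear: only the 0 arm
    best = None
    for bit in arms:
        child = node.get(bit)
        if child is not None:
            found = first_match(child, mask, width, pos + 1)
            if found is not None and (best is None or found < best):
                best = found
    return best


trie = build_trie(elements, WIDTH)
result = first_match(trie, 0b001010, WIDTH)  # hypothetical mask
```

    With the hypothetical mask 001010 the search reaches the leaves for 000000, 000010, 001000 and 001010, so `result` is the smallest of their indices, 1.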

    You can improve this (marginally) by traversing the bits not in order but by maximum discriminability; in your example, note that 3 elements have bit 2 set, so you would create

    2:0 0:0 1:0 3:0 4:0 5:0 -> 8
                        5:1 -> 5
                    4:1 5:0 -> 4
                3:1 4:0 5:0 -> 3
            1:1 3:0 4:0 5:0 -> 6
        0:1 1:0 3:0 4:0 5:0 -> 7
    2:1 0:0 1:0 3:0 4:0 5:0 -> 2
                    4:1 5:0 -> 1
                3:1 4:0 5:0 -> 0
    

    In your example mask this doesn't help (you have to traverse both the bit2==0 and bit2==1 sides because your mask has bit 2 set), but on average it will improve the results, at the cost of extra setup and a more complex data structure. If some bits are much more likely to be set than others, this could be a huge win. If they're pretty close to random within the element list, then this doesn't help at all.
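
    One way to pick such a bit order automatically is sketched below. The heuristic is an assumption of this sketch, not something the answer specifies: positions are ranked by how evenly they split the list (set-bit count closest to N/2), and the trie is then built and searched in that order. The names `bit_order`, `build_ordered_trie` and `first_match_ordered` are hypothetical. On the example data this ranks bit 2 first, but it also reorders the remaining bits, so the resulting tree differs from the hand-drawn one above.

```python
WIDTH = 6  # toy width; bit 0 is the leftmost bit, as in the diagrams

elements = [0b001100, 0b001010, 0b001000, 0b000100, 0b000010,
            0b000001, 0b010000, 0b100000, 0b000000]


def bit_order(elements, width):
    """Rank bit positions so the most evenly splitting bit comes first."""
    half = len(elements) / 2
    counts = [sum((v >> (width - 1 - pos)) & 1 for v in elements)
              for pos in range(width)]
    # sorted() is stable, so ties keep their original left-to-right order.
    return sorted(range(width), key=lambda pos: abs(counts[pos] - half))


def build_ordered_trie(elements, width, order):
    """Same trie as before, but levels follow the given bit order."""
    root = {}
    for index, value in enumerate(elements):
        node = root
        for pos in order:
            node = node.setdefault((value >> (width - 1 - pos)) & 1, {})
        node.setdefault("index", index)  # duplicates keep the first index
    return root


def first_match_ordered(node, mask, width, order, depth=0):
    """Smallest index whose set bits all lie inside mask, or None."""
    if depth == len(order):
        return node["index"]
    pos = order[depth]
    arms = (0, 1) if (mask >> (width - 1 - pos)) & 1 else (0,)
    hits = []
    for bit in arms:
        child = node.get(bit)
        if child is not None:
            found = first_match_ordered(child, mask, width, order, depth + 1)
            if found is not None:
                hits.append(found)
    return min(hits, default=None)


order = bit_order(elements, WIDTH)
ordered_trie = build_ordered_trie(elements, WIDTH, order)
```

    The answers are the same as with the in-order trie; only the number of nodes visited changes, and only when the mask has the early, highly discriminating bits clear.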

    If you're stuck with essentially random bits set, the suffix tree approach should cut the work to about (1-5/64)^32, or roughly 7%, of the original on average (a 13x speedup), which might outweigh the extra cost of the more complex operations (but don't count on it--bit masks are fast). If you have a nonrandom distribution of bits in your list, then you could do almost arbitrarily well.
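
    The arithmetic behind that estimate can be checked in a couple of lines. The variable names are hypothetical, and reading the 5 as the number of bits set per element and the 32 as the number of zero bits in the mask is an interpretation of the formula, not something stated above.

```python
word_size = 64
bits_per_element = 5   # assumed meaning of the "5" in the formula
zero_mask_bits = 32    # assumed meaning of the exponent

# Chance that a given bit is clear in an element, raised to the number of
# positions where the mask forces a clear bit.
fraction = (1 - bits_per_element / word_size) ** zero_mask_bits
speedup = 1 / fraction

print(f"fraction = {fraction:.3f}, speedup = {speedup:.1f}x")
```

    The fraction comes out a little above 0.07, which is where the roughly 13x figure comes from.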
