问题
I am implementing a VM compiler, and naturally, I've come to the point of implementing switches. Also naturally, for short switches, a sequential lookup array would be optimal but what about bigger switches?
So far I've come up with a data structure that gives me a pretty good lookup time. I don't know the name of that structure, but it is similar to a binary tree but monolith, with the difference it only applies to a static set of integers, cannot add or remove. It looks like a table, where value increases to the top and the right, here is an example:
Integers -89, -82, -72, -68, -65, -48, -5, 0, 1, 3, 7, 18, 27, 29, 32, 37, 38, 42, 45, 54, 76, 78, 87, 89, 92
and the table:
-65 3 32 54 92
-68 1 29 45 89
-82 -5 18 38 78
-89 -48 7 37 76
Which gives me the worst possible case width + height
iterations. Let's say the case is 37, -65 is less than 37, so move to the right, same for 3 move to the right, same for 32 move to the right, 54 is bigger so move down (step a width since it is a sequential array anyway), 45 is bigger so move down, 38 is bigger so move down and there we have 37 in 7 hops.
Is there any possible faster lookup algorithm?
Also, is there a name for this kind of arrangement? I came up with it on my own, but most likely someone else did that before me, so it is most probably named already.
EDIT: OK, as far as I got it, a "perfect hash" will offer me better THEORETICAL performance. But how will this play out in real life? If I understand correctly a two level "perfect hash" will be rather spread out instead of a continuous block of memory, so while the theoretical complexity is lower, there is a potential penalty of tens if not hundreds of cycles before that memory is fetched. In contrast, a slower theoretical worst case scenario will actually perform better just because it is more cache friendly than a perfect hash... Or not?
回答1:
When implementing switches among a diverse set of alternatives, you have several options:
- Make several groups of flat lookup arrays. For example, if you see numbers
1, 2, 3, 20000, 20001, 20002
, you could do a singleif
to take you to 1-s or to 20,000-s, and then employ two flat lookup arrays. - Discover a pattern. For example, if you see numbers
100, 200, 300, 400, 500, 600
, divide the number by100
, and then go for a flat lookup array. - Make a hash table. Since all the numbers that you are hashing are known to you, you can play with the load factor of the table to make sure that the lookup is not going to take a lot of probing.
Your algorithm is similar to binary search, in the sense that it's from the "divide an conquer" family. Such algorithms have logarithmic time complexity, which may not be acceptable for switches, because they are expected to be O(1)
.
回答2:
Is there any possible faster lookup algorithm?
Binary search is faster.
Binary search completes in log2(w*h) = log2(w) + log2(h).
Your algorithm completes in w+h.
来源:https://stackoverflow.com/questions/18335889/fastest-possible-lookup-for-known-static-set-of-integers