I\'ve been trying to implement Rabin-Karp algorithm in Java. I have hard time computing the rolling hash value in constant time. I\'ve found one implementation at http://algs4.c
This is the "rolling" aspect of the hash. It's eliminating the contribution of the oldest character (txt.charAt(i-M)
), and incorporating the contribution of the newest character(txt.charAt(i)
).
The hash function is defined as:
M-1
hash[i] = ( SUM { input[i-j] * R^j } ) % Q
j=0
(where I'm using ^
to denote "to the power of".)
But this can be written as an efficient recursive implementation as:
hash[i] = (txtHash*R - input[i-M]*(R^M) + input[i]) % Q
Your reference code is doing this, but it's using various techniques to ensure that the result is always computed correctly (and efficiently).
So, for instance, the + Q
in the first expression has no mathematical effect, but it ensures that the result of the sum is always positive (if it goes negative, % Q
doesn't have the desired effect). It's also breaking the calculation into stages, presumably to prevent numerical overflow.