I have a 128-bit unsigned integer A and a 64-bit unsigned integer B. What\'s the fastest way to calculate A % B - that is the (64-bit) remainder from dividing A
As a general rule, division is slow and multiplication is faster, and bit shifting is faster yet. From what I have seen of the answers so far, most of the answers have been using a brute force approach using bit-shifts. There exists another way. Whether it is faster remains to be seen (AKA profile it).
Instead of dividing, multiply by the reciprocal. Thus, to discover A % B, first calculate the reciprocal of B ... 1/B. This can be done with a few loops using the Newton-Raphson method of convergence. To do this well will depend upon a good set of initial values in a table.
For more details on the Newton-Raphson method of converging on the reciprocal, please refer to http://en.wikipedia.org/wiki/Division_(digital)
Once you have the reciprocal, the quotient Q = A * 1/B.
The remainder R = A - Q*B.
To determine if this would be faster than the brute force (as there will be many more multiplies since we will be using 32-bit registers to simulate 64-bit and 128-bit numbers, profile it.
If B is constant in your code, you can pre-calculate the reciprocal and simply calculate using the last two formulae. This, I am sure will be faster than bit-shifting.
Hope this helps.