Reversibly encode two large integers of different bit lengths into one integer

Question

I want to encode two large integers of possibly different maximum bit lengths into a single integer. The first integer is signed (can be negative) whereas the second is unsigned (always non-negative). If the bit lengths are m and n respectively, the bit length of the returned integer should be less than or equal to m + n.

Only n is known in advance and fixed; m is not. As an example, the solution will be used to combine a signed nanosecond timestamp of 61+ bits with 256 bits of unsigned randomness to form a signed unique identifier of 317+ bits.

I'm using the latest Python. There is a related preexisting question which addresses this in the special case when m == n.

Answer 1:

Since n is fixed, the problem is trivial: encode (a, b) as c = a•2ⁿ + b. To decode, take a = ⌊c / 2ⁿ⌋ and b = c mod 2ⁿ.
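
A minimal sketch of this encoding in Python. The names `encode`, `decode`, and `N` are illustrative and not part of the answer, and n = 256 is assumed from the question. Decoding relies on Python's `divmod`, which floors the quotient, so b comes back in the range 0 to 2ⁿ - 1 even when a is negative.

```python
N = 256  # assumed fixed bit length of the unsigned second integer


def encode(a: int, b: int, n: int = N) -> int:
    """Encode signed a and unsigned n-bit b as a * 2**n + b."""
    if not 0 <= b < 2 ** n:
        raise ValueError("b must fit in n bits")
    return a * 2 ** n + b


def decode(c: int, n: int = N) -> tuple[int, int]:
    """Recover (a, b); divmod floors, so b always lies in [0, 2**n)."""
    return divmod(c, 2 ** n)


print(decode(encode(-5, 3, n=4), n=4))  # (-5, 3)
```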

If neither m nor n were fixed, the problem would be impossible, because it would ask us to encode both (two-bit a, one-bit b) and (one-bit a, two-bit b) in three bits. That means encoding the twelve possibilities (0, 0), (0, 1), (0, 2), (0, 3), (1, 0), (1, 1), (1, 2), (1, 3), (2, 0), (2, 1), (3, 0), and (3, 1) in the eight combinations of three bits, which cannot be done.
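
The count can be checked by enumeration. This is only an illustration of the counting argument, using unsigned values as the answer does; the variable names are made up for the sketch.

```python
# Pairs (a, b) with a two-bit a and a one-bit b ...
case_1 = {(a, b) for a in range(4) for b in range(2)}
# ... and pairs with a one-bit a and a two-bit b.
case_2 = {(a, b) for a in range(2) for b in range(4)}

distinct_pairs = case_1 | case_2  # the four pairs common to both are counted once
available_codes = 2 ** 3          # values representable in three bits

print(len(distinct_pairs), available_codes)  # 12 8
```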

Answer 2:

This solution uses basic bit shifting and masking. Bit operations should be at least as fast as higher-level operations such as exponentiation and multiplication, though the difference is not measured here.
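
Whatever the speed difference turns out to be, the shift form and the arithmetic form compute the same values as long as the second integer fits in n bits. A quick check, with hypothetical helper names that are not part of the solution:

```python
def merge_shift(int1: int, int2: int, n: int) -> int:
    return (int1 << n) | int2


def merge_arith(int1: int, int2: int, n: int) -> int:
    return int1 * 2 ** n + int2


def split_shift(int12: int, n: int) -> tuple[int, int]:
    return int12 >> n, int12 & ((1 << n) - 1)


def split_arith(int12: int, n: int) -> tuple[int, int]:
    return divmod(int12, 2 ** n)


n = 3
for int1 in range(-8, 9):
    for int2 in range(1 << n):
        merged = merge_shift(int1, int2, n)
        # Both forms agree for negative and non-negative int1.
        assert merged == merge_arith(int1, int2, n)
        assert split_shift(merged, n) == split_arith(merged, n) == (int1, int2)
```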

The fundamental solution is much the same as in the special case, since only one integer's maximum bit length is required in either case. The tests, however, are not.

from typing import Tuple
import unittest


class IntMerger:
    """Reversibly encode two integers into a single integer.

    Only the first integer can be signed (possibly negative). The second
    integer must be unsigned (always non-negative).

    In the merged integer, the left bits are of the first input integer, and
    the right bits are of the second input integer.
    """
    # Ref: https://stackoverflow.com/a/54164324/
    def __init__(self, num_bits_int2: int):
        """
        :param num_bits_int2: Max bit length of second integer.
        """
        self._num_bits_int2: int = num_bits_int2
        self._max_int2: int = self._max_int(self._num_bits_int2)

    @staticmethod
    def _max_int(num_bits: int) -> int:
        return (1 << num_bits) - 1

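    def merge(self, int1: int, int2: int) -> int:
        # Shift int1 left to make room, then place int2 in the low bits.
        # Assumes 0 <= int2 <= self._max_int2; this is not checked here.
        return (int1 << self._num_bits_int2) | int2

    def split(self, int12: int) -> Tuple[int, int]:
        # The arithmetic right shift preserves the sign of int1.
        int1 = int12 >> self._num_bits_int2
        # Masking keeps only the low bits, which hold int2.
        int2 = int12 & self._max_int2
        return int1, int2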


class TestIntMerger(unittest.TestCase):
    def test_intmerger(self):
        max_num_bits = 8
        for num_bits_int1 in range(max_num_bits + 1):
            for num_bits_int2 in range(max_num_bits + 1):
                expected_merged_max_num_bits = num_bits_int1 + num_bits_int2
                merger = IntMerger(num_bits_int2)
                maxint1 = (+1 << num_bits_int1) - 1
                minint1 = (-1 << num_bits_int1) + 1
                for int1 in range(minint1, maxint1 + 1):
                    for int2 in range(1 << num_bits_int2):
                        int12 = merger.merge(int1, int2)
                        # print(f'{int1} ({num_bits_int1}b), {int2} ({num_bits_int2}b) = {int12} ({int12.bit_length()}b)')
                        self.assertLessEqual(int12.bit_length(), expected_merged_max_num_bits)
                        self.assertEqual((int1, int2), merger.split(int12))
                self.assertEqual(int12.bit_length(), expected_merged_max_num_bits)


if __name__ == '__main__':
    unittest.main()

Usage examples:

>>> merger = IntMerger(12)

>>> merger.merge(13, 8)
53256
>>> merger.split(_)
(13, 8)

>>> merger.merge(-13, 8)
-53240
>>> merger.split(_)
(-13, 8)
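
Applied to the use case in the question, a sketch might look like the following. The same shift-and-mask steps are inlined so that the block runs on its own. The names `new_id`, `split_id`, and `RANDOM_BITS` are hypothetical, and the choice of `time.time_ns()` and `secrets.randbits()` as the timestamp and randomness sources is an assumption, not something the question specifies.

```python
import secrets
import time

RANDOM_BITS = 256  # n: fixed bit length of the unsigned random part


def new_id() -> int:
    # Nanosecond timestamp; its bit length (m) is not fixed in advance.
    timestamp_ns = time.time_ns()
    # Unsigned random integer below 2**256.
    randomness = secrets.randbits(RANDOM_BITS)
    return (timestamp_ns << RANDOM_BITS) | randomness


def split_id(identifier: int) -> tuple[int, int]:
    mask = (1 << RANDOM_BITS) - 1
    return identifier >> RANDOM_BITS, identifier & mask


uid = new_id()
timestamp_ns, randomness = split_id(uid)
```

The identifier's bit length is at most the timestamp's bit length plus 256, which matches the m + n bound asked for in the question.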

Answer 3:

If you absolutely MUST have full reversibility, you need to relax at least one of your implied initial conditions. If you don't separately remember any of the numbers and the result's bit length R is smaller than m + n, full reversibility is irrevocably lost. So:

EITHER R should exactly equal m + n, in which case the easiest way is to left-shift the m-bit number by n bits and then add the n-bit number. To reverse, right-shift a copy of the encoded number by n bits to get the m-bit number; then left-shift that by n bits and subtract it from (or bitwise-XOR it with) the encoded number to get the n-bit number.
OR you should separately remember one of the numbers somewhere (hopefully one that is common to a user?) and just bitwise-XOR the two numbers. To reverse, bitwise-XOR the result with the stored number. As a bonus, if the stored number is common to a user, each extra encoded ID for that user past the first adds only max(m, n) bits to the storage needs.
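
Both options can be sketched in a few lines of Python. The function names are illustrative only, and the XOR variant assumes the stored number is kept somewhere outside the encoded value.

```python
def encode_shift(a: int, b: int, n: int) -> int:
    # Option 1: left-shift a by n bits, then add the n-bit b.
    return (a << n) + b


def decode_shift(c: int, n: int) -> tuple[int, int]:
    a = c >> n        # right-shifting recovers a, sign included
    b = c - (a << n)  # removing the re-shifted a leaves b; c ^ (a << n) gives the same
    return a, b


def encode_xor(value: int, stored: int) -> int:
    # Option 2: XOR with a number that is remembered separately.
    return value ^ stored


def decode_xor(encoded: int, stored: int) -> int:
    # XOR is its own inverse, so the same stored number undoes it.
    return encoded ^ stored


print(decode_shift(encode_shift(-13, 8, 12), 12))  # (-13, 8)
print(decode_xor(encode_xor(1234, 99), 99))        # 1234
```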

Source: https://stackoverflow.com/questions/54164323/reversibly-encode-two-large-integers-of-different-bit-lengths-into-one-integer

Tags

python, integer, bit-manipulation, uniqueidentifier