Question
I would like to encode numbers ranging from 0 up to very large values (32-bit, 64-bit, other multiples of 8 bits...). The simple approach is to just use the architecture's built-in word size, so 32-bit or 64-bit in the common cases, with integers limited to that size. But as a theoretical exercise, I would like to see whether there is a way to encode arbitrarily large numbers using a sequence of 8-bit bytes.
The caveat is that I also want to know when we've reached the end of a number in a stream of bytes. So you might have this stream of bytes:
nbbbbbbbbbbbbnnnnnbbbnnbnnnnnnnnnbbbbbbbnnbnnbb
...where n is a number byte and b is an arbitrary byte (this drawing isn't quite accurate to what I'm describing: a run of n would be fairly short, while a run of b would be relatively much longer). The key point is that each n sequence encodes the count of b bytes that follow it. So this way you can do the following (sketched in code right after this list):
- Read the number by combining the sequence of n bytes somehow.
- Skip that many bytes to reach the next sequence of n.
- Repeat.
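A minimal sketch of that read/skip/repeat loop in JavaScript, assuming purely for illustration a length prefix where each byte carries 7 value bits and a set high bit means "another length byte follows" (the function name scanRecords is hypothetical):

// Walk the stream: read a variable-length "n" prefix, then skip that many
// "b" bytes, and repeat until the input is exhausted.
function scanRecords(bytes) {
    const records = [];
    let i = 0;
    while (i < bytes.length) {
        // Read the length prefix (the "n" bytes), 7 bits per byte, MSB-first.
        let len = 0n;
        let t;
        do {
            t = bytes[i++];
            len = (len << 7n) | BigInt(t & 0x7F);
        } while (t >= 0x80);
        // Skip that many payload bytes (the "b" bytes) to reach the next prefix.
        const start = i;
        i += Number(len);
        records.push(bytes.slice(start, i));
    }
    return records;
}

// Example: two records of lengths 3 and 2.
// scanRecords(Uint8Array.of(3, 10, 20, 30, 2, 40, 50))
//   -> [ Uint8Array [10, 20, 30], Uint8Array [40, 50] ]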
The question has two parts:
- How do you compute the number out of a sequence of 8-bit integers?
- Such that you also know when you've reached the end of the "number" encoding and are now in the "arbitrary byte" section. Somehow you need to reserve some key values or bits to flag the end of a number encoding, but I haven't figured this out.
Any ideas how to accomplish this?
Answer 1:
An MSB-first VLQ (variable-length quantity: 7 value bits per byte, with the high bit set on every byte except the last) could be decoded into a BigInt like this:
function decode(bytes, index) {
    index |= 0;
    var value = 0n;
    var t;
    do {
        t = bytes[index++];
        // Accumulate the low 7 bits of each byte, most significant group first.
        value = (value << 7n) | BigInt(t & 0x7F);
    } while (t >= 0x80); // a set high bit means another byte follows
    return { value: value, index: index };
}
The "end" position is also returned. It's really the position of the next thing in the data.
Source: https://stackoverflow.com/questions/59061720/is-there-a-way-to-encode-any-number-into-a-series-of-8-bit-numbers-including-a