Compress two or more numbers into one byte

Question

I think this is not really possible, but it's worth asking anyway. Say I have two small numbers (each ranges from 0 to 11). Is there a way to compress them into one byte and get them back later? How about four numbers of similar sizes?

What I need is something like: a1 + a2 = x. I only know x and from that get a1, a2
For the second part: a1 + a2 + a3 + a4 = x. I only know x and from that get a1, a2, a3, a4
Note: I know you cannot unadd, just illustrating my question.

x must be one byte. a1, a2, a3, a4 range [0, 11].

Answer 1:

That's trivial with bit masks. The idea is to divide the byte into smaller units and dedicate each one to a different element.

For 2 numbers, it can work like this: the first 4 bits are number1, the rest are number2. You would use number1 = (x & 0b11110000) >> 4, number2 = (x & 0b00001111) to retrieve the values, and x = (number1 << 4) | number2 to compress them.
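A minimal Python sketch of the masking approach described above. The function names `pack_nibbles` and `unpack_nibbles` and the range check are illustrative assumptions, not part of the answer.

```python
def pack_nibbles(number1: int, number2: int) -> int:
    """Pack two values in 0..15 into one byte, number1 in the high 4 bits."""
    if not (0 <= number1 <= 15 and 0 <= number2 <= 15):
        raise ValueError("each value must fit in 4 bits (0..15)")
    return (number1 << 4) | number2


def unpack_nibbles(x: int) -> tuple[int, int]:
    """Recover (number1, number2) from a byte built by pack_nibbles."""
    number1 = (x & 0b11110000) >> 4  # high 4 bits
    number2 = x & 0b00001111         # low 4 bits
    return number1, number2


x = pack_nibbles(11, 7)
print(x, unpack_nibbles(x))
```

Since 0..11 needs only 12 of the 16 codes a nibble offers, every pair from the question round-trips through a single byte.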

Answer 2:

For two numbers, sure. Each one has 12 possible values, so the pair has a total of 12^2 = 144 possible values, and that's less than the 256 possible values of a byte. So you could do e.g.

x = 12*a1 + a2
a1 = x / 12
a2 = x % 12

(If you only have signed bytes, e.g. in Java, it's a little trickier)
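As a rough illustration of the base-12 formulas and the signed-byte caveat, here is a Python sketch. The helper names are made up for this example, and `to_signed_byte` only imitates how a language with signed bytes (such as Java) would reinterpret the value; it is an assumption about what "trickier" refers to.

```python
def pack_base12(a1: int, a2: int) -> int:
    """Combine two values in 0..11 into one number in 0..143."""
    if not (0 <= a1 <= 11 and 0 <= a2 <= 11):
        raise ValueError("values must be in 0..11")
    return 12 * a1 + a2


def unpack_base12(x: int) -> tuple[int, int]:
    """Invert pack_base12: returns (x // 12, x % 12)."""
    return divmod(x, 12)


def to_signed_byte(x: int) -> int:
    """Reinterpret 0..255 as a signed byte in -128..127."""
    return x - 256 if x > 127 else x


def from_signed_byte(b: int) -> int:
    """Map a signed byte back to 0..255 before dividing."""
    return b & 0xFF


stored = to_signed_byte(pack_base12(11, 11))
print(stored, unpack_base12(from_signed_byte(stored)))
```

Here `pack_base12(11, 11)` is 143, which a signed byte holds as -113, so the value has to be masked back to 0..255 before applying the division and remainder.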

For four numbers from 0 to 11, there are 12^4 = 20736 values, so you couldn't fit them in one byte, but you could do it with two.

x = 12^3*a1 + 12^2*a2 + 12*a3 + a4
a1 = x / 12^3
a2 = (x / 12^2) % 12
a3 = (x / 12) % 12
a4 = x % 12
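The four-number formulas can be sketched in Python as follows, assuming the result is stored as two big-endian bytes. The names `pack4` and `unpack4` and the byte order are choices made for this example only.

```python
def pack4(a1: int, a2: int, a3: int, a4: int) -> bytes:
    """Encode four values in 0..11 as a base-12 number stored in two bytes."""
    for a in (a1, a2, a3, a4):
        if not 0 <= a <= 11:
            raise ValueError("values must be in 0..11")
    x = 12**3 * a1 + 12**2 * a2 + 12 * a3 + a4  # at most 20735
    return x.to_bytes(2, "big")


def unpack4(data: bytes) -> tuple[int, int, int, int]:
    """Decode two bytes produced by pack4 back into (a1, a2, a3, a4)."""
    x = int.from_bytes(data, "big")
    a1 = x // 12**3
    a2 = (x // 12**2) % 12
    a3 = (x // 12) % 12
    a4 = x % 12
    return a1, a2, a3, a4


print(unpack4(pack4(3, 0, 11, 7)))
```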

EDIT: the other answers talk about storing one number per four bits and using bit-shifting. That's faster.

Answer 3:

The 0-11 example is pretty easy -- you can store each number in four bits, so putting them into a single byte is just a matter of shifting one of them 4 bits to the left and OR-ing the two together.

Four numbers of similar sizes won't fit -- four bits apiece times four gives a minimum of 16 bits to hold them.

Answer 4:

If the numbers 0-11 aren't evenly distributed, you can do even better by using shorter bit sequences for common values and longer ones for rarer values. It costs at least one bit to encode which length you are using, so there is a whole branch of CS devoted to proving when it's worth doing.
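One concrete instance of this idea is Huffman coding, which the answer does not name; it is used here only as an illustration. The sketch below computes code lengths for a made-up, heavily skewed sample and compares the average against the fixed 4 bits per value. The function name and the sample data are assumptions.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code of the sample."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries: (total count, unique tie-breaker, {symbol: depth so far}).
    heap = [(n, i, {s: 0}) for i, (s, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, d1 = heapq.heappop(heap)
        n2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]


# Hypothetical skewed data: the value 0 dominates.
sample = [0] * 60 + [1] * 20 + [2] * 10 + [3] * 5 + [4] * 5
lengths = huffman_code_lengths(sample)
total_bits = sum(lengths[s] for s in sample)
print(lengths, total_bits / len(sample))
```

For this sample the average works out to 1.7 bits per value instead of 4, but the packed data no longer has a fixed size, and the decoder needs to know the code table.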

Answer 5:

Let's say it in general: suppose you want to mix N numbers a1, a2, ... aN, a1 ranging from 0..k1-1, a2 from 0..k2-1, ... and aN from 0 .. kN-1.

Then, the encoded number is:

encoded = a1 + k1*a2 + k1*k2*a3 + ... k1*k2*..*k(N-1)*aN

Decoding is a little trickier and proceeds stepwise:

rest = encoded
a1 = rest mod k1
rest = rest div k1

a2 = rest mod k2
rest = rest div k2

...

a(N-1) = rest mod k(N-1)
rest = rest div k(N-1)

aN = rest # rest is already < kN
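The general scheme above can be sketched in Python like this. The names `encode` and `decode` and the list-based interface are assumptions for illustration; `radices` holds k1..kN.

```python
def encode(values, radices):
    """Compute a1 + k1*a2 + k1*k2*a3 + ... for values[i] in 0..radices[i]-1."""
    if len(values) != len(radices):
        raise ValueError("need exactly one radix per value")
    encoded = 0
    weight = 1  # product of the radices consumed so far
    for a, k in zip(values, radices):
        if not 0 <= a < k:
            raise ValueError(f"value {a} is outside 0..{k - 1}")
        encoded += weight * a
        weight *= k
    return encoded


def decode(encoded, radices):
    """Peel the values off one at a time with mod and div, lowest weight first."""
    rest = encoded
    values = []
    for k in radices:
        rest, a = divmod(rest, k)
        values.append(a)
    return values


code = encode([3, 0, 11], [4, 2, 12])
print(code, decode(code, [4, 2, 12]))
```

The encoded value is always below k1*k2*...*kN, so it fits in one byte exactly when that product is at most 256.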

Answer 6:

A byte can hold up to 256 distinct values (0x00 to 0xFF in hex), so you can encode two numbers in the range 0-15 in one byte.

byte a1 = 0xf;
byte a2 = 0x9;
byte compress = (byte) ((a1 << 4) | (0x0F & a2));  // yields the bit pattern 0xF9 in one byte

Four numbers fit in one byte only if you reduce the range to 0-3 (2 bits each).

Answer 7:

Since a single byte is 8 bits, you can easily subdivide it, with smaller ranges of values. The extreme limit of this is when you have 8 single bit integers, which is called a bit field.

If you want to store two 4-bit integers (which gives you 0-15 for each), you simply have to do this:

value = a * 16 + b;

As long as you do proper bounds checking, you will never lose any information here.

To get the two values back, you just have to do this:

a = floor(value / 16)
b = value MOD 16

MOD is modulus, it's the "remainder" of a division.

If you want to store four 2-bit integers (0-3), you can do this:

value = a * 64 + b * 16 + c * 4 + d

And, to get them back:

a = floor(value / 64)
b = floor(value / 16) MOD 4
c = floor(value / 4) MOD 4
d = value MOD 4

I leave the last division as an exercise for the reader ;)

Answer 8:

@Mike Caron

Your last example (4 integers between 0-3) is much faster with bit-shifting. No need for floor().

value = (a << 6) | (b << 4) | (c << 2) | d;

a = (value >> 6);
b = (value >> 4) % 4;
c = (value >> 2) % 4;
d = (value) % 4;
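A small Python sketch of the four 2-bit fields, with a loop that checks the shift form against the multiply/divide form for every byte value. The function names are illustrative, and `& 3` is used in place of `% 4`, which is equivalent for non-negative values.

```python
def pack_2bit(a: int, b: int, c: int, d: int) -> int:
    """Pack four values in 0..3 into one byte, a in the top two bits."""
    for v in (a, b, c, d):
        if not 0 <= v <= 3:
            raise ValueError("each value must fit in 2 bits (0..3)")
    return (a << 6) | (b << 4) | (c << 2) | d


def unpack_2bit(value: int) -> tuple[int, int, int, int]:
    """Extract the four 2-bit fields with shifts and masks."""
    return (value >> 6) & 3, (value >> 4) & 3, (value >> 2) & 3, value & 3


# The shift/mask form agrees with the division/modulus form for every byte.
for value in range(256):
    arithmetic = (value // 64, (value // 16) % 4, (value // 4) % 4, value % 4)
    assert unpack_2bit(value) == arithmetic

print(pack_2bit(3, 0, 2, 1), unpack_2bit(pack_2bit(3, 0, 2, 1)))
```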

Answer 9:

Use bit masking or bit shifting. The latter is faster.

Try out binary trees for some fun. (They will come in handy later on in your dev life for data handling and all sorts of dev voodoo, lol.)

Answer 10:

Packing four values into one number will require at least 15 bits. This doesn't fit in a single byte, but in two.

What you need to do is a conversion from base 12 to base 65536 and conversely.

B = A1 + 12*(A2 + 12*(A3 + 12*A4))

A1 = B % 12
A2 = (B / 12) % 12
A3 = (B / 144) % 12
A4 = B / 1728

As this takes 2 bytes anyway, conversion from base 12 to (packed) base 16 is by far preferable: each byte holds two values, 4 bits apiece.

B1 = A1 + 16*A2
B2 = A3 + 16*A4

A1 = B1 % 16
A2 = B1 / 16
A3 = B2 % 16
A4 = B2 / 16

The modulos and divisions are implemented by masks and shifts.
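A Python sketch of the packed two-byte layout, assuming one nibble (4 bits) per value, that is, a multiplier of 16 so that each pair fits in a single byte. The function names are hypothetical.

```python
def pack_two_bytes(a1: int, a2: int, a3: int, a4: int) -> bytes:
    """Store four values in 0..15 as two bytes, two nibbles per byte."""
    for a in (a1, a2, a3, a4):
        if not 0 <= a <= 15:
            raise ValueError("each value must fit in 4 bits (0..15)")
    b1 = a1 | (a2 << 4)  # same as a1 + 16 * a2
    b2 = a3 | (a4 << 4)  # same as a3 + 16 * a4
    return bytes([b1, b2])


def unpack_two_bytes(data: bytes) -> tuple[int, int, int, int]:
    """Recover (a1, a2, a3, a4) using masks for % 16 and shifts for / 16."""
    b1, b2 = data
    return b1 & 0x0F, b1 >> 4, b2 & 0x0F, b2 >> 4


packed = pack_two_bytes(1, 2, 3, 4)
print(packed.hex(), unpack_two_bytes(packed))
```

This spends 16 bits where base 12 would need 15, in exchange for decoding with nothing but masks and shifts.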

Answer 11:

0-9 is much easier to work with. You can store 11 random decimal digits in 37 bits, just over 4 1/2 bytes, which comes close to the limit of log(256)÷log(10) ≈ 2.41 digits per byte. It takes nothing more than a creative mapping. Remember that not all compression has to do with dictionaries, redundancies, or sequences.

If you are talking about random digits 0-9, you can store 4 digits in 14 bits, not 15.
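Bit counts like these can be checked with a short Python helper. `bits_needed` is a name invented for this example; it assumes the values are packed as a single number in the given radix.

```python
def bits_needed(radix: int, count: int) -> int:
    """Smallest number of bits that can distinguish radix**count combinations."""
    return (radix**count - 1).bit_length()


print(bits_needed(12, 2))  # two values in 0..11
print(bits_needed(12, 4))  # four values in 0..11
print(bits_needed(10, 4))  # four decimal digits
```

The three calls give 8, 15, and 14 bits respectively, matching the figures quoted in this thread.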

Source: https://stackoverflow.com/questions/3499444/compress-two-or-more-numbers-into-one-byte

Tags

algorithm

math