Is CRC32 additive?

遥遥无期 2021-01-04 12:41

In several places I've read that CRC32 is additive, so that CRC(A xor B) = CRC(A) xor CRC(B).

The above statement was disproven by the following code I wrote:

<
4 answers
  •  夕颜 (original poster)
    2021-01-04 13:01

    CRC is additive in the mathematical sense, where "addition" is carryless addition (XOR): the CRC is just the remainder from a carryless division of all the data (treated as one giant integer) by the polynomial constant. Using your example, it's akin to this sort of thing:

    7 mod 5 = 2

    6 mod 5 = 1

    (7 mod 5) + (6 mod 5) = 3

    (7 + 6) mod 5 = 3

    In that analogy, '5' is our CRC polynomial.

    Here's an example to play with (gcc based):

    #include <stdio.h>
    /* __builtin_ia32_crc32si is a GCC builtin, so no further header is needed */
    
    int main(void)
    {
            unsigned int crc_a = __builtin_ia32_crc32si( 0, 5);
            printf( "crc(5) = %08X\n", crc_a );
            unsigned int crc_b = __builtin_ia32_crc32si( 0, 7);
            printf( "crc(7) = %08X\n", crc_b );
            unsigned int crc_xor = crc_a ^ crc_b;
            printf( "crc(5) XOR crc(7) = %08X\n", crc_xor );
            unsigned int crc_xor2 = __builtin_ia32_crc32si( 0, 5 ^ 7);
            printf( "crc(5 XOR 7) = %08X\n", crc_xor2 );
    
            return 0;
    }
    

    The output is as expected:

    plxc15034> gcc -mcrc32 -Wall -O3 crctest.c
    plxc15034> ./a.out
    crc(5) = A6679B4B
    crc(7) = 1900B8CA
    crc(5) XOR crc(7) = BF672381
    crc(5 XOR 7) = BF672381
    

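    Because this code uses the x86 CRC32 instruction, it only runs on a CPU with SSE4.2, which first appeared in the Intel Core i7 (Nehalem) generation. Note that this instruction computes CRC-32C (the Castagnoli polynomial), not the CRC-32 used by zlib and Ethernet, although the additivity argument is the same for both. The intrinsic function takes the running CRC as the first parameter and the new data to accumulate as the second parameter. The return value is the new running CRC.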
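    For machines without the CRC32 instruction, the same accumulation step can be modeled in portable C. The sketch below is an assumption-laden stand-in, not the intrinsic itself: `crc32c_u32` and `crc32c_is_additive` are hypothetical helper names, and the code assumes the instruction behaves as a bit-reflected CRC-32C division with no initial inversion and no final XOR. Under that assumption it should reproduce the values in the output above.

```c
#include <stdint.h>

/* Bit-reflected form of the CRC-32C (Castagnoli) polynomial 0x1EDC6F41. */
#define CRC32C_POLY_REFLECTED 0x82F63B78u

/* Hypothetical helper: software model of accumulating one 32-bit word
   into a running CRC-32C, with no initial inversion and no final XOR. */
uint32_t crc32c_u32(uint32_t crc, uint32_t data)
{
    crc ^= data;                        /* fold the new word into the running remainder */
    for (int bit = 0; bit < 32; bit++)  /* one carryless division step per bit */
        crc = (crc & 1u) ? (crc >> 1) ^ CRC32C_POLY_REFLECTED : crc >> 1;
    return crc;
}

/* Returns 1 when CRC(a ^ b) == CRC(a) ^ CRC(b) for the given starting value. */
int crc32c_is_additive(uint32_t init, uint32_t a, uint32_t b)
{
    return crc32c_u32(init, a ^ b) == (crc32c_u32(init, a) ^ crc32c_u32(init, b));
}
```

    Calling `crc32c_is_additive(0, 5, 7)` from a `main` of your own returns 1, while any non-zero starting value such as `0xFFFFFFFF` makes it return 0.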

    The initial running CRC value of 0 in the code above is critical. With any other initial value, CRC is not "additive" in the practical sense, because you have effectively thrown away information about the integer you are dividing into. And this is exactly what's happening in your example. Practical CRC functions rarely initialize the running CRC to zero; they usually start from -1 (all bits set). The reason is that an initial CRC of 0 lets any number of leading zeros in the data simply fall through without changing the running CRC, which remains 0. So initializing the CRC to 0 is mathematically sound, but for the practical purpose of calculating a hash, it's the last thing you'd want.
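    The effect of the initial value can be checked with a bitwise CRC whose initial value and final XOR are parameters. This is a minimal sketch under stated assumptions: it uses the common CRC-32 polynomial (reflected form 0xEDB88320) rather than CRC-32C, fixed 4-byte messages, and the hypothetical helper names `crc32_bytes`, `crc_is_additive` and `crc_is_affine`. It also illustrates a relation the text above does not state: for messages of equal length, XORing in the CRC of an all-zero message of that length restores the equality.

```c
#include <stddef.h>
#include <stdint.h>

enum { MSG_LEN = 4 };  /* fixed message length used by the checks below */

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320) with a caller-chosen
   initial value and final XOR. Hypothetical helper, not a library API. */
uint32_t crc32_bytes(const uint8_t *buf, size_t len, uint32_t init, uint32_t xorout)
{
    uint32_t crc = init;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int k = 0; k < 8; k++)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : crc >> 1;
    }
    return crc ^ xorout;
}

/* Returns 1 when CRC(a ^ b) == CRC(a) ^ CRC(b). */
int crc_is_additive(const uint8_t a[MSG_LEN], const uint8_t b[MSG_LEN],
                    uint32_t init, uint32_t xorout)
{
    uint8_t x[MSG_LEN];
    for (size_t i = 0; i < MSG_LEN; i++)
        x[i] = a[i] ^ b[i];
    return crc32_bytes(x, MSG_LEN, init, xorout) ==
           (crc32_bytes(a, MSG_LEN, init, xorout) ^ crc32_bytes(b, MSG_LEN, init, xorout));
}

/* Returns 1 when CRC(a ^ b) == CRC(a) ^ CRC(b) ^ CRC(all-zero message). */
int crc_is_affine(const uint8_t a[MSG_LEN], const uint8_t b[MSG_LEN],
                  uint32_t init, uint32_t xorout)
{
    uint8_t x[MSG_LEN];
    const uint8_t zeros[MSG_LEN] = {0};
    for (size_t i = 0; i < MSG_LEN; i++)
        x[i] = a[i] ^ b[i];
    return crc32_bytes(x, MSG_LEN, init, xorout) ==
           (crc32_bytes(a, MSG_LEN, init, xorout) ^ crc32_bytes(b, MSG_LEN, init, xorout) ^
            crc32_bytes(zeros, MSG_LEN, init, xorout));
}
```

    With `init = 0` and `xorout = 0`, `crc_is_additive` returns 1. With the usual `init = 0xFFFFFFFF` and `xorout = 0xFFFFFFFF` it returns 0, while `crc_is_affine` still returns 1.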
