Wrap around explanation for signed and unsigned variables in C?

清酒与你 2020-11-27 06:31

I read in the C spec that unsigned variables (in particular unsigned short int) perform a so-called wrap around on integer overflow, altho

4 Answers
  •  庸人自扰
    2020-11-27 07:08

    Imagine you have a data type that's only 3 bits wide. This allows you to represent 8 distinct values, from 0 through 7. If you add 1 to 7, you will "wrap around" back to 0, because you don't have enough bits to represent the value 8 (1000).
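
The three-bit type described above does not exist in C, but it can be emulated by masking a wider value. The following is a minimal sketch; `wrap3` and `add3` are hypothetical helper names chosen for illustration, not standard functions.

```c
#include <stdint.h>

/* Hypothetical helper: keep only the low 3 bits, emulating a 3-bit unsigned type. */
static uint8_t wrap3(unsigned value)
{
    return (uint8_t)(value & 0x7u);  /* 0x7 is binary 111 */
}

/* Hypothetical helper: add two 3-bit values; the result wraps modulo 8. */
static uint8_t add3(uint8_t a, uint8_t b)
{
    return wrap3((unsigned)a + (unsigned)b);
}
```

With these helpers, `add3(7, 1)` gives 0, because the sum 8 (binary 1000) needs a fourth bit that the mask discards.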

    This behavior is well-defined for unsigned types. It is not well-defined for signed types, because there are multiple methods for representing signed values, and the result of an overflow will be interpreted differently based on that method.
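
For real unsigned types the same wrap around can be observed directly. This sketch assumes nothing beyond `<limits.h>`; the function names `next_unsigned` and `next_ushort` are made up for the example.

```c
#include <limits.h>

/* Unsigned arithmetic is reduced modulo (UINT_MAX + 1), so this is well defined. */
static unsigned int next_unsigned(unsigned int x)
{
    return x + 1u;  /* UINT_MAX + 1u wraps to 0u */
}

/* An unsigned short operand is converted to a type at least as wide as
   unsigned int before the addition, so on most platforms the wrap happens
   when the result is converted back to unsigned short. */
static unsigned short next_ushort(unsigned short x)
{
    return (unsigned short)(x + 1u);
}
```

Both `next_unsigned(UINT_MAX)` and `next_ushort(USHRT_MAX)` return 0 on any conforming implementation.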

    Sign-magnitude: the uppermost bit represents the sign; 0 for positive, 1 for negative. If my type is three bits wide again, then I can represent signed values as follows:

    000  =  0
    001  =  1
    010  =  2
    011  =  3
    100  = -0
    101  = -1
    110  = -2
    111  = -3
    

    Since one bit is taken up for the sign, I only have two bits to encode a value from 0 to 3. If I add 1 to 3, I'll overflow with -0 as the result. Yes, there are two representations for 0, one positive and one negative. You won't encounter sign-magnitude representation all that often.
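
The sign-magnitude table can be reproduced with a small decoder. This is only an illustrative sketch; `decode_sign_magnitude3` is a hypothetical name, and because a C `int` has no separate negative zero on common platforms, the pattern 100 decodes to plain 0.

```c
/* Hypothetical helper: interpret the low 3 bits of `bits` as sign-magnitude. */
static int decode_sign_magnitude3(unsigned bits)
{
    int magnitude = (int)(bits & 0x3u);  /* low two bits hold the magnitude */
    int negative = (bits & 0x4u) != 0;   /* top bit is the sign */
    return negative ? -magnitude : magnitude;
}
```

Adding 1 to the pattern for 3 (011) gives 100, and `decode_sign_magnitude3(0x4)` returns 0, which stands for the -0 entry in the table.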

    One's-complement: the negative value is the bitwise-inverse of the positive value. Again, using the three-bit type:

    000  =  0
    001  =  1
    010  =  2
    011  =  3
    100  = -3
    101  = -2
    110  = -1 
    111  = -0
    

    I have three bits to encode my values, but the range is [-3, 3]. If I add 1 to 3, I'll overflow with -3 as the result. This is different from the sign-magnitude result above. Again, there are two encodings for 0 using this method.
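
A matching decoder for the one's-complement table might look like the sketch below. The name `decode_ones_complement3` is hypothetical, and -0 (pattern 111) is again returned as plain 0.

```c
/* Hypothetical helper: interpret the low 3 bits of `bits` as one's complement. */
static int decode_ones_complement3(unsigned bits)
{
    bits &= 0x7u;
    if (bits & 0x4u) {
        /* Negative: the magnitude is the bitwise inverse of the pattern. */
        return -(int)(~bits & 0x7u);
    }
    return (int)bits;
}
```

Here the pattern after 3 (011 + 1 = 100) decodes to -3, matching the overflow result described above.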

    Two's-complement: the negative value is the bitwise inverse of the positive value, plus 1. In the three-bit system:

    000  =  0
    001  =  1
    010  =  2
    011  =  3
    100  = -4
    101  = -3
    110  = -2
    111  = -1
    

    If I add 1 to 3, I'll overflow with -4 as a result, which is different from the previous two methods. Note that we have a slightly larger range of values [-4, 3] and only one representation for 0.
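
The two's-complement table can be decoded in the same style. In this sketch (with the hypothetical name `decode_twos_complement3`) the top bit is treated as carrying a weight of -4 rather than +4, which is equivalent to subtracting 8 from patterns that have it set.

```c
/* Hypothetical helper: interpret the low 3 bits of `bits` as two's complement. */
static int decode_twos_complement3(unsigned bits)
{
    bits &= 0x7u;
    /* If the top bit is set, the value is the unsigned reading minus 8. */
    return (bits & 0x4u) ? (int)bits - 8 : (int)bits;
}
```

The pattern after 3 (100) decodes to -4, and 111 decodes to -1, so every pattern maps to a distinct value.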

    Two's complement is probably the most common method of representing signed values, but it's not the only one, hence the C standard can't make any guarantees about what will happen when you overflow a signed integer type. It leaves the behavior undefined, so the compiler doesn't have to produce one consistent result across multiple representations.
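
Because signed overflow is undefined, one common approach is to test whether the sum would fit before performing the addition. The sketch below is one way to do that under the assumption that both operands are plain `int`; `checked_add` is a hypothetical helper, not a standard library function.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: store a + b in *out and return true if the sum is
   representable in int; otherwise return false and leave *out untouched. */
static bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;  /* the addition would overflow */
    }
    *out = a + b;      /* safe: the result is known to fit */
    return true;
}
```

The comparisons themselves never overflow, since `INT_MAX - b` is only evaluated for positive `b` and `INT_MIN - b` only for negative `b`.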
