Difference between scanf() and strtol() / strtod() in parsing numbers

陌清茗 2020-12-05 20:28

Note: I completely reworked the question to more properly reflect what I am setting the bounty for. Please excuse any inconsistencies with already-given ans

8 answers
  •  予麋鹿 (original poster)
    2020-12-05 21:00

    Answer obsolete after rewrite of question. Some interesting links in the comments though.


    If in doubt, write a test. -- proverb

    After testing all combinations of conversion specifiers and input variations I could think of, I can say that it is correct that the two function families do not give identical results. (At least in glibc, which is what I have available for testing.)

    The difference appears when three circumstances meet:

    1. You use "%i" or "%x" (allowing hexadecimal input).
    2. Input contains the (optional) "0x" hexadecimal prefix.
    3. There is no valid hexadecimal digit following the hexadecimal prefix.

    Example code:

    #include <stdio.h>
    #include <stdlib.h>
    
    int main( void )
    {
        const char * string = "0xz";
        unsigned u = 0;
        int count = 0;
        char c = '?';
        char * endptr;
    
        /* %n stores the number of characters consumed so far */
        sscanf( string, "%x%n%c", &u, &count, &c );
        printf( "Value: %u - Consumed: %d - Next char: %c - (sscanf())\n", u, count, c );
        /* strtoul() reports the same information through endptr */
        u = (unsigned)strtoul( string, &endptr, 16 );
        printf( "Value: %u - Consumed: %td - Next char: %c - (strtoul())\n", u, ( endptr - string ), *endptr );
        return 0;
    }
    

    Output:

    Value: 0 - Consumed: 1 - Next char: x - (sscanf())
    Value: 0 - Consumed: 0 - Next char: 0 - (strtoul())
    

    This confuses me. Obviously sscanf() does not bail out at the 'x', or it wouldn't be able to parse any "0x"-prefixed hexadecimals. So it has read the 'z' and found it non-matching. But it decides to use only the leading "0" as the value. That would mean pushing both the 'z' and the 'x' back. (Yes, I know that sscanf(), which I used here for easy testing, does not operate on a stream, but I strongly assume they made all ...scanf() functions behave identically for consistency.)

    So... one-char ungetc() doesn't really seem to be the reason here... ?:-/

    Yes, results differ. I still cannot explain it properly, though... :-(
