I have very frequently seen people discouraging others from using scanf and saying that there are better alternatives. However, I rarely end up seeing what those better alternatives actually are.

scanf is awesome when you know your input is always well-structured and well-behaved. Otherwise...

IMO, here are the biggest problems with scanf:
Risk of buffer overflow - if you do not specify a field width for the %s and %[ conversion specifiers, you risk a buffer overflow (trying to read more input than the buffer is sized to hold). Unfortunately, there's no good way to pass that width in as an argument (the way printf lets you) - you have to either hardcode it as part of the conversion specifier or do some macro shenanigans.
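For illustration only, here's a minimal sketch of both workarounds - the WIDTH, STR, and STR2 names are placeholders I'm introducing for this example, not anything standard:

#include <stdio.h>

#define WIDTH 99                 /* one less than the buffer size, leaving room for '\0' */
#define STR2(x) #x
#define STR(x) STR2(x)           /* expands WIDTH before stringizing it */

int main( void )
{
    char name[WIDTH + 1];

    /* hardcoded width: reads at most 99 characters into name */
    if ( scanf( "%99s", name ) == 1 )
        printf( "got: %s\n", name );

    /* macro version: STR(WIDTH) expands to "99", so the format is still "%99s",
       but it stays in sync with the buffer size */
    if ( scanf( "%" STR(WIDTH) "s", name ) == 1 )
        printf( "got: %s\n", name );

    return 0;
}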
Accepts inputs that should be rejected - if you're reading an input with the %d conversion specifier and you type something like 12w4, you would expect scanf to reject that input, but it doesn't - it successfully converts and assigns the 12, leaving w4 in the input stream to foul up the next read.
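A small sketch of what happens, assuming the user types 12w4 at the prompt:

#include <stdio.h>

int main( void )
{
    int value;

    /* with input "12w4", this succeeds - scanf converts the leading "12",
       returns 1, and leaves "w4\n" sitting in the input stream */
    if ( scanf( "%d", &value ) == 1 )
        printf( "first read: %d\n", value );

    /* the leftover "w4" makes this read fail - scanf returns 0 and consumes nothing */
    if ( scanf( "%d", &value ) != 1 )
        printf( "second read failed\n" );

    return 0;
}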
So, what should you use instead?
I usually recommend reading all interactive input as text using fgets - it allows you to specify a maximum number of characters to read at a time, so you can easily prevent buffer overflow:
char input[100];

if ( !fgets( input, sizeof input, stdin ) )
{
    // error reading from input stream, handle as appropriate
}
else
{
    // process input buffer
}
One quirk of fgets is that it will store the trailing newline in the buffer if there's room, so you can do an easy check to see if someone typed in more input than you were expecting:
char *newline = strchr( input, '\n' );

if ( !newline )
{
    // input longer than we expected
}
How you deal with that is up to you - you can either reject the whole input out of hand and slurp up any remaining input with getchar:
int c;
while ( ( c = getchar() ) != '\n' && c != EOF )
    ;   // discard characters, stopping at EOF to avoid an infinite loop
Or you can process the input you got so far and read again. It depends on the problem you're trying to solve.
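Putting those pieces together, here's one possible sketch that keeps whatever fit in the buffer, strips the newline, and discards any excess - the buffer size of 100 is arbitrary:

#include <stdio.h>
#include <string.h>

int main( void )
{
    char input[100];

    if ( !fgets( input, sizeof input, stdin ) )
    {
        fprintf( stderr, "error or end of input\n" );
        return 1;
    }

    char *newline = strchr( input, '\n' );
    if ( newline )
    {
        *newline = 0;    // strip the newline so later processing doesn't see it
    }
    else
    {
        // line was longer than the buffer - keep what we got, discard the rest
        int c;
        while ( ( c = getchar() ) != '\n' && c != EOF )
            ;
    }

    printf( "you typed: \"%s\"\n", input );
    return 0;
}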
To tokenize the input (split it up based on one or more delimiters), you can use strtok, but beware - strtok modifies its input (it overwrites delimiters with the string terminator), and you can't preserve its state (i.e., you can't partially tokenize one string, then start to tokenize another, then pick up where you left off in the original string). There's a variant, strtok_s, that preserves the state of the tokenizer, but AFAIK its implementation is optional (you'll need to check that __STDC_LIB_EXT1__ is defined to see if it's available).
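As a quick illustration of strtok, a minimal sketch - the string literal here just stands in for whatever fgets read:

#include <stdio.h>
#include <string.h>

int main( void )
{
    char input[] = "12 apples, 3 oranges";   // strtok needs a writable buffer

    // split on spaces and commas - pass the buffer on the first call,
    // then NULL to continue from where the previous call left off
    for ( char *token = strtok( input, " ," ); token != NULL; token = strtok( NULL, " ," ) )
        printf( "token: \"%s\"\n", token );

    return 0;
}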
Once you've tokenized your input, if you need to convert strings to numbers (e.g., "1234" => 1234), you have options. strtol and strtod will convert string representations of integers and real numbers to their respective types. They also allow you to catch the 12w4 issue I mentioned above - one of their arguments is a pointer to the first character in the string that was not converted:
char *text = "12w4";
char *chk;
long val;
long tmp = strtol( text, &chk, 10 );
if ( !isspace( *chk ) && *chk != 0 )
// input is not a valid integer string, reject the entire input
else
val = tmp;
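strtod works the same way for floating-point text, and both functions report out-of-range values by setting errno to ERANGE. A runnable sketch - the "3.14kg" input is just an example:

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <errno.h>

int main( void )
{
    const char *text = "3.14kg";
    char *chk;

    errno = 0;
    double d = strtod( text, &chk );

    if ( chk == text )
        printf( "no conversion performed\n" );
    else if ( errno == ERANGE )
        printf( "value out of range\n" );
    else if ( !isspace( (unsigned char)*chk ) && *chk != 0 )
        printf( "trailing junk after the number: \"%s\"\n", chk );   // prints "kg"
    else
        printf( "converted: %f\n", d );

    return 0;
}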