Each stream has "an error indicator that records whether a read/write error has occurred".
It is set, usually rarely, by various functions: fgetc(
Assuming no UB, are 5 of the 8 combinations possible and the 3 unexpected ones impossible? In particular, is valid input possible with the error indicator set?
Speaking specifically to the provisions of the standard, I'm inclined to agree with your analysis:
Few functions are specified to clear the error indicator of a stream, and fgetc() is not one of them. More generally, none of them are data-transfer functions. Therefore, if the error indicator is set for a stream before that stream is presented to fgetc() for reading, then it should still be set when that function returns, all other considerations notwithstanding. That covers these cases:*
1 0 0 Unexpected
1 1 0 Unexpected
1 1 1 Input error or end-of-file
It also covers this case with respect to the expected value of the error indicator, though it does not speak to whether it can actually happen:
1 0 1 Normal reading of valid data with error indicator set!
fgetc() is specified to return EOF in every situation in which it is specified to set the end-of-file or error indicator on a stream. Therefore, if fgetc() returns anything other than EOF, then it will not, on that call, have set the stream's error (or end-of-file) indicator. That covers these cases:
0 0 0 Normal reading of valid data
0 0 1 Unexpected
On the other hand, if fgetc() does return EOF, then either the stream's end-of-file indicator or its error indicator should afterward be found set. But the standard distinguishes between these cases, and specifies that the user can distinguish them via the feof() and ferror() functions. That covers these cases:*
0 1 0 End-of-file
0 1 1 Input error
Finally, I concur that none of the behavior of fgetc() is conditioned on the initial state of the stream's error indicator. Provided only that the stream is not initially positioned at its end, and its end-of-file indicator is not initially set, "the fgetc function returns the next character from the input stream pointed to by stream." That establishes that this, the case of most interest, is in fact allowed:
1 0 1 Normal reading of valid data with error indicator set!
However, that the case is allowed in the abstract does not imply that it can be observed in practice. The details seem unspecified, and I would expect them to depend on the implementation of the driver serving the stream in question. It is entirely possible that having once encountered an error, the driver will continue to report an error on subsequent reads until reset appropriately, and perhaps longer. From the C perspective, that would be interpreted as an (additional) error occurring on each subsequent read, and nothing in the language specifications prevents that, not even the use of one of the functions that clear a stream's error indicator.
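One way to probe this on a POSIX system without signals or waiting is sketched below. It is not part of the question or of the standard's text: it assumes that reading an empty, non-blocking pipe makes the underlying read() fail, which the C library then reports as a read error. The names probe_result and probe_fgetc_after_error are hypothetical. What the second fgetc() returns is exactly the implementation-dependent part, so the sketch only records it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical record of what the two fgetc() calls observed. */
struct probe_result {
    int first_ch, first_feof, first_ferror;    /* read from the empty pipe */
    int second_ch, second_feof, second_ferror; /* read after one byte arrived */
};

/* Returns 0 on success, -1 if the pipe or stream could not be set up. */
int probe_fgetc_after_error(struct probe_result *out)
{
    int fds[2];
    int flags;
    FILE *fp;

    assert(out != NULL);
    if (pipe(fds) == -1)
        return -1;

    /* Make the read end non-blocking so that an empty pipe yields a
       failed read() instead of blocking (assumption: POSIX semantics). */
    flags = fcntl(fds[0], F_GETFL);
    if (flags == -1 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    fp = fdopen(fds[0], "r");
    if (fp == NULL) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* Nothing has been written yet, so this read should fail. */
    out->first_ch = fgetc(fp);
    out->first_feof = feof(fp) != 0;
    out->first_ferror = ferror(fp) != 0;

    /* Make one valid byte available, leaving the error indicator alone. */
    if (write(fds[1], "x", 1) != 1) {
        fclose(fp);
        close(fds[1]);
        return -1;
    }

    /* Implementation-dependent: either 'x' or EOF may come back here. */
    out->second_ch = fgetc(fp);
    out->second_feof = feof(fp) != 0;
    out->second_ferror = ferror(fp) != 0;

    fclose(fp);      /* also closes fds[0] */
    close(fds[1]);
    return 0;
}
```

If second_ch is 'x' while second_ferror is 1, the "1 0 1" case has been observed on that implementation; if second_ch is EOF, the library treats the error indicator as sticky.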
If code does not clear the error indicator beforehand and wants to detect whether a line of input had a rare input error, it seems to make sense to test !feof() and not ferror(). Is checking ferror() potentially misleading? Or have I missed something about the error indicator?
I agree that if a stream's error indicator is initially set, its end-of-file indicator is not, and reading it with fgetc() returns EOF, then ferror() does not usefully distinguish between the end-of-file and error cases whereas feof() should.
On the other hand, whether one can usefully continue to read a given stream after an error has been encountered on it depends on implementation and possibly on specific circumstances. That applies even if the error indicator is cleared via a clearerr() call, not to mention if the error indicator is not cleared.
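As an illustration of the clearerr() point, here is a minimal sketch of a retrying read. The helper name fgetc_retry and its parameters are hypothetical, and whether a retry can ever succeed is precisely the implementation-dependent part, so the retry count is bounded. Because clearerr() discards the accumulated error state, the sketch reports it separately through saw_error.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read one character, retrying after read errors.
   *saw_error is set to 1 if the error indicator was set on entry or any
   attempt failed for a reason other than end-of-file, else to 0. */
int fgetc_retry(FILE *fp, int max_retries, int *saw_error)
{
    int ch;

    assert(fp != NULL && saw_error != NULL);

    /* Remember errors accumulated before this call. */
    *saw_error = ferror(fp) != 0;

    for (;;) {
        ch = fgetc(fp);

        /* Valid data, or a genuine end-of-file: nothing to retry. */
        if (ch != EOF || feof(fp))
            return ch;

        /* EOF without the end-of-file indicator: a read error. */
        *saw_error = 1;
        if (max_retries-- <= 0)
            return EOF;   /* give up; the error indicator stays set */

        /* Clears both the error and end-of-file indicators. Whether the
           next read can succeed depends on the implementation and on the
           cause of the error. */
        clearerr(fp);
    }
}
```

A caller that must not lose the error history should consult saw_error rather than ferror() afterwards, since a successful retry leaves the stream's own indicator cleared.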
* Although I agree that there is an ambiguity with respect to EOF in the event that UCHAR_MAX > INT_MAX, I assert that that is just one of several reasons why such an implementation would be problematic. As a practical matter, therefore, I disregard such implementations as entirely hypothetical.
Everything you say seems right, and ch==EOF && !feof(f) is the right way to check for new errors without interfering with error accumulation.
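The check above can be sketched as a small line-reading helper. The names line_status and skip_line are hypothetical and only illustrate the idea: the decision is made from the return value of fgetc() and from feof(), never from ferror(), so an error indicator left over from an earlier read is neither cleared nor misread as a new error.

```c
#include <assert.h>
#include <stdio.h>

enum line_status { LINE_OK, LINE_EOF, LINE_ERROR };

/* Hypothetical helper: consume characters up to and including the next
   '\n'. Stores the number of characters consumed in *len. */
enum line_status skip_line(FILE *fp, size_t *len)
{
    int ch;
    size_t n = 0;

    assert(fp != NULL && len != NULL);

    while ((ch = fgetc(fp)) != EOF) {
        n++;
        if (ch == '\n') {
            *len = n;
            return LINE_OK;
        }
    }
    *len = n;

    /* fgetc() returned EOF. ferror() may already have been set before
       this call, so only feof() tells the two cases apart. */
    if (!feof(fp))
        return LINE_ERROR;   /* ch == EOF && !feof(fp): a new read error */

    return LINE_EOF;         /* end of input; *len > 0 means an unterminated last line */
}
```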
Here is a very crude, very minimal program to explore the behaviour of the GNU C library with respect to fgetc(), ferror(), and feof(), as requested by OP in a comment:
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <signal.h>
#include <errno.h>

static volatile sig_atomic_t interrupted = 0;

static void interrupt_handler(int signum)
{
    interrupted = 1;
}

static int install_interrupt(const int signum)
{
    struct sigaction act;

    memset(&act, 0, sizeof act);
    sigemptyset(&act.sa_mask);
    act.sa_handler = interrupt_handler;
    act.sa_flags = 0;
    if (sigaction(signum, &act, NULL) == -1)
        return -1;
    return 0;
}

int main(void)
{
    int n, c;

    if (install_interrupt(SIGALRM)) {
        fprintf(stderr, "Cannot install SIGALRM handler: %s.\n", strerror(errno));
        return EXIT_FAILURE;
    }

    if (ferror(stdin)) {
        fprintf(stderr, "Standard input is already in error state.\n");
        return EXIT_FAILURE;
    }
    if (feof(stdin)) {
        fprintf(stderr, "Standard input is already in end-of-input state.\n");
        return EXIT_FAILURE;
    }

    fprintf(stderr, "Testing stream error state. Please wait.\n");
    alarm(1);
    c = fgetc(stdin);
    if (c != EOF) {
        fprintf(stderr, "Stream error state test failed.\n");
        return EXIT_FAILURE;
    }
    fprintf(stderr, "fgetc(stdin) returned EOF.\n");
    fprintf(stderr, "ferror(stdin) returns %d.\n", ferror(stdin));
    fprintf(stderr, "feof(stdin) returns %d.\n", feof(stdin));
    fprintf(stderr, "\n");

    fprintf(stderr, "Testing stream end-of-input state. Please press Ctrl+D.\n");
    c = fgetc(stdin);
    if (c != EOF) {
        fprintf(stderr, "fgetc() returned %d; EOF was expected.\n", c);
        return EXIT_FAILURE;
    }
    fprintf(stderr, "fgetc(stdin) returned EOF.\n");
    fprintf(stderr, "ferror(stdin) returns %d.\n", ferror(stdin));
    fprintf(stderr, "feof(stdin) returns %d.\n", feof(stdin));
    if (!ferror(stdin) || !feof(stdin)) {
        fprintf(stderr, "Expected error and end-of-file states; aborting.\n");
        return EXIT_FAILURE;
    }
    fprintf(stderr, "\n");

    fprintf(stderr, "Testing fgetc() when stream in error and end-of-file state.\n");
    fprintf(stderr, "Please type something, then press Enter.\n");
    n = 0;
    c = fgetc(stdin);
    while (c != EOF && c != '\n') {
        n++;
        c = fgetc(stdin);
    }
    if (c == EOF) {
        fprintf(stderr, "Further input is not possible.\n");
        return EXIT_FAILURE;
    } else
        fprintf(stderr, "Further input is possible: %d characters (including Enter) read.\n", n + 1);

    return EXIT_SUCCESS;
}
When I compile and run the above on Linux, the program will output
Testing stream error state. Please wait.
fgetc(stdin) returned EOF.
ferror(stdin) returns 1.
feof(stdin) returns 0.
The error state was caused by having signal delivery interrupt an fgetc(stdin) call. As you can see, it does cause ferror(stdin) to return nonzero. Note that feof(stdin) returns 0, though.
The output continues:
Testing stream end-of-input state. Please press Ctrl+D.
Pressing Ctrl+D yields output
fgetc(stdin) returned EOF.
ferror(stdin) returns 1.
feof(stdin) returns 1.
At this point, the standard input is in both error and end-of-file states. The output continues:
Testing fgetc() when stream in error and end-of-file state.
Please type something, then press Enter.
If we now type say O K Enter, we get
Further input is possible: 3 characters (including Enter) read.
This proves that at least the GNU C library implementation does not check the stream error or end-of-file status at all. It will simply try to read more data (using the underlying read() operation in POSIXy systems).
My reading of the standard is that it doesn't explicitly say fgetc is allowed to return a non-EOF value if the error indicator was already set on the stream on entry, but it doesn't explicitly say that it can't, either. I sympathize with Nominal Animal's observation (which I shall hoist from comments on his answer in case it gets deleted or moved to chat; allow me to grind my personal axe for a moment and observe that the policy of treating comments as "ephemeral" is harmful and should be abolished):
IMHO the standard is then bass-ackwards: there is no practical need for EOF to be sticky, but if error is not sticky, then there is a real risk of accidentally missing errors.
However, if existing implementations are all consistently not treating error as sticky, changing the behavior will be very hard to sell to the committee. Therefore, I am soliciting tests from the community:
Below is a shortened, non-interactive version of Nominal Animal's test program. It only looks at the behavior of fgetc after a read error, not after EOF. It uses SIGALRM to interrupt a read, instead of control-C, so you don't have to do anything but run it.
#include <stdio.h>
#include <unistd.h>
#include <signal.h>
#include <stdlib.h>

static _Noreturn void
perror_exit (const char *msg)
{
  perror (msg);
  exit (1);
}

static void
handler (int unused)
{
}

int
main (void)
{
  struct sigaction sa;
  int pipefd[2];
  FILE *fp;
  int ch, pa;

  setvbuf (stdout, 0, _IOLBF, 0);

  sa.sa_handler = handler;
  sa.sa_flags = 0;   /* DO interrupt blocking system calls */
  sigemptyset (&sa.sa_mask);
  if (sigaction (SIGALRM, &sa, 0))
    perror_exit ("sigaction");

  if (pipe (pipefd))
    perror_exit ("pipe");

  fp = fdopen (pipefd[0], "r");
  if (!fp)
    perror_exit ("fdopen");

  printf ("before fgetc 1, feof = %d ferror = %d\n",
          feof (fp), ferror (fp));

  alarm (1);
  ch = fgetc (fp);

  if (ch == EOF)
    printf ("after fgetc 1, ch = EOF feof = %d ferror = %d\n",
            feof (fp), ferror (fp));
  else
    printf ("after fgetc 1, ch = '%c' feof = %d ferror = %d\n",
            ch, feof (fp), ferror (fp));

  write (pipefd[1], "x", 1);
  alarm (1);
  ch = fgetc (fp);
  pa = alarm (0);

  printf ("after fgetc 2, alarm %s\n",
          pa ? "did not fire" : "fired");
  if (ch == EOF)
    printf ("after fgetc 2, ch = EOF feof = %d ferror = %d\n",
            feof (fp), ferror (fp));
  else
    printf ("after fgetc 2, ch = '%c' feof = %d ferror = %d\n",
            ch, feof (fp), ferror (fp));

  return 0;
}
On all of the Unixes I can get at at the moment, this program's output is consistent with John Bollinger's observation that
the case of most interest, is in fact allowed:
1 0 1 Normal reading of valid data with error indicator set!
I would particularly like to know what this program prints when run on alternative Linux-based C libraries (e.g. musl, bionic); Unixes that are neither Linux nor BSD-phylum; and Windows. If you've got anything even more exotic please try that too. I'm marking this post community wiki; please edit it to add test results.
The test program should be acceptable to any C89-compliant compiler for an environment where unistd.h exists and signal.h defines sigaction, except for one use of the C11 _Noreturn keyword which is only to squelch warnings. If your compiler complains about _Noreturn, compile with -D_Noreturn=; the results will not be affected. If you don't have unistd.h, the test program will not do anything meaningful in your environment. If you don't have sigaction you may be able to adapt the program to use alternative interfaces, but you need to persuade SIGALRM to interrupt a blocking read somehow.
before fgetc 1, feof = 0 ferror = 0
after fgetc 1, ch = EOF feof = 0 ferror = 1
after fgetc 2, alarm did not fire
after fgetc 2, ch = 'x' feof = 0 ferror = 1
("normal reading of valid data with error indicator set").
before fgetc 1, feof = 0 ferror = 0
after fgetc 1, ch = EOF feof = 0 ferror = 1
after fgetc 2, alarm did not fire
after fgetc 2, ch = EOF feof = 0 ferror = 1
("sticky error" behavior: fgetc(fp) immediately returns EOF without calling read when ferror(fp) is true on entry).