Question
This example is from the K&R book
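#include <stdio.h>

int main(void)
{
    long nc = 0;               /* number of characters read so far */

    while (getchar() != EOF)   /* loop ends only on a real end-of-file condition */
        ++nc;
    printf("%ld\n", nc);
    return 0;
}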

Could you explain why getchar() recognizes EOF only at the beginning of a line? Thanks.
^Z^Z doesn't work either (unless it's at the beginning of a line).

Answer 1:
The traditional UNIX interpretation of the tty EOF character is to make a blocking read return after reading whatever is buffered inside the cooked tty line buffer. At the start of a new line, this means read returns 0 (zero bytes read), and, incidentally, a 0-sized read is how the end-of-file condition on ordinary files is detected.
That's why the first EOF in the middle of a line just forces the beginning of the line to be read, without making the C runtime library detect an end of file. Two EOF characters in a row produce a 0-sized read, because the second one forces an empty buffer to be read by the application.
$ cat
foo[press ^D]foo <=== after ^D, input printed back before EOL, despite cooked mode. No EOF detected
foo[press ^D]foo[press ^D] <=== after first ^D, input printed back, and on second ^D, cat detects EOF
$ cat
Some first line<CR> <=== input
Some first line <=== the line is read and printed
[press ^D] <=== at line start, ^D forces 0-sized read to happen, cat detects EOF
I assume that your C runtime library imitates the semantics described above (there is no special handling of ^Z at the level of kernel32 calls, let alone system calls, on Windows). That's why it would probably detect EOF after ^Z^Z even in the middle of an input line.
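The buffering rule in this answer can be sketched as a toy model. This is only an illustrative simulation, not how any real terminal driver is written: simulate_reads, TTY_EOF and the encoding of keystrokes as a string are names and conventions made up for this example. It assumes ^D (octal 004) is the EOF character and records the size of each read the application would see.

```c
#include <stddef.h>

#define TTY_EOF '\004'  /* ^D, assumed here to be the tty EOF character */

/*
 * Toy model of a cooked-mode tty line buffer (not a real terminal driver).
 * Walks the typed keys and records the size of each read() the application
 * would see. Stops after the first zero-sized read, since that is what the
 * application treats as end of file. Returns the number of sizes recorded.
 */
size_t simulate_reads(const char *keys, size_t *sizes, size_t max_reads)
{
    size_t buffered = 0;  /* bytes typed but not yet delivered */
    size_t n = 0;         /* reads recorded so far */

    for (; *keys != '\0' && n < max_reads; ++keys) {
        if (*keys == TTY_EOF) {
            /* EOF key: deliver what is buffered, even if that is nothing */
            sizes[n++] = buffered;
            if (buffered == 0)
                break;  /* zero-sized read: the application sees EOF */
            buffered = 0;
        } else if (*keys == '\n') {
            /* Newline: deliver the line, including the newline itself */
            sizes[n++] = buffered + 1;
            buffered = 0;
        } else {
            ++buffered;
        }
    }
    return n;
}
```

Under this model, the keys foo ^D ^D yield reads of 3 and 0 bytes, while foo Enter ^D yields reads of 4 and 0 bytes. Only the 0-sized read is treated as end of file, which matches the cat transcript above.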
Answer 2:
The program will read EOF only at the actual end of the input. If your terminal/OS/whatever only permits files to end at the start of a line, then that's where you'll find them. I believe this is a throwback to old-fashioned terminals where data was only transmitted a line at a time (for all I know it goes back to punched card readers).
Try reading your data from a file that ends mid-line, that is, without a trailing newline. You may even find that some editors make this difficult! Your program should work fine with that as input.
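A minimal sketch of that experiment, assuming a temporary file stands in for the prepared input. The names count_chars and count_chars_in_text are hypothetical helpers written for this illustration; the loop itself is the one from the question, with getc(fp) in place of getchar() so it can read from any stream.

```c
#include <stdio.h>

/* Same loop as the K&R program, but reading from any stream. */
long count_chars(FILE *fp)
{
    long nc = 0;

    while (getc(fp) != EOF)
        ++nc;
    return nc;
}

/*
 * Hypothetical helper: writes text to a temporary file, rewinds it, and
 * counts the characters read back. Returns -1 if the file cannot be created.
 */
long count_chars_in_text(const char *text)
{
    FILE *fp = tmpfile();
    long nc;

    if (fp == NULL)
        return -1;
    fputs(text, fp);
    rewind(fp);           /* go back to the start before reading */
    nc = count_chars(fp);
    fclose(fp);           /* closing also removes the temporary file */
    return nc;
}
```

With a file whose content is abc and no trailing newline, the loop counts 3 characters and then stops: for ordinary files the end of input is detected wherever the data ends, whether or not that is at the start of a line.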
Answer 3:
EOF indicates "end of file". A newline (which is what happens when you press enter) isn't the end of a file, it's the end of a line, so a newline doesn't terminate this loop.
Depending on the operating system, the EOF character will only work if it's the first character on a line, i.e. the first character after an Enter. Since console input is often line-oriented, the system may also not recognize the EOF character until after you've followed it up with an Enter.
Answer 4:
I happened to have the same question as you. When I want to end the getchar() loop, I have to enter EOF twice, or press <ENTER> followed by one EOF.
And here's an easier answer that I found about this question:
If characters have already been typed on the current line, EOF only ends that round of input and hands it to the program, which then starts waiting for a new round of input. If nothing has been typed, in other words when getchar() is waiting for new input (for example, right after you pressed Enter or entered an EOF), the EOF you enter really means "end of file", and the program stops calling getchar().
PS: the question happens when you are using getchar(). I think this answer is easier to understand, but maybe not for you since it is translated from Chinese...
Source: https://stackoverflow.com/questions/14436596/why-does-getchar-recognize-eof-only-in-the-beginning-of-a-line