Why do I have to press Ctrl+D twice to close stdin?

Question

I have the following Python script that reads numbers and outputs an error if the input is not a number.

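import fileinput
import sys

# fileinput.input() reads from stdin when no file names are given on the command line.
for line in (txt.strip() for txt in fileinput.input()):
    # Report any line that is not made up solely of digits.
    if not line.isdigit():
        sys.stderr.write("ERROR: not a number: %s\n" % line)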

If I get the input from stdin, I have to press Ctrl + D twice to end the program. Why?

I only have to press Ctrl + D once when I run the Python interpreter by itself.

bash $ python test.py
1
2
foo
4
5
<Ctrl+D>
ERROR: not a number: foo
<Ctrl+D>
bash $

Answer 1:

In Python 3, this was due to a bug in Python's standard I/O library. The bug was fixed in Python 3.3.

In a Unix terminal, typing Ctrl+D doesn't actually close the process's stdin. But typing either Enter or Ctrl+D does cause the OS read system call to return right away. So:

>>> sys.stdin.read(100)
xyzzy                       (I press Enter here)
                            (I press Ctrl+D once)
'xyzzy\n'
>>>

sys.stdin.read(100) is delegated to sys.stdin.buffer.read, which calls the system read() in a loop until one of three things happens: it accumulates the full requested amount of data, the system read() returns 0 bytes, or an error occurs.

Pressing Enter after the first line caused the system read() to return 6 bytes. sys.stdin.buffer.read called read() again to try to get more input. Then I pressed Ctrl+D, causing read() to return 0 bytes. At this point, sys.stdin.buffer.read gave up and returned just the 6 bytes it had collected earlier.
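As a rough sketch of that loop (not the actual CPython implementation), the following models the terminal as a queue of chunks, where b"" stands for Ctrl+D pressed on an empty line. The names read_up_to and make_fake_terminal are hypothetical and exist only for this illustration.

```python
def read_up_to(raw_read, size):
    """Collect up to `size` bytes by calling raw_read() repeatedly.

    Stops once the full amount has been gathered or raw_read() returns
    zero bytes, following the loop described in the text.
    """
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = raw_read(remaining)
        if not chunk:  # zero bytes: give up and return what we have
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def make_fake_terminal(events):
    """Hypothetical stand-in for the system read() on a terminal.

    Each item in `events` is what one read() call returns; b"" models
    Ctrl+D on an empty line. Assumes every chunk fits in the request.
    """
    pending = list(events)

    def raw_read(n):
        return pending.pop(0) if pending else b""

    return raw_read


# Type "xyzzy", Enter, Ctrl+D, then the same again.
terminal = make_fake_terminal([b"xyzzy\n", b"", b"xyzzy\n", b""])
first = read_up_to(terminal, 100)
second = read_up_to(terminal, 100)
print(first, second)
```

The second call still gets data because a zero-byte read only ends that one call. Nothing in the model closes the input.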

Note that the process still has my terminal on stdin, and I can still type stuff.

>>> sys.stdin.read()        (note I can still type stuff to python)
xyzzy                       (I press Enter)
                            (Press Ctrl+D again)
'xyzzy\n'

OK. This is the part that was busted when this question was originally asked. It works now. But prior to Python 3.3, there was a bug.

The bug was a little complicated. Basically, the problem was that two separate layers were doing the same work. BufferedReader.read() was written to call self.raw.read() repeatedly until it returned 0 bytes. However, the raw method, FileIO.read(), performed a loop-until-zero-bytes of its own. So the first time you pressed Ctrl+D in a Python with this bug, it would cause FileIO.read() to return 6 bytes to BufferedReader.read(), which would then immediately call self.raw.read() again. The second Ctrl+D would cause that call to return 0 bytes, and only then would BufferedReader.read() finally exit.
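The double-loop effect can be reproduced with a small model. This is a simplified sketch, not the CPython source: FakeTerminal, loop_until_zero, buggy_read and fixed_read are hypothetical names, and the "fixed" variant simply assumes that only one layer loops.

```python
class FakeTerminal:
    """Hypothetical stand-in for a terminal on stdin."""

    def __init__(self, events):
        self.pending = list(events)
        self.eofs_consumed = 0  # how many zero-byte reads were needed

    def read(self, n):
        # Assumes every queued chunk fits in n bytes.
        chunk = self.pending.pop(0) if self.pending else b""
        if not chunk:
            self.eofs_consumed += 1
        return chunk


def loop_until_zero(read_func, size):
    """Call read_func until `size` bytes arrive or it returns zero bytes."""
    data = b""
    while len(data) < size:
        chunk = read_func(size - len(data))
        if not chunk:
            break
        data += chunk
    return data


def buggy_read(terminal, size):
    # Raw layer: loops until a zero-byte read...
    def raw_read(n):
        return loop_until_zero(terminal.read, n)

    # ...and the buffered layer loops again on top of it.
    return loop_until_zero(raw_read, size)


def fixed_read(terminal, size):
    # Only one layer loops (an assumption made for this illustration).
    return loop_until_zero(terminal.read, size)


keys = [b"xyzzy\n", b"", b""]  # one line, then Ctrl+D twice
buggy_tty = FakeTerminal(keys)
fixed_tty = FakeTerminal(keys)
buggy_data = buggy_read(buggy_tty, 100)
fixed_data = fixed_read(fixed_tty, 100)
print(buggy_data, buggy_tty.eofs_consumed)
print(fixed_data, fixed_tty.eofs_consumed)
```

Both variants return the same 6 bytes. The nested version just needs two zero-byte reads before it returns, which is the second Ctrl+D.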

This explanation is unfortunately much longer than my previous one, but it has the virtue of being correct. Bugs are like that...

Answer 2:

Most likely this has to do with the following Python issues:

5505: sys.stdin.read() doesn't return after first EOF on Windows, and
1633941: for line in sys.stdin: doesn't notice EOF the first time.

Answer 3:

I wrote an explanation of this in my answer to the question "How to capture Control+D signal?".

In short, Control-D at the terminal simply causes the terminal to flush the input. This makes the read system call return. The first time, it returns a non-zero value (if you typed something). The second time, it returns 0, which is the code for "end of file".
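The "zero bytes means end of file" convention can be seen without a terminal. The sketch below uses a pipe instead, so the zero-byte read comes from closing the write end rather than from Control-D. The helper name read_all is hypothetical.

```python
import os


def read_all(fd, chunk_size=1024):
    """Read from a file descriptor until a read returns zero bytes."""
    chunks = []
    while True:
        chunk = os.read(fd, chunk_size)
        if chunk == b"":  # zero bytes is how read reports end of file
            break
        chunks.append(chunk)
    return b"".join(chunks)


read_end, write_end = os.pipe()
os.write(write_end, b"1\n2\nfoo\n")
os.close(write_end)  # with no writer left, the next empty read returns zero bytes
data = read_all(read_end)
os.close(read_end)
print(data)
```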

Answer 4:

The first time it considers it to be input, the second time it's for keeps!

This only occurs when the input comes from a tty. It is likely because of the terminal settings, under which characters are buffered until a newline (the Enter key) is entered.
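On Unix, this line-at-a-time buffering is the terminal's canonical mode, controlled by the ICANON flag. A minimal sketch for checking it with the standard termios module (Unix only) follows. The helper names is_canonical and stdin_is_line_buffered_tty are hypothetical.

```python
import sys
import termios  # available on Unix only


def is_canonical(lflag):
    """Return True if the ICANON bit is set in a termios local-mode flag word."""
    return bool(lflag & termios.ICANON)


def stdin_is_line_buffered_tty():
    """Report whether stdin is a terminal in canonical (line-at-a-time) mode."""
    if not sys.stdin.isatty():
        return False  # pipes and files have no terminal settings
    # tcgetattr returns [iflag, oflag, cflag, lflag, ispeed, ospeed, cc].
    lflag = termios.tcgetattr(sys.stdin.fileno())[3]
    return is_canonical(lflag)
```

Calling stdin_is_line_buffered_tty() from a script run in an interactive shell would normally be expected to report canonical mode, though that depends on how the terminal has been configured.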

Answer 5:

When you use the "for line in file:" form of reading lines from a file, Python uses a hidden read-ahead buffer (see the description of file.next() at http://docs.python.org/2.7/library/stdtypes.html#file-objects). First of all, this explains why a program that writes output as each input line is read displays no output until you press Ctrl-D. Secondly, in order to give the user some control over the buffering, pressing Ctrl-D flushes the input buffer to the application code. Pressing Ctrl-D when the input buffer is empty is treated as EOF.

Tying this together answers the original question. After you enter some input, the first Ctrl-D (on a line by itself) flushes the input to the application code. Now that the buffer is empty, the second Ctrl-D acts as end-of-file (EOF).

file.readline() does not exhibit this behavior.
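A minimal sketch of the question's check rewritten around readline() instead of file iteration. The function name report_non_numbers and the in-memory sample are illustrative assumptions, not part of the original script. With a real terminal you would pass sys.stdin as the stream.

```python
import io
import sys


def report_non_numbers(stream, err=sys.stderr):
    """Read lines with readline() and report those that are not all digits."""
    bad = []
    # readline() returns "" only at end of file, so use that as the sentinel.
    for line in iter(stream.readline, ""):
        text = line.strip()
        if not text.isdigit():
            err.write("ERROR: not a number: %s\n" % text)
            bad.append(text)
    return bad


# In-memory stand-in for the terminal session shown in the question.
sample = io.StringIO("1\n2\nfoo\n4\n5\n")
errors = io.StringIO()
bad_lines = report_non_numbers(sample, err=errors)
print(bad_lines)
```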

Source: https://stackoverflow.com/questions/2162914/why-do-i-have-to-press-ctrld-twice-to-close-stdin

Tags: python, bash, stdin