Question
I have a Python script that uses multiprocessing.Pool.map to do some work. As it runs, it prints progress to stdout and errors to stderr. I decided it would be nice to have a separate log file for each stream and, after a bit of thinking, worked out that I should run it like this:
time ./ecisSearch.py 58Ni.conf 4 1 > >(tee stdout.log) 2> >(tee stderr.log >&2)
This gives me the log files and preserves the output on the appropriate streams. However, here is the problem: if I run it without the redirects, I get this:
$ time ./ecisSearch.py 58Ni.conf 4 1
2015-01-09 14:42:37.524333: This job will perform 4 fit(s) //this is stdout
2015-01-09 14:42:37.524433: Threaded mapping of the fit function onto the starting point input set is commencing //this is stdout
2015-01-09 14:42:37.526641: Starting run #: 0 //this is stdout
2015-01-09 14:42:37.527018: Starting run #: 1 //this is stdout
2015-01-09 14:42:37.529124: Starting run #: 2 //this is stdout
2015-01-09 14:42:37.529831: Starting run #: 3 //this is stdout
2015-01-09 14:42:54.052522: Test of std err writing in run 0 is finished //this is stderr
2015-01-09 14:42:54.502284: Test of std err writing in run 1 is finished //this is stderr
2015-01-09 14:42:59.952433: Test of std err writing in run 3 is finished //this is stderr
2015-01-09 14:43:03.259783: Test of std err writing in run 2 is finished //this is stderr
2015-01-09 14:43:03.260360: Finished fits in job #: 1 preparing to output data to file //this is stdout
2015-01-09 14:43:03.275472: Job finished //this is stdout
real 0m26.001s
user 0m44.145s
sys 0m32.626s
However, running it with the redirects generates the following output:
$ time ./ecisSearch.py 58Ni.conf 4 1 > >(tee stdout.log) 2> >(tee stderr.log >&2)
2015-01-09 14:55:13.188230: Test of std err writing in run 0 is finished //this is stderr
2015-01-09 14:55:13.855079: Test of std err writing in run 1 is finished //this is stderr
2015-01-09 14:55:19.526580: Test of std err writing in run 3 is finished //this is stderr
2015-01-09 14:55:23.628807: Test of std err writing in run 2 is finished //this is stderr
2015-01-09 14:54:56.534790: Starting run #: 0 //this is stdout
2015-01-09 14:54:56.535162: Starting run #: 1 //this is stdout
2015-01-09 14:54:56.538952: Starting run #: 3 //this is stdout
2015-01-09 14:54:56.563677: Starting run #: 2 //this is stdout
2015-01-09 14:54:56.531837: This job will perform 4 fit(s) //this is stdout
2015-01-09 14:54:56.531912: Threaded mapping of the fit function onto the starting point input set is commencing //this is stdout
2015-01-09 14:55:23.629427: Finished fits in job #: 1 preparing to output data to file //this is stdout
2015-01-09 14:55:23.629742: Job finished //this is stdout
real 0m27.376s
user 0m44.661s
sys 0m33.295s
Just looking at the timestamps, we can see that something strange is happening here. Not only are the stderr and stdout streams not interleaved with each other as they should be, but the stdout component seems to have output from the sub-processes first and then output from the 'master' process, regardless of the order in which it was produced. I know that stderr is unbuffered and stdout is buffered, but that does not explain why the stdout information is out of order within its own stream. Also, not apparent from my posting is the fact that all of the stdout output waited until the end of execution to appear on the screen.
My questions are as follows: Why is this happening? And, less importantly, is there a way to fix it?
Answer 1:
The output to stdout is buffered: that is, print statements actually write to a buffer, and this buffer is only occasionally flushed to the terminal. Each process has a separate buffer, which is why writes from different processes can appear out of order. (This is a common problem, as in "Why subprocess stdout to a file is written out of order?")
In this case, the output appears in order on the terminal but out of order when it is redirected. Why? This article explains:
- stdin is always buffered
- stderr is always unbuffered
- if stdout is a terminal then buffering is automatically set to line buffered, else it is set to buffered
So, when output was going to a terminal, it was flushed after every line and happened to appear in order. When redirecting, a larger block buffer is used (typically 4096 bytes). Since each process printed less than that, nothing was written out until a buffer was flushed at exit, so whichever subprocess finished first was flushed first.
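The effect of these buffering rules can be reproduced in isolation. The following is a minimal sketch, not the asker's script: the child program and the helper name arrival_order are hypothetical, and it assumes Python 3.7 or later. The child writes to stdout first and then to stderr, with both streams sharing one pipe, so the order in which the lines arrive shows which stream was held back in a buffer.

```python
import os
import subprocess
import sys

# Hypothetical child program: writes to stdout first, then to stderr.
CHILD = (
    "import sys\n"
    "sys.stdout.write('out\\n')\n"
    "sys.stderr.write('err\\n')\n"
    "sys.stderr.flush()\n"
)


def arrival_order(*interpreter_flags):
    """Run CHILD with stdout and stderr sharing one pipe; return lines in arrival order."""
    # Drop PYTHONUNBUFFERED so the child's default buffering is what gets observed.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONUNBUFFERED"}
    result = subprocess.run(
        [sys.executable, *interpreter_flags, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        env=env,
        text=True,
        check=True,
    )
    return result.stdout.split()


if __name__ == "__main__":
    print(arrival_order())      # stdout is a pipe here, so it is block-buffered
    print(arrival_order("-u"))  # -u asks the interpreter not to buffer stdout
```

With default buffering the stderr line should arrive before the stdout line, even though it was written second, because the stdout line sits in the buffer until the child exits. With -u the lines should arrive in the order they were written.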
The solution is to call flush() after writing, or to disable buffering entirely for the process (see "Disable output buffering").
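As a rough illustration of the flush() approach, here is a small sketch of a logging helper that flushes after every line. The name log and its stream parameter are assumptions made for this example and are not taken from the asker's script; it uses the flush keyword of Python 3's print (in Python 2, calling sys.stdout.flush() after each print has the same effect).

```python
import sys
from datetime import datetime


def log(message, stream=None):
    """Write a timestamped line and flush it immediately.

    Flushing after every line keeps each process's output from sitting in
    its buffer until exit when stdout is a pipe or a file.
    """
    if stream is None:
        stream = sys.stdout
    print("{}: {}".format(datetime.now(), message), file=stream, flush=True)


if __name__ == "__main__":
    log("Starting run #: 0")
    log("Test of std err writing in run 0 is finished", stream=sys.stderr)
```

To disable buffering without touching the code, the interpreter can instead be started with the -u option or with the PYTHONUNBUFFERED environment variable set to a non-empty value.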
Source: https://stackoverflow.com/questions/27868312/correcting-out-of-order-printing-from-stream-redirection