How can I read process output that has not been flushed?

问题

Consider this little programm be compiled as application.exe

#include <stdio.h>

int main()
{
    char str[100];
    printf ("Hello, please type something\n");
    scanf("%[^\n]s", &str);
    printf("you typed: %s\n", str);
    return 0;
}

Now I use this code to start application.exe and fetch its output.

#include <stdio.h>
#include <iostream>
#include <stdexcept>

int main()
{
    char buffer[128];
    FILE* pipe = popen("application.exe", "r");
    while (!feof(pipe)) {
        if (fgets(buffer, 128, pipe) != NULL)
            printf(buffer);
    }
    pclose(pipe);
    return 0;
}

My problem is that there is no output until I did my input. Then both output lines get fetched. I can workarround this problem by adding this line after the first printf statement.

fflush(stdout);

Then the first line is fetched before I do my input as expected.

But how can I fetch output of applications that I cannot modify and that do not use fflush() in "realtime" (means before they exit)? . And how does the windows cmd do it?

回答1:

You have been bitten by the fact that the buffering for the streams which are automatically opened in a C program changes with the type of device attached.

That's a bit odd — one of the things which make *nixes nice to play with (and which are reflected in the C standard library) is that processes don't care much about where they get data from and where they write it. You just pipe and redirect around at your leisure and it's usually plug and play, and pretty fast.

One obvious place where this rule breaks is interaction; you present a nice example. If the output of the program is block buffered you don't see it before maybe 4k data has accumulated, or the process exits.

A program can detect though whether it writes to a terminal via isatty() (and perhaps through other means as well). A terminal conceptually includes a user, suggesting an interactive program. The library code opening stdin and stdout checks that and changes their buffering policy to line buffered: When a newline is encountered, the stream is flushed. That is perfect for interactive, line oriented applications. (It is less than perfect for line editing, as bash does, which disables buffering completely.)

The open group man page for stdin is fairly vague with respect to buffering in order to give implementations enough leeway to be efficient, but it does say:

the standard input and standard output streams are fully buffered if and only if the stream can be determined not to refer to an interactive device.

That's what happens to your program: The standard library sees that it is running "non-interactively" (writing to a pipe), tries to be smart and efficient and switches on block buffering. Writing a newline does not flush the output any longer. Normally that is a good thing: Imagine writing binary data, writing to disk every 256 bytes, on average! Terrible.

It is noteworthy to realize that there is probably a whole cascade of buffers between you and, say, a disk; after the C standard library comes the operating system's buffers, and then the disk's proper.

Now to your problem: The standard library buffer used to store characters-to-be-written is in the memory space of the program. Despite appearances, the data has not yet left your program and hence is not (officially) accessible by other programs. I think you are out of luck. You are not alone: Most interactive console programs will perform badly when one tries to operate them through pipes.

回答2:

IMHO, that is one of the less logical parts of IO buffering: it acts differently when directed to a terminal or to a file or pipe. If IO is directed to a file or a pipe, it is normally buffered, that means that output is actually written only when a buffer is full or when an explicit flush occurs => that is what you see when you execute a program through popen.

But when IO is directed to a terminal, a special case occurs: all pending output is automatically flushed before a read from the same terminal. That special case is necessary to allow interactive programs to display prompts before reading.

The bad thing is that if you try to drive an interactive application through pipes, you loose: the prompts can only be read when either the application ends or when enough text was output to fill a buffer. That's the reason why Unix developpers invented the so called pseudo ttys (pty). They are implemented as terminal drivers so that the application uses the interactive buffering, but the IO is in fact manipulated by another program owning the master part of the pty.

Unfortunately, as you write application.exe, I assume that you use Windows, and I do not know an equivalent mechanism in the Windows API. The callee must use unbuffered IO (stderr is by default unbuffered) to allow the prompts to be read by a caller before it sends the answer.

回答3:

The problems of my question in my original post are already very good explained in the other answers.
Console applications use a function named isatty() to detect if their stdout handler is connected to a pipe or a real console. In case of a pipe all output is buffered and flushed in chunks except if you directly call fflush(). In case of a real console the output is unbuffered and gets directly printed to the console output.
In Linux you can use openpty() to create a pseudoterminal and create your process in it. As a result the process will think it runs in a real terminal and uses unbuffered output.
Windows seems not to have such an option.

After a lot of digging through winapi documentation I found that this is not true. Actually you can create your own console screen buffer and use it for stdout of your process that will be unbuffered then.
Sadly this is not a very comfortable solution because there are no event handler and we need to poll for new data. Also at the moment I'm not sure how to handle scrolling when this screen buffer is full.
But even if there are still some problems left I think I have created a very useful (and interesting) starting point for those of you who ever wanted to fetch unbuffered (and unflushed) windows console process output.

#include <windows.h>
#include <stdio.h>

int main(int argc, char* argv[])
{
    char cmdline[] = "application.exe"; // process command
    HANDLE scrBuff;                     // our virtual screen buffer
    CONSOLE_SCREEN_BUFFER_INFO scrBuffInfo; // state of the screen buffer
                                            // like actual cursor position
    COORD scrBuffSize = {80, 25};       // size in chars of our screen buffer
    SECURITY_ATTRIBUTES sa;             // security attributes
    PROCESS_INFORMATION procInfo;       // process information
    STARTUPINFO startInfo;              // process start parameters
    DWORD procExitCode;                 // state of process (still alive)
    DWORD NumberOfCharsWritten;         // output of fill screen buffer func
    COORD pos = {0, 0};                 // scr buff pos of data we have consumed
    bool quit = false;                  // flag for reading loop

    // 1) Create a screen buffer, set size and clear

    sa.nLength = sizeof(sa);
    scrBuff = CreateConsoleScreenBuffer( GENERIC_READ | GENERIC_WRITE,
                                         FILE_SHARE_READ | FILE_SHARE_WRITE,
                                         &sa, CONSOLE_TEXTMODE_BUFFER, NULL);
    SetConsoleScreenBufferSize(scrBuff, scrBuffSize);
    // clear the screen buffer
    FillConsoleOutputCharacter(scrBuff, '\0', scrBuffSize.X * scrBuffSize.Y,
                               pos, &NumberOfCharsWritten);

    // 2) Create and start a process
    //      [using our screen buffer as stdout]

    ZeroMemory(&procInfo, sizeof(PROCESS_INFORMATION));
    ZeroMemory(&startInfo, sizeof(STARTUPINFO));
    startInfo.cb = sizeof(STARTUPINFO);
    startInfo.hStdOutput = scrBuff;
    startInfo.hStdError = GetStdHandle(STD_ERROR_HANDLE);
    startInfo.hStdInput = GetStdHandle(STD_INPUT_HANDLE);
    startInfo.dwFlags |= STARTF_USESTDHANDLES;
    CreateProcess(NULL, cmdline, NULL, NULL, FALSE,
                  0, NULL, NULL, &startInfo, &procInfo);    
    CloseHandle(procInfo.hThread);

    // 3) Read from our screen buffer while process is alive

    while(!quit)
    {
        // check if process is still alive or we could quit reading
        GetExitCodeProcess(procInfo.hProcess, &procExitCode);
        if(procExitCode != STILL_ACTIVE) quit = true;

        // get actual state of screen buffer
        GetConsoleScreenBufferInfo(scrBuff, &scrBuffInfo);

        // check if screen buffer cursor moved since
        // last time means new output was written
        if (pos.X != scrBuffInfo.dwCursorPosition.X ||
            pos.Y != scrBuffInfo.dwCursorPosition.Y)            
        {
            // Get new content of screen buffer
            //  [ calc len from pos to cursor pos: 
            //    (curY - posY) * lineWidth + (curX - posX) ]
            DWORD len =  (scrBuffInfo.dwCursorPosition.Y - pos.Y)
                        * scrBuffInfo.dwSize.X 
                        +(scrBuffInfo.dwCursorPosition.X - pos.X);
            char buffer[len];
            ReadConsoleOutputCharacter(scrBuff, buffer, len, pos, &len);

            // Print new content
            // [ there is no newline, unused space is filled with '\0'
            //   so we read char by char and if it is '\0' we do 
            //   new line and forward to next real char ]
            for(int i = 0; i < len; i++)
            {
                if(buffer[i] != '\0') printf("%c",buffer[i]);
                else
                {
                    printf("\n");
                    while((i + 1) < len && buffer[i + 1] == '\0')i++;
                }
            }

            // Save new position of already consumed data
            pos = scrBuffInfo.dwCursorPosition;
        }
        // no new output so sleep a bit before next check
        else Sleep(100);
    }

    // 4) Cleanup and end

    CloseHandle(scrBuff);   
    CloseHandle(procInfo.hProcess);
    return 0;
}

回答4:

You can't. Because not yet flushed data is owned by the program itself.

回答5:

I think you can flush data to stderr or encapsulate a function of fgetc and fungetc to not corrupt the stream or use system("application.ext >>log") and then mmap log to memory to do things you want.

来源：https://stackoverflow.com/questions/39389017/how-can-i-read-process-output-that-has-not-been-flushed

标签

c++

winapi

process

stdout

unbuffered