Pipe vs. Temporary File

你说的曾经没有我的故事 submitted on 2019-11-28 16:52:24

One big difference is that with the pipe, processes A and B can be running concurrently, so that B gets to work on the output from A before A has finished producing it. Further, the size of the pipe is limited, so A won't be able to produce vastly more data than B has consumed; it will be made to wait for B to catch up.
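Both points can be seen with a tiny demo. `yes` would produce output forever, but the pipe's limited capacity means it can only run a small, bounded amount ahead of the reader; when `head` exits, the producer's next write fails with SIGPIPE and it stops:

```shell
# 'yes' is an infinite producer; 'head' is a consumer that takes 3 lines.
# The pipe's capacity bounds how far ahead 'yes' can get, and once 'head'
# exits, SIGPIPE terminates 'yes' -- so the whole pipeline finishes.
yes | head -n 3
```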

If the volume of data is big, writing to the temporary file involves filesystem activity, if only to create and later destroy the file. The data itself might well stay in the in-memory buffer pools - so even for surprisingly large files there may be no disk I/O for the contents - but the file operations still go through the filesystem. Writing to the pipe 'never' involves writing to disk.
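As a concrete sketch of the two approaches (the file names and the `ERROR` pattern here are made up for illustration), connecting a producer and a consumer looks like:

```shell
# Approach 1: temporary file. The producer must finish before the
# consumer starts, and the intermediate data goes through the filesystem
# (or at least its cache).
printf 'ERROR b\nINFO x\nERROR a\n' > app.log   # sample input
grep 'ERROR' app.log > stage.tmp
sort stage.tmp > errors.sorted
rm stage.tmp

# Approach 2: pipe. grep and sort run concurrently, and the
# intermediate data stays in a kernel memory buffer.
grep 'ERROR' app.log | sort > errors.sorted
```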

The big difference is that the first method actually uses on-disk storage, whereas a pipe will use memory (unless you get really pedantic and start thinking about swap space).

Performance-wise, memory is faster than disk (almost always). This should be generally true for all operating systems.

The only time when using a temp file really makes sense is if process B has to examine the data in multiple passes (as in certain kinds of video encoding). In that case the whole data stream would need to be buffered anyway, and with enough data that would negate the in-memory advantage. So for multi-pass (seek-bound) operations, go with a temp file.
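A minimal sketch of why multi-pass work needs a file: a pipe can only be read once, so a consumer that needs two passes over the data (here, a made-up example that rescales numbers by their maximum) has to read from something seekable or re-readable, like a temp file:

```shell
# Two-pass processing: pass 1 finds the maximum, pass 2 prints each
# value relative to it. The same data is read twice, which a pipe
# cannot do -- hence the temp file.
tmp=$(mktemp)
printf '1\n4\n2\n' > "$tmp"                     # sample input
max=$(sort -n "$tmp" | tail -n 1)               # pass 1: find the max
awk -v m="$max" '{ print $1 "/" m }' "$tmp"     # pass 2: rescale
rm "$tmp"
```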

Unless my understanding of pipes is completely off the wall, the answer is YES.

Writing to a temp file involves disk access, and the associated overhead.

Writing to a pipe, and reading from it, happens in memory. Much faster.

I thought a practical answer might help. I'm speed-optimizing a script I use that has about 4 steps. I set it up to use piping and non-piping methods. This is under Windows 7 64-bit.

I got a 3% slowdown for not using piping, which is worth it for me, because now I can stop between each step and update the window title, which I couldn't do when it was all one command.

Personally, I'll take that 3% hit for the window titles.

For curiosity's sake: I am grepping a >20M file, passing the matches to a specialized perl script that modifies the results, sorting them with Windows' built-in SORT.EXE, uniq'ing them with Cygwin's UNIQ.EXE, and then re-grepping those same results to get ANSI-based grep result coloring. Most of the time is spent in the sorting phase.
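The pipeline described above has roughly this shape. The real input file, search pattern, and perl script aren't given, so small stand-ins are used here (`tr` substitutes for the unnamed perl modification step, and the final coloring re-grep is noted in a comment):

```shell
# Hypothetical reconstruction of the pipeline shape described above.
# 'results.txt' and the pattern 'a' are stand-ins; 'tr' stands in for
# the specialized perl script that modifies the matches.
printf 'beta\nalpha\nbeta\n' > results.txt
grep 'a' results.txt \
  | tr '[:lower:]' '[:upper:]' \
  | sort \
  | uniq
# The answer then re-greps the result with 'grep --color=always' to get
# ANSI-based result coloring.
```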
