Pipe vs. Temporary File

你说的曾经没有我的故事 submitted on 2019-11-28 16:52:24

One big difference is that with the pipe, processes A and B can be running concurrently, so that B gets to work on the output from A before A has finished producing it. Further, the size of the pipe is limited, so A won't be able to produce vastly more data than B has consumed; it will be made to wait for B to catch up.
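Both points can be seen with a tiny demo. `yes` would produce output forever, but the pipe's limited capacity means it can only run a small, bounded amount ahead of the reader; when `head` exits, the producer's next write fails with SIGPIPE and it stops:

```shell
# 'yes' is an infinite producer; 'head' is a consumer that takes 3 lines.
# The pipe's capacity bounds how far ahead 'yes' can get, and once 'head'
# exits, SIGPIPE terminates 'yes' -- so the whole pipeline finishes.
yes | head -n 3
```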

If the volume of data is big, writing to the temporary file involves filesystem activity, if only to create and later destroy the file. The data itself might well stay in the in-memory buffer pools - so even for surprisingly large files there may be no disk I/O for the contents - but the file operations still go through the filesystem. Writing to the pipe 'never' involves writing to disk.
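As a concrete sketch of the two approaches (the file names and the `ERROR` pattern here are made up for illustration), connecting a producer and a consumer looks like:

```shell
# Approach 1: temporary file. The producer must finish before the
# consumer starts, and the intermediate data goes through the filesystem
# (or at least its cache).
printf 'ERROR b\nINFO x\nERROR a\n' > app.log   # sample input
grep 'ERROR' app.log > stage.tmp
sort stage.tmp > errors.sorted
rm stage.tmp

# Approach 2: pipe. grep and sort run concurrently, and the
# intermediate data stays in a kernel memory buffer.
grep 'ERROR' app.log | sort > errors.sorted
```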

The big difference is that the first method actually uses on-disk storage, whereas a pipe will use memory (unless you get really pedantic and start thinking about swap space).

Performance-wise, memory is faster than disk (almost always). This should be generally true for all operating systems.

The only time when using a temp file really makes sense is if process B has to examine the data in multiple passes (as in certain kinds of video encoding). In that case the whole data stream would need to be buffered anyway, and with enough data that would negate the in-memory advantage. So for multi-pass (seek-bound) operations, go with a temp file.
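A minimal sketch of why multi-pass work needs a file: a pipe can only be read once, so a consumer that needs two passes over the data (here, a made-up example that rescales numbers by their maximum) has to read from something seekable or re-readable, like a temp file:

```shell
# Two-pass processing: pass 1 finds the maximum, pass 2 prints each
# value relative to it. The same data is read twice, which a pipe
# cannot do -- hence the temp file.
tmp=$(mktemp)
printf '1\n4\n2\n' > "$tmp"                     # sample input
max=$(sort -n "$tmp" | tail -n 1)               # pass 1: find the max
awk -v m="$max" '{ print $1 "/" m }' "$tmp"     # pass 2: rescale
rm "$tmp"
```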

Unless my understanding of pipes is completely off the wall, the answer is YES.

Writing to a temp file involves disk access, and the associated overhead.

Writing to a pipe, and reading from it, happens in memory. Much faster.

I thought a practical answer might help. I'm speed-optimizing a script I use that has about 4 steps. I set it up to use piping and non-piping methods. This is under Windows 7 64-bit.

I got a 3% slowdown for not using piping, which is worth it for me, because now I can stop between each step and update the window title, which I couldn't do when it was all one command.

Personally, I'll take that 3% hit for the window titles.

For curiosity's sake: I am grepping a >20M file, passing the matches to a specialized perl script that modifies the results, sorting them with Windows' built-in SORT.EXE, uniq'ing them with Cygwin's UNIQ.EXE, and then re-grepping those same results to get ANSI-based grep result coloring. Most of the time is spent in the sorting phase.
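The pipeline described above has roughly this shape. The real input file, search pattern, and perl script aren't given, so small stand-ins are used here (`tr` substitutes for the unnamed perl modification step, and the final coloring re-grep is noted in a comment):

```shell
# Hypothetical reconstruction of the pipeline shape described above.
# 'results.txt' and the pattern 'a' are stand-ins; 'tr' stands in for
# the specialized perl script that modifies the matches.
printf 'beta\nalpha\nbeta\n' > results.txt
grep 'a' results.txt \
  | tr '[:lower:]' '[:upper:]' \
  | sort \
  | uniq
# The answer then re-greps the result with 'grep --color=always' to get
# ANSI-based result coloring.
```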
