Fastest way to read large binary file in Haskell?

前端 未结 2 1554
悲哀的现实
悲哀的现实 2020-12-17 23:20

I want to process a binary file that is too large to read into memory. Currently I use ByteString.Lazy.readFile to stream the bytes. I thought it would be a good idea to use

2条回答
  •  清歌不尽
    2020-12-17 23:56

    To elaborate on @Cubic's comment, while there's a general consensus that lazy I/O should be avoided in production code and replaced with a streaming approach, this is not directly related to performance. If you're writing a program to do some one-off processing of a large file, as long as you have a lazy I/O version running fine now, there's probably no good performance reason to convert it over to a streaming package.

    In fact, streaming is more likely to add some overhead, so I suspect that a well optimized lazy I/O solution would out-perform a well optimized streaming solution, in most cases.

    The main reasons for avoiding Lazy I/O have been previously discussed on SO. In a nutshell, lazy I/O makes it difficult to consistently manage resources (e.g., file handles and network sockets), makes it hard to reason about space usage (e.g., a small program change can cause your memory usage to explode), and is occasionally "unsafe" if the timing and ordering of the I/O in question matters (usually not a problem if you're just reading in one set of files and/or writing out another set of files).

    Short-running utility programs for reading and/or writing large files are probably good candidates to be written in a lazy I/O style. As long as they don't have any obvious space leaks when they're run, they're probably fine.

提交回复
热议问题