mmap problem, allocates huge amounts of memory

野趣味 2020-12-23 18:17

I have some huge files I need to parse, and people have been recommending mmap because it should avoid having to load the entire file into memory.

But looking at

8 Answers
  •  别那么骄傲
    2020-12-23 18:55

    You may have been offered the wrong advice.

    Memory-mapped files (mmap) will use more and more memory as you parse through them. When physical memory runs low, the kernel unmaps sections of the file from physical memory based on its LRU (least recently used) algorithm. But the LRU is global: it may also force other processes to swap pages to disk and shrink the disk cache. This can have a severely negative effect on the performance of other processes and the system as a whole.

    If you are linearly reading through a file, for example counting the number of lines, mmap is a bad choice, as it will fill physical memory before releasing memory back to the system. It is better to use traditional I/O methods that stream, or read a block at a time; that way memory can be released immediately after each block is processed.
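
    For a linear pass, a plain read(2) loop keeps memory use flat. A minimal sketch in C (the buffer size and file name are illustrative placeholders):

        /* Count newlines by streaming the file in fixed-size blocks with
         * read(2); each block reuses the same buffer, so memory stays constant. */
        #include <fcntl.h>
        #include <stdio.h>
        #include <unistd.h>

        #define BUF_SIZE (1 << 16)                 /* 64 KiB read buffer */

        int main(void) {
            int fd = open("huge.txt", O_RDONLY);   /* placeholder path */
            if (fd < 0) { perror("open"); return 1; }

            char buf[BUF_SIZE];
            long lines = 0;
            ssize_t n;
            while ((n = read(fd, buf, sizeof buf)) > 0)
                for (ssize_t i = 0; i < n; i++)
                    if (buf[i] == '\n')
                        lines++;

            close(fd);
            printf("%ld lines\n", lines);
            return 0;
        }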

    If you are randomly accessing a file, mmap is a reasonable choice. It's not optimal, since you are still relying on the kernel's general LRU algorithm, but it's faster than writing your own caching mechanism.
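
    For random access, the mapping itself is only a few lines, and the kernel pages data in on demand. A minimal sketch in C (the file name and offset are placeholders; madvise is the Linux/BSD spelling, posix_madvise the portable one):

        /* Map a file read-only and touch a byte at an arbitrary offset.
         * Only the pages actually touched are faulted into memory. */
        #include <fcntl.h>
        #include <stdio.h>
        #include <sys/mman.h>
        #include <sys/stat.h>
        #include <unistd.h>

        int main(void) {
            int fd = open("huge.bin", O_RDONLY);   /* placeholder path */
            if (fd < 0) { perror("open"); return 1; }

            struct stat st;
            if (fstat(fd, &st) < 0 || st.st_size == 0) return 1;

            char *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
            if (p == MAP_FAILED) { perror("mmap"); return 1; }
            close(fd);                             /* mapping stays valid */

            /* Hint random access so the kernel skips useless readahead. */
            madvise(p, (size_t)st.st_size, MADV_RANDOM);

            off_t off = st.st_size / 2;            /* arbitrary example offset */
            printf("byte at %lld: 0x%02x\n", (long long)off, (unsigned char)p[off]);

            munmap(p, (size_t)st.st_size);
            return 0;
        }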

    In general, I would never recommend mmap, except for a few extreme performance edge cases: accessing the file from multiple processes or threads at the same time, or when the file is small relative to the amount of free memory.
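
    For the multi-process case, a shared read-only mapping means the file's pages exist once in the page cache no matter how many processes touch them. A minimal sketch assuming POSIX fork(2) (the file name is a placeholder):

        /* Parent and child read the same MAP_SHARED mapping; the mapping
         * is inherited across fork(), so no file data is copied between them. */
        #include <fcntl.h>
        #include <stdio.h>
        #include <sys/mman.h>
        #include <sys/stat.h>
        #include <sys/wait.h>
        #include <unistd.h>

        int main(void) {
            int fd = open("shared.bin", O_RDONLY); /* placeholder path */
            if (fd < 0) { perror("open"); return 1; }

            struct stat st;
            if (fstat(fd, &st) < 0 || st.st_size == 0) return 1;

            char *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
            if (p == MAP_FAILED) { perror("mmap"); return 1; }
            close(fd);

            pid_t pid = fork();
            if (pid == 0) {                        /* child */
                printf("child  sees first byte: 0x%02x\n", (unsigned char)p[0]);
                _exit(0);
            }
            waitpid(pid, NULL, 0);                 /* parent */
            printf("parent sees first byte: 0x%02x\n", (unsigned char)p[0]);
            munmap(p, (size_t)st.st_size);
            return 0;
        }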
