mmap slower than getline?

小鲜肉 2020-12-29 14:17

I need to read and write files (gigabytes in size) line by line.

Reading many forum entries and sites (including a bunch of SO's), mmap was suggested as the f

4 answers

暖寄归人 (original poster)

2020-12-29 14:54

Whoever told you to use mmap does not know very much about modern machines.

The performance advantages of mmap are a total myth. In the words of Linus Torvalds:

Yes, memory is "slow", but dammit, so is mmap().

The problem with mmap is that every time you touch a page in the mapped region for the first time, it traps into the kernel and actually maps the page into your address space, playing havoc with the TLB.
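
To see that per-page cost for yourself, here is a minimal sketch, assuming Linux or another POSIX system that provides `getrusage`. It maps a file, reads one byte from every page, and reports how many page faults that caused. The name `faults_from_touching` is made up for this example, and the exact count depends on the kernel, which may map several neighbouring pages per fault.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/stat.h>
#include <unistd.h>

/* Total page faults (minor + major) charged to this process so far. */
static long total_faults(void)
{
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    return ru.ru_minflt + ru.ru_majflt;
}

/* Hypothetical helper: map `path` read-only, read one byte from every page,
 * and return how many page faults that caused. Returns -1 on error or if
 * the file is empty. */
long faults_from_touching(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }

    size_t len = (size_t)st.st_size;
    unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;

    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    long before = total_faults();
    volatile unsigned char sink = 0;
    for (size_t off = 0; off < len; off += (size_t)page)
        sink ^= p[off];             /* first touch of each page */
    long after = total_faults();

    munmap(p, len);
    return after - before;
}
```

Reading the same file through `read` into one re-used buffer touches only the buffer's own pages, however large the file is.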

Try a simple benchmark: read a big file 8K at a time using read (re-using the same 8K buffer over and over), then do the same with mmap. You will almost certainly find that read is actually faster.
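
A rough sketch of that benchmark, assuming a POSIX system. The names `scan_with_read`, `scan_with_mmap` and `time_scan` are made up for this example, and counting newlines is only a stand-in for whatever per-block work you actually do.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

enum { BUF_SIZE = 8192 };           /* the 8K buffer suggested above */

/* Count '\n' bytes in [p, p + n). */
static long count_newlines(const char *p, size_t n)
{
    long count = 0;
    const char *end = p + n;
    while ((p = memchr(p, '\n', (size_t)(end - p))) != NULL) {
        count++;
        p++;
    }
    return count;
}

/* Scan the file with read(), re-using one 8K buffer. Returns -1 on error. */
long scan_with_read(const char *path)
{
    char buf[BUF_SIZE];
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    long total = 0;
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += count_newlines(buf, (size_t)n);

    close(fd);
    return n < 0 ? -1 : total;
}

/* Scan the file through one read-only mapping. Returns -1 on error. */
long scan_with_mmap(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap() rejects zero-length mappings */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    if (p == MAP_FAILED)
        return -1;

    long total = count_newlines(p, len);
    munmap(p, len);
    return total;
}

/* Wall-clock seconds taken by one call of `scan` on `path`. */
double time_scan(long (*scan)(const char *), const char *path, long *result)
{
    struct timespec t0, t1;
    assert(result != NULL);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    *result = scan(path);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Call `time_scan` with each function several times on the same large file and compare the numbers. The first pass also pays for pulling the file into the page cache, so do not judge by that one.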

Your problem was never with getting data out of the kernel; it was with how you handle the data after that. Minimize the work you are doing character-at-a-time; just scan to find the newline and then do a single operation on the block. Personally, I would go back to the read implementation, using (and re-using) a buffer that fits in the L1 cache (8K or so).
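
One way to structure that, sketched under the assumption of plain POSIX `read` on a file descriptor. `for_each_line`, `for_each_line_in_file`, `line_fn` and `tally_line` are hypothetical names for this example, not anything from your code. `memchr` does the newline scan, and the callback then gets each line in a single call.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

enum { BUF_SIZE = 8192 };

/* Hypothetical callback type: receives one line, without its '\n'. */
typedef void (*line_fn)(const char *line, size_t len, void *ctx);

/* Read `fd` to EOF through one re-used buffer and call `cb` once per line.
 * A line longer than BUF_SIZE is delivered in pieces.
 * Returns 0 on success, -1 if read() fails. */
int for_each_line(int fd, line_fn cb, void *ctx)
{
    char buf[BUF_SIZE];
    size_t have = 0;                /* bytes of an unfinished line at buf[0] */
    ssize_t n;

    assert(cb != NULL);
    while ((n = read(fd, buf + have, sizeof buf - have)) > 0) {
        char *start = buf;
        char *end = buf + have + (size_t)n;
        char *nl;

        while ((nl = memchr(start, '\n', (size_t)(end - start))) != NULL) {
            cb(start, (size_t)(nl - start), ctx);
            start = nl + 1;
        }

        have = (size_t)(end - start);
        if (have == sizeof buf) {   /* full buffer with no newline in it */
            cb(buf, have, ctx);
            have = 0;
        } else if (have > 0 && start != buf) {
            memmove(buf, start, have);  /* keep the partial line */
        }
    }
    if (n < 0)
        return -1;
    if (have > 0)
        cb(buf, have, ctx);         /* last line had no trailing '\n' */
    return 0;
}

/* Convenience wrapper: open `path`, run for_each_line, close. */
int for_each_line_in_file(const char *path, line_fn cb, void *ctx)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    int rc = for_each_line(fd, cb, ctx);
    close(fd);
    return rc;
}

/* Example callback: tally lines and payload bytes. */
struct line_stats { long lines; long bytes; };

void tally_line(const char *line, size_t len, void *ctx)
{
    struct line_stats *s = ctx;
    (void)line;
    s->lines++;
    s->bytes += (long)len;
}
```

The line pointer passed to the callback is only valid during the call, because the buffer is overwritten by the next `read`. Copy the line if you need to keep it.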

Or at least, I would try a simple read vs. mmap benchmark to see which is actually faster on your platform.

[Update]

I found a couple more sets of commentary from Mr. Torvalds:

http://lkml.iu.edu/hypermail/linux/kernel/0004.0/0728.html
http://lkml.iu.edu/hypermail/linux/kernel/0004.0/0775.html

The summary:

And on top of that you still have the actual CPU TLB miss costs etc. Which can often be avoided if you just re-read into the same area instead of being excessively clever with memory management just to avoid a copy.

memcpy() (ie "read()" in this case) is always going to be faster in many cases, just because it avoids all the extra complexity. While mmap() is going to be faster in other cases.

In my experience, reading and processing a large file sequentially is one of the "many cases" where using (and re-using) a modest-sized buffer with read/write performs significantly better than mmap.
