mmap vs O_DIRECT for random reads (what are the buffers involved?)

Anonymous (unverified), submitted 2019-12-03 08:48:34

Question:

I am implementing a disk-based hashtable supporting a large number of keys (26+ million). The value is deserialized. Reads are essentially random throughout the file, values are smaller than the page size, and I am optimising for SSDs. Safety and consistency are not major concerns (performance matters).

My current solution uses an mmap()'d file with MADV_RANDOM | MADV_DONTNEED set, to disable prefetching by the kernel and load data only on demand.

I get the idea that the kernel reads from disk to memory buffer, and I deserialize from there.

What about O_DIRECT? If I call read(), I'm still copying into a buffer (which I deserialize from) so can I gain any advantage?

Where can I find more info on the buffers involved with a mmap() file and calling read() on a file opened with O_DIRECT?

I am not interested in read ahead or caching. It has nothing to offer for my use case.

Answer 1:

O_DIRECT is an option for read/write operations in which data bypasses the system buffers and is copied directly between your buffer and the disk controller. To benefit from O_DIRECT, you must meet some conditions: the buffer address must be aligned to the memory page size, and the buffer size must be aligned to the I/O block size.
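The alignment conditions above could be handled with a bounce buffer, roughly as in the following sketch. It assumes Linux with glibc and a 4096-byte alignment (`IO_ALIGN`), which is a guess; the real requirement depends on the device and filesystem. The names `open_direct` and `direct_read` are hypothetical helpers, not part of any library.

```c
#define _GNU_SOURCE            /* exposes O_DIRECT on Linux with glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed alignment; the real requirement depends on device and filesystem. */
#define IO_ALIGN 4096

/* Hypothetical helper: open a file for reading with the page cache bypassed. */
int open_direct(const char *path)
{
    return open(path, O_RDONLY | O_DIRECT);
}

/*
 * Hypothetical helper: copy `len` bytes starting at file offset `off` into `out`.
 * The request is widened to IO_ALIGN boundaries and read into an aligned
 * bounce buffer, which is what an O_DIRECT descriptor expects.
 * Returns 0 on success, -1 on error or if the file ends before off + len.
 */
int direct_read(int fd, off_t off, void *out, size_t len)
{
    assert((IO_ALIGN & (IO_ALIGN - 1)) == 0);   /* must be a power of two */
    if (off < 0)
        return -1;
    if (len == 0)
        return 0;

    off_t  start = off - (off % IO_ALIGN);      /* round the offset down */
    size_t skip  = (size_t)(off - start);       /* unwanted leading bytes */
    size_t span  = (skip + len + IO_ALIGN - 1) / IO_ALIGN * IO_ALIGN; /* round up */

    void *buf = NULL;
    if (posix_memalign(&buf, IO_ALIGN, span) != 0)
        return -1;

    ssize_t n = pread(fd, buf, span, start);
    int ok = n >= 0 && (size_t)n >= skip + len;
    if (ok)
        memcpy(out, (char *)buf + skip, len);   /* the copy into your own buffer */
    free(buf);
    return ok ? 0 : -1;
}
```

Note that the final `memcpy` is still a copy into the buffer you deserialize from; a caller that can deserialize straight out of the aligned buffer could skip it. A hot path would also reuse one aligned buffer instead of allocating per call.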

Anyway, if you use mmap, you do not use read/write. Moreover, after mmap you can close the file descriptor and the mapping will still work. So O_DIRECT is useless in combination with mmap.
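A minimal sketch of that point, assuming Linux and a hypothetical helper named `map_file`: the descriptor is closed right after `mmap()` and the mapping stays usable. It also shows that `madvise()` takes a single advice value per call, so hints such as `MADV_RANDOM` are passed one at a time rather than OR-ed together.

```c
#define _DEFAULT_SOURCE        /* exposes madvise() with glibc */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: map a whole file read-only and hint random access.
 * Returns the mapping and stores its length in *len_out, or NULL on failure.
 * The caller releases it with munmap().
 */
const unsigned char *map_file(const char *path, size_t *len_out)
{
    assert(len_out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }
    size_t len = (size_t)st.st_size;

    void *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                 /* the mapping remains valid after close() */
    if (p == MAP_FAILED)
        return NULL;

    /* One advice value per call; this one asks the kernel not to read ahead. */
    (void)madvise(p, len, MADV_RANDOM);

    *len_out = len;
    return p;
}
```

Pages are then faulted in on first access, so a lookup is just a pointer dereference into the returned region.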

What I can recommend to increase performance:

  1. If your subsystem receives many lookups for missing keys, you can build a Bloom filter (http://en.wikipedia.org/wiki/Bloom_filter) in memory. You then test each search key against the Bloom filter and reject missing keys without any actual request to disk.

  2. To conserve memory, use a two-level scheme: keep the bucket heads in mmap-ed memory, but read the buckets themselves from the file with pread().
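The Bloom filter from item 1 might look roughly like the sketch below. Everything in it is an illustrative assumption rather than the implementation referred to in this answer: the `bloom_t` type, the function names, the seeded FNV-1a hash, and the double-hashing scheme used to derive the `k` bit positions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical in-memory Bloom filter: nbits bits, k probes per key. */
typedef struct {
    uint8_t *bits;
    size_t   nbits;
    unsigned k;
} bloom_t;

/* 64-bit FNV-1a, with the offset basis perturbed by a seed. */
static uint64_t fnv1a64(const void *data, size_t len, uint64_t seed)
{
    const unsigned char *p = data;
    uint64_t h = 14695981039346656037ULL ^ seed;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

int bloom_init(bloom_t *b, size_t nbits, unsigned k)
{
    assert(nbits > 0 && k > 0);
    b->bits  = calloc((nbits + 7) / 8, 1);   /* all bits start cleared */
    b->nbits = nbits;
    b->k     = k;
    return b->bits ? 0 : -1;
}

void bloom_free(bloom_t *b)
{
    free(b->bits);
    b->bits = NULL;
}

/* Set the k bits for this key; probe i uses position h1 + i * h2. */
void bloom_add(bloom_t *b, const void *key, size_t len)
{
    uint64_t h1 = fnv1a64(key, len, 0);
    uint64_t h2 = fnv1a64(key, len, 0x9e3779b97f4a7c15ULL) | 1;
    for (unsigned i = 0; i < b->k; i++) {
        size_t bit = (size_t)((h1 + (uint64_t)i * h2) % b->nbits);
        b->bits[bit / 8] |= (uint8_t)(1u << (bit % 8));
    }
}

/* Returns 0 if the key is definitely absent, 1 if it may be present. */
int bloom_maybe_contains(const bloom_t *b, const void *key, size_t len)
{
    uint64_t h1 = fnv1a64(key, len, 0);
    uint64_t h2 = fnv1a64(key, len, 0x9e3779b97f4a7c15ULL) | 1;
    for (unsigned i = 0; i < b->k; i++) {
        size_t bit = (size_t)((h1 + (uint64_t)i * h2) % b->nbits);
        if (!(b->bits[bit / 8] & (1u << (bit % 8))))
            return 0;
    }
    return 1;
}
```

A lookup would call `bloom_maybe_contains()` first and touch the disk only when it returns 1. False positives are possible, false negatives are not, so a 0 result can safely skip the read.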

I implemented both options in my autocomplete subsystem. You can see it online at http://olegh.ftp.sh/autocomplete.html and judge the performance on a slow old computer, a Celeron-300.
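For the two-level scheme in item 2, one possible shape is sketched here. The `bucket_head_t` layout and the `read_bucket` function are assumptions made for illustration, not the layout used by the autocomplete subsystem. The `heads` array stands for the small index that would live in mmap-ed memory, while the bucket itself is fetched from the data file with a single `pread()`.

```c
#define _POSIX_C_SOURCE 200809L   /* exposes pread() */
#include <assert.h>
#include <fcntl.h>                /* open(), for callers */
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical on-disk index entry: where one bucket lives in the data file. */
typedef struct {
    uint64_t offset;   /* byte offset of the bucket */
    uint32_t length;   /* bucket size in bytes; 0 means the bucket is empty */
} bucket_head_t;

/*
 * Look up the bucket for key_hash in the index and read it into buf.
 * Returns the number of bytes read, 0 for an empty bucket,
 * or -1 on error or if buf (capacity cap) is too small.
 */
ssize_t read_bucket(int data_fd, const bucket_head_t *heads, size_t nheads,
                    uint64_t key_hash, void *buf, size_t cap)
{
    assert(nheads > 0);
    const bucket_head_t *h = &heads[key_hash % nheads];
    if (h->length == 0)
        return 0;                  /* nothing stored: no disk access at all */
    if (h->length > cap)
        return -1;
    return pread(data_fd, buf, h->length, (off_t)h->offset);
}
```

Only the index has to stay resident, so the memory cost is one `bucket_head_t` per bucket rather than the whole data file.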


