Assuming the address space can cover the file, it appears to me that mmap simply allocates a chunk of memory as large as the file about to be read, and creates a 1-to-1 rela
I was curious about this, so I benchmarked whole-file reads for files of sizes 1, 2, 4, 8, etc. bytes, once with mmap (M) and once with read (R). The read case was in theory a single call with the fstat-ed size, but it retried if that call returned a partial result. After the read or mmap, one byte of each read or mapped page was accessed in a non-optimizable fashion.
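For reference, the procedure described above could be sketched roughly like this. It is not the original benchmark code. The names `touch_pages`, `scan_with_mmap`, `scan_with_read` and `avg_us` are made up for illustration, and the choices of `MAP_PRIVATE`, a `volatile` read as the "non-optimizable" access, and `CLOCK_MONOTONIC` for timing are my assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Touch one byte per page. The volatile read keeps the compiler from
   optimizing the accesses away (an assumed way to do this). */
static unsigned long touch_pages(const unsigned char *buf, size_t len)
{
    const volatile unsigned char *p = buf;
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    unsigned long sum = 0;
    for (size_t off = 0; off < len; off += page)
        sum += p[off];
    return sum;
}

/* Map the whole file and touch every page. Returns 0 on success, -1 on error. */
int scan_with_mmap(const char *path, unsigned long *sum)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    struct stat st;
    if (fstat(fd, &st) < 0) { close(fd); return -1; }
    size_t len = (size_t)st.st_size;
    *sum = 0;
    if (len > 0) { /* mmap rejects zero-length mappings */
        void *map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
        if (map == MAP_FAILED) { close(fd); return -1; }
        *sum = touch_pages(map, len);
        munmap(map, len);
    }
    close(fd);
    return 0;
}

/* Read the whole file with read(): ideally one call with the fstat-ed size,
   retrying on partial results. Returns 0 on success, -1 on error. */
int scan_with_read(const char *path, unsigned long *sum)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    struct stat st;
    if (fstat(fd, &st) < 0) { close(fd); return -1; }
    size_t len = (size_t)st.st_size, got = 0;
    unsigned char *buf = malloc(len ? len : 1);
    if (!buf) { close(fd); return -1; }
    while (got < len) {
        ssize_t n = read(fd, buf + got, len - got);
        if (n < 0) { free(buf); close(fd); return -1; }
        if (n == 0) break; /* file is shorter than fstat reported */
        got += (size_t)n;
    }
    *sum = touch_pages(buf, got);
    free(buf);
    close(fd);
    return 0;
}

/* Average wall-clock time per scan in microseconds, or -1.0 on error. */
double avg_us(int (*scan)(const char *, unsigned long *),
              const char *path, int reps)
{
    struct timespec t0, t1;
    unsigned long sum;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < reps; i++) {
        if (scan(path, &sum) != 0) return -1.0;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    double total = (t1.tv_sec - t0.tv_sec) * 1e6
                 + (t1.tv_nsec - t0.tv_nsec) / 1e3;
    return total / reps;
}
```

A driver would then call something like `avg_us(scan_with_mmap, path, 10000)` and `avg_us(scan_with_read, path, 10000)` once per file size. Note that this sketch times open, fstat and close along with the read or mmap itself. Whether the original measurements did the same is not stated.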
Here are my results (sizes in bytes):
Size M(µs) R(µs)
1 9.5 4.2
2 10.8 4.5
4 8.4 3.8
8 8.6 3.8
16 7.3 4
32 7.8 3.5
64 8.3 3.9
128 9.2 4.6
256 8.6 4.7
512 10.6 5.1
1.0Ki 9.8 4.7
2.0Ki 10.1 5.4
4.0Ki 10.5 5.6
8.0Ki 10.4 6.9
16Ki 9.9 10
32Ki 14.4 12.8
64Ki 16.1 23.7
128Ki 28.1 41.1
256Ki 34.5 82.4
512Ki 57.9 154.6
1.0Mi 103.5 325.8
2.0Mi 188.5 919.8
4.0Mi 396.3 1963.2
8.0Mi 798.8 3885
16Mi 1611.4 7660.2
32Mi 3207.4 23040.2
64Mi 6712.1 84491.9
It appears read is about twice as fast for small files, up to about 4 KiB. The two break even at around 16 KiB, and from then on mmap starts winning big time (for 64 MiB files by a factor of more than 12).
(Tested on my laptop running Linux 3.19, with 10^4 repeated reads of the same file.)