Boost mapped_file_source, alignment and page size

Submitted by 不问归期 on 2021-02-08 02:11:53

Question


I'm trying to parse some text files of size up to a few hundred megabytes in a context where performance is important, so I'm using boost mapped_file_source. The parser expects the source to be terminated with a null byte, so I want to check whether the file size is an exact multiple of the page size (and if so, fall back on a slower, non-memory mapped method). I thought I could do this with:

if (mf.size() & (mf.alignment() - 1))

But it turns out on one test file with size 20480, the alignment is 65536 (on Windows 7, 64 bit) and the program is crashing. I think what's going on is that the page size is actually smaller than the alignment, so my test isn't working.

How can I get the page size? Or is there something else I should be doing instead? (I need solutions for both Windows and Linux, willing to write system specific code if necessary but would prefer portable code if possible.)


Answer 1:


The simplest fix is probably to make the parser take the end of the input into account (not too outrageous, really).
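As a minimal sketch of what "taking the end of the input into account" could look like, the hypothetical `split_words` below works on a `[first, last)` pointer range and checks the bound before every dereference, so it never needs a terminating NUL. The function names and the whitespace-splitting grammar are illustrative assumptions, not the asker's actual parser.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: the only characters this sketch treats as separators.
static bool is_space(char c) {
    return c == ' ' || c == '\n' || c == '\t' || c == '\r';
}

// Hypothetical tokenizer: splits [first, last) on whitespace.
// It never reads *last, so the buffer does not have to be NUL-terminated.
std::vector<std::string> split_words(const char* first, const char* last) {
    assert(first <= last);
    std::vector<std::string> words;
    while (first != last) {
        // Skip separators, testing the bound before each dereference.
        while (first != last && is_space(*first))
            ++first;
        const char* word_begin = first;
        // Consume one word, again stopping at the end of the range.
        while (first != last && !is_space(*first))
            ++first;
        if (word_begin != first)
            words.emplace_back(word_begin, first);
    }
    return words;
}
```

With a `mapped_file_source` named `mf`, the call would be `split_words(mf.data(), mf.data() + mf.size())`, and the mapping can stay read-only.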

Next up: a big warning. Relying on trailing bytes in the map (if any) to be zero is undefined¹: http://pubs.opengroup.org/onlinepubs/9699919799/functions/mmap.html

So, just map the file using size+1, and deterministically add the NUL terminator. I don't think this is worth getting into platform specific/undefined behaviour for.

In fact I just learned of boost::iostreams::mapped_file_base::mapmode::priv, which is perfect for your needs:

"A file opened with private access can be written to, but the changes will not affect the underlying file" (from the Boost.Iostreams docs)

Here's a simple snippet:

#include <boost/iostreams/device/mapped_file.hpp>
#include <fstream>
#include <iostream>
#include <iterator>

namespace io = boost::iostreams;

int main() {
    // of course, prefer `stat(2)` or `boost::filesystem::file_size()`, but for exposition:
    // count the bytes by reading the file once (binary mode, so no newline translation)
    std::ifstream in("main.cpp", std::ios::binary);
    std::streamsize const length = std::distance(std::istreambuf_iterator<char>(in), {});

    // map one byte more than the file holds; priv means writes never reach the file on disk
    io::mapped_file mf("main.cpp", io::mapped_file_base::mapmode::priv, length+1);

    *(mf.end()-1) = '\0'; // voilà, null termination done, safely, quickly and reliably

    std::cout << length << "\n";    // size of the file
    std::cout << mf.size() << "\n"; // size of the mapping: length+1
}
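Counting bytes with iterators reads the whole file just to learn its length. A cheaper sketch, using only the standard library, opens the stream positioned at the end and asks for the position. `file_size_bytes` is a hypothetical helper name, and the assumption that the end position of a binary stream equals the byte count holds on mainstream implementations rather than being something the standard spells out.

```cpp
#include <cassert>
#include <fstream>
#include <ios>
#include <string>

// Hypothetical helper: size of the file in bytes, or -1 if it cannot be opened.
// std::ios::ate positions the stream at the end right after opening, and
// binary mode avoids newline translation changing the reported offset.
long long file_size_bytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in)
        return -1;
    std::streamoff const pos = in.tellg();
    return static_cast<long long>(pos);
}
```

The result could then be used as `length` in the snippet above in place of the `std::distance` call.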

Alternative spellings:

mf.data()[length] = '\0'; // voilà, null termination done, safely, quickly and reliably
*(mf.begin()+length) = 0; // etc.

¹ AFAICT it might kill a bunny or crash your process.
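For the literal question of how to get the page size, a minimal sketch follows. It assumes POSIX `sysconf(_SC_PAGESIZE)` on Linux and `GetSystemInfo` on Windows; `page_size` and `fills_last_page` are hypothetical helper names. As far as I can tell, `alignment()` on Windows reports the allocation granularity (commonly 65536) rather than the page size, which would explain the numbers in the question.

```cpp
#include <cassert>
#include <cstddef>

#ifdef _WIN32
#include <windows.h>
#else
#include <unistd.h>
#endif

// Hypothetical helper: size of one virtual-memory page in bytes.
std::size_t page_size() {
#ifdef _WIN32
    SYSTEM_INFO info;
    GetSystemInfo(&info);
    return static_cast<std::size_t>(info.dwPageSize);
#else
    long const n = sysconf(_SC_PAGESIZE);
    assert(n > 0); // sysconf returns -1 on error
    return static_cast<std::size_t>(n);
#endif
}

// True when a file of `size` bytes ends exactly on a page boundary,
// i.e. its last page has no spare bytes after the file contents.
bool fills_last_page(std::size_t size) {
    return size % page_size() == 0;
}
```

A check such as `fills_last_page(mf.size())` is one way to decide when to take the slower, non-mapped path. I have not verified on every platform that the extra byte of a `size+1` mapping is accessible when the file ends exactly on a page boundary, so guarding that case seems prudent.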



Source: https://stackoverflow.com/questions/26614516/boost-mapped-file-source-alignment-and-page-size
