Python ctypes from_buffer mapping with context manager into memory mapped file (mmap)

Submitted by 生来就可爱ヽ(ⅴ<●) on 2019-12-11 04:37:35

Question


I'm using ctypes.from_buffer() to map a ctypes structure to a memory mapped file for some tasks. Typically, these files contain a concatenation of structured headers and binary data. The ctypes structure allows for a stable binary representation and easy pythonic access of the fields - a real winning team in this respect.
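To make the idea concrete, here is a minimal sketch of a structured header mapped onto a buffer. The names `BlockHeader`, `magic` and `length` are hypothetical and not taken from the real file format, and a `bytearray` stands in for the memory-mapped file to keep the sketch short.

```python
import ctypes

# Hypothetical header layout (an assumption for illustration only).
class BlockHeader(ctypes.LittleEndianStructure):
    _fields_ = [
        ('magic', ctypes.c_char * 4),   # 4-byte tag
        ('length', ctypes.c_uint32),    # payload size in bytes
    ]

# A bytearray stands in for the mmap; any writable buffer works with from_buffer().
buf = bytearray(16)

# The structure shares memory with buf, so field writes land in the buffer directly.
hdr = BlockHeader.from_buffer(buf, 0)
hdr.magic = b'HEAD'
hdr.length = 3275
```

Reading `buf` afterwards shows the tag followed by the length in little-endian byte order, without any explicit packing step.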

These memory-mapped files grow dynamically over time. Apart from the complication that growth is accepted in mmap.PAGESIZE granularity only, mmap responds with allergic reactions if any (hidden) references into the mapped area are kept during a resize attempt.
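The resize problem can be reproduced in isolation. The following is a small sketch, assuming CPython on a platform where mmap.resize() is supported (e.g. Linux); the function name `demo_resize_blocked` and the temporary file are made up for the demonstration.

```python
import ctypes
import mmap
import os
import tempfile

def demo_resize_blocked():
    """Show that a live from_buffer() object blocks mmap.resize()."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b'\0' * mmap.PAGESIZE)
        mm = mmap.mmap(fd, mmap.PAGESIZE)

        # The ctypes array keeps a buffer export on the mmap alive.
        view = (ctypes.c_char * 4).from_buffer(mm, 0)
        view.raw = b'HEAD'

        try:
            mm.resize(2 * mmap.PAGESIZE)
            blocked = False
        except BufferError:
            blocked = True

        # Dropping the last reference releases the export (CPython refcounting).
        del view
        mm.resize(2 * mmap.PAGESIZE)

        result = (blocked, mm.size(), mm[:4])
        mm.close()
        return result
    finally:
        os.close(fd)
        os.remove(path)
```

The first resize attempt fails while `view` exists, and the second one succeeds once it has been deleted.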

That's where a context manager comes into play.

# -*- coding: utf-8 -*-

import io
import mmap
import ctypes
import logging

log = logging.getLogger(__file__)

def align(size, alignment):
    """return size aligned to alignment"""
    excess = size % alignment
    if excess:
        size = size - excess + alignment
    return size

class CtsMap:
    def __init__(self, ctcls, mm, offset = 0):
        self.ctcls = ctcls
        self.mm = mm
        self.offset = offset

    def __enter__(self):
        mm = self.mm
        offset = self.offset
        ctsize = ctypes.sizeof(self.ctcls)
        if offset + ctsize > mm.size():
            newsize = align(offset + ctsize, mmap.PAGESIZE)
            mm.resize(newsize)
        self.ctinst = self.ctcls.from_buffer(mm, offset)
        log.debug('add mapping: %s', ctypes.addressof(self.ctinst))
        return self.ctinst

    def __exit__(self, exc_type, exc_value, exc_traceback):
        # free all references into mmap
        log.debug('remove mapping: %s', ctypes.addressof(self.ctinst))
        del self.ctinst
        self.ctinst = None

class MapFile:
    def __init__(self, filename):
        self._offset = 0
        # default, so the attribute also exists when the file is reopened
        self._created = False
        try:
            mapsize = mmap.PAGESIZE
            self._fd = open(filename, 'x+b')
            self._fd.write(b'\0' * mapsize)
            self._created = True
        except FileExistsError:
            self._fd = open(filename, 'r+b')
            self._fd.seek(0, io.SEEK_END)
            mapsize = self._fd.tell()
        self._fd.seek(0)
        self._mm = mmap.mmap(self._fd.fileno(), mapsize)

    def add_data(self, data):
        datasize = len(data)
        log.debug('add_data: header')
        hdtype = ctypes.c_char * 4
        with CtsMap(hdtype, self._mm, self._offset) as hd:
            hd.raw = b'HEAD'
            self._offset += 4
        #del hd
        log.debug('add_data: %s', datasize)
        blktype = ctypes.c_char * datasize
        with CtsMap(blktype, self._mm, self._offset) as blk:
            blk.raw = data
            self._offset += datasize
        #del blk
        return 4 + datasize

    def size(self):
        return self._mm.size()

    def close(self):
        self._mm.close()
        self._fd.close()

if __name__ == '__main__':
    import sys

    logconfig = dict(
        level = logging.DEBUG,
        format = '%(levelname)5s: %(message)s',
    )
    logging.basicConfig(**logconfig)

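    # sys.argv[1:2] would yield a list, not a filename, so index explicitly
    mapfile = sys.argv[1] if len(sys.argv) > 1 else 'mapfile'
    datafile = sys.argv[2] if len(sys.argv) > 2 else __file__

    with open(datafile, 'rb') as f:
        data = f.read()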

    maxsize = 10 * mmap.PAGESIZE

    mf = MapFile(mapfile)
    while mf.size() < maxsize:
        mf.add_data(data)
    mf.close()

This code creates a fully mmapped file mapfile, and copies itself into this file a couple of times, with a header tag ('HEAD') prepended, just for illustration purposes. Usually, the structured header is slightly more complicated...

Running the code results in:

DEBUG: add_data: header
DEBUG: add mapping: 139989829832704
DEBUG: remove mapping: 139989829832704
DEBUG: add_data: 3275
DEBUG: add mapping: 139989829832708
DEBUG: remove mapping: 139989829832708
DEBUG: add_data: header
DEBUG: add mapping: 139989829835983
DEBUG: remove mapping: 139989829835983
DEBUG: add_data: 3275
Traceback (most recent call last):
  File "ctxmmapctypes.py", line 110, in <module>
    mf.add_data(data)
  File "ctxmmapctypes.py", line 78, in add_data
    with CtsMap(blktype, self._mm, self._offset) as blk:
  File "ctxmmapctypes.py", line 39, in __enter__
    mm.resize(newsize)
BufferError: mmap can't resize with extant buffers exported.

To get the code working, one has to uncomment the two del statements in MapFile.add_data.

Evidently they are necessary because the variable assigned in the with statement still exists in the local namespace after the block ends, and that is enough to keep a reference into the mmap area, which mmap.resize() stumbles upon.

How can I get rid of these del statements? They are a real PITA, and weren't context managers invented to avoid such contortions?

IOW, is there a way to remove these mappings more effectively, e.g. to revert the operation of ctypes.from_buffer() programmatically in CtsMap.__exit__()?
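One possible direction, offered only as a sketch and not as an established answer: confine each mapping to a small helper function, so that the local name disappears when the function returns. It assumes CPython's reference counting and a platform where mmap.resize() works; `write_block` and `demo_write_blocks` are hypothetical names, and `align` mirrors the helper from the code above.

```python
import ctypes
import mmap
import os
import tempfile

def align(size, alignment):
    """Round size up to a multiple of alignment."""
    return -(-size // alignment) * alignment

def write_block(mm, offset, payload):
    """Hypothetical helper: map, write, and let the mapping die on return."""
    end = offset + len(payload)
    if end > mm.size():
        mm.resize(align(end, mmap.PAGESIZE))
    block = (ctypes.c_char * len(payload)).from_buffer(mm, offset)
    block.raw = payload
    return end  # 'block' goes out of scope here, releasing the export

def demo_write_blocks():
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b'\0' * mmap.PAGESIZE)
        mm = mmap.mmap(fd, mmap.PAGESIZE)
        payload = b'x' * mmap.PAGESIZE   # large enough to force a resize each round
        offset = 0
        for _ in range(2):
            offset = write_block(mm, offset, b'HEAD')
            offset = write_block(mm, offset, payload)
        result = (offset, mm.size(), mm[:4])
        mm.close()
        return result
    finally:
        os.close(fd)
        os.remove(path)
```

This trades the explicit del statements for a function boundary; it does not revert from_buffer() itself, and it relies on no other reference to the mapped object escaping the helper.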

A related issue that further illustrates the approach of combining ctypes.from_buffer with mmapped files can be found here.

Source: https://stackoverflow.com/questions/41077696/python-ctypes-from-buffer-mapping-with-context-manager-into-memory-mapped-file
