suggestions for improving an allocator algorithm implementation

Submitted by 流过昼夜 on 2019-12-05 06:42:35

For fixed-sized objects, you can create a fixed-size allocator. Basically, you allocate superblocks, partition them into subblocks of the appropriate size, and link the results into a free list. Allocating from such a block is O(1) when memory is available (just remove the first element from the list and return a pointer to it), as is deallocation (add the block back to the free list). During allocation, if the list is empty, grab a new superblock, partition it, and add all of its blocks to the list.
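A minimal sketch of that idea, with invented names (`FixedAllocator`, `grab_new_block`): each free cell stores the pointer to the next free cell in its own first bytes, so the free list costs no extra memory.

```cpp
#include <cstddef>
#include <vector>

// Sketch of a fixed-size allocator: superblocks are carved into equal
// cells, and free cells are chained through their own storage.
class FixedAllocator {
public:
    FixedAllocator(std::size_t cell_size, std::size_t cells_per_block)
        : cell_size_(cell_size < sizeof(void*) ? sizeof(void*) : cell_size),
          cells_per_block_(cells_per_block) {}

    ~FixedAllocator() {
        for (char* block : blocks_) delete[] block;
    }

    void* allocate() {
        if (!free_list_) grab_new_block();        // refill when empty
        void* cell = free_list_;
        free_list_ = *static_cast<void**>(cell);  // pop head: O(1)
        return cell;
    }

    void deallocate(void* cell) {
        *static_cast<void**>(cell) = free_list_;  // push head: O(1)
        free_list_ = cell;
    }

private:
    void grab_new_block() {
        char* block = new char[cell_size_ * cells_per_block_];
        blocks_.push_back(block);
        // Partition the superblock and thread every cell onto the free list.
        for (std::size_t i = 0; i < cells_per_block_; ++i)
            deallocate(block + i * cell_size_);
    }

    std::size_t cell_size_;
    std::size_t cells_per_block_;
    void* free_list_ = nullptr;
    std::vector<char*> blocks_;
};
```

Note that freed cells are reused LIFO, which also tends to keep the working set warm in cache.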

For variable-sized objects, you can reduce the problem to the fixed-size case by allocating only blocks of known sizes: 32 bytes, 64 bytes, 128 bytes, 512 bytes. You will have to analyze your memory usage to choose the buckets so that you don't waste too much memory. For large objects, you can fall back to a dynamic-size allocation scheme; that will be slow, but hopefully the number of large objects is limited.
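The bucketing step can be as simple as rounding the request up to the nearest size class. A sketch, using the bucket sizes named above (the function name is invented):

```cpp
#include <cstddef>

// Round a requested size up to the nearest bucket. Returning 0 signals
// "too large for any bucket: fall back to the slow dynamic-size path".
inline std::size_t bucket_for(std::size_t n) {
    static const std::size_t buckets[] = {32, 64, 128, 512};
    for (std::size_t b : buckets)
        if (n <= b) return b;
    return 0; // large object
}
```

A request for 10 bytes would then be served from the 32-byte fixed-size allocator, one for 200 bytes from the 512-byte allocator, and so on.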

Can you have a separate memory allocation pool for each different fixed-size type you are allocating? That way there won't be any fragmentation, because the allocated objects will always align on n-byte boundaries. That doesn't help for the variable-length strings, of course.

There's an example of small-object allocation in Alexandrescu's Modern C++ design that illustrates this principle and may give you some ideas.

Building on Tim's answer, I would personally use something akin to BiBOP.

The basic idea is simple: use fixed-size pools.

There are some refinements to that.

First, the size of the pools is generally fixed. It depends on your allocation routine: typically, if you know the OS you're working on maps at least 4 KB at once when you use malloc, then you use that value. For a memory-mapped file, you might be able to increase this.

The advantage of fixed-size pools is that they nicely fight off fragmentation. All pages being the same size, you can easily recycle an empty page of 256-byte cells as a page of 128-byte cells.

There is still some fragmentation for large objects, which are typically allocated outside of this system. But it's low, especially if you round large objects up to a multiple of the page size; that way the memory is easy to recycle.

Second, how do you handle the pools? Using linked lists.

The pages are typically untyped (by themselves), so you keep a free list of pages from which to draw new pages and into which you put "recycled" pages.

For each size category you then have a list of "occupied" pages, in which memory has been allocated. For each page you keep:

  • the allocation size (for this page)
  • the number of allocated objects (to check for emptiness)
  • a pointer to the first free cell
  • a pointer to the previous and next pages (might point to the "head" of the list)

Each free-cell is itself a pointer (or index, depending on the size you have) to the next free-cell.
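The per-page bookkeeping described above might look like this (field names are invented for illustration; a real implementation would pack this header into the page itself or a side table):

```cpp
#include <cstddef>

// Per-page metadata for a BiBOP-style allocator.
struct PageHeader {
    std::size_t cell_size;  // allocation size for this page
    std::size_t used;       // number of allocated objects (page is empty at 0)
    void*       first_free; // head of the in-page free-cell list
    PageHeader* prev;       // doubly linked into the per-size "occupied" list
    PageHeader* next;       // (may point back to the list head)
};

// Each free cell stores the address of the next free cell in its own
// first bytes, so the free list needs no extra memory.
inline void* next_free(void* cell) { return *static_cast<void**>(cell); }
```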

The list of "occupied" pages of a given size is simply managed:

  • on deletion: if the page becomes empty, remove it from the list and push it onto the recycled-pages list; otherwise, update the free-cell list of this page (note: finding the beginning of the current page is usually a simple modulo operation on the address)
  • on insertion: search from the head; as soon as you find a non-full page, move it to the front of the list (if it's not there already) and insert your item
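The "simple modulo operation" above becomes a bit mask when page sizes are powers of two: clearing the low bits of any cell address recovers the base of its page (and thus its header). A sketch, assuming a 4 KB page:

```cpp
#include <cstdint>

constexpr std::uintptr_t PAGE_SIZE = 4096; // assumed; must be a power of two

// Given any address inside a page, recover the page's base address.
inline std::uintptr_t page_base(std::uintptr_t addr) {
    return addr & ~(PAGE_SIZE - 1);
}
```

This only works if pages are allocated at page-aligned addresses, which `mmap`-style allocation gives you for free.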

This scheme is very memory-efficient, with only a single page reserved for indexing.

For multi-threaded or multi-process applications, you'll need to add synchronization (typically a mutex per page). You can take inspiration from Google's tcmalloc here: try to find another page instead of blocking, and use a thread-local cache to remember which page you last used.

Having said that, have you tried Boost.Interprocess ? It provides allocators.

For the fixed sizes you can easily use a small-object-allocator approach, where you allocate a large block that's split into fixed-size chunks. You then keep a vector of pointers to the available chunks and pop/push as you allocate/free. This is very fast.
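A minimal sketch of that vector-of-free-chunks scheme (the class name is invented):

```cpp
#include <cstddef>
#include <vector>

// One big block split into fixed-size chunks, with a stack of
// available chunk pointers. Both allocate and free are O(1).
class ChunkPool {
public:
    ChunkPool(std::size_t chunk_size, std::size_t count)
        : storage_(chunk_size * count) {
        free_.reserve(count);
        for (std::size_t i = 0; i < count; ++i)
            free_.push_back(&storage_[i * chunk_size]);
    }

    void* allocate() {                          // pop a free chunk
        if (free_.empty()) return nullptr;      // pool exhausted
        void* p = free_.back();
        free_.pop_back();
        return p;
    }

    void release(void* p) { free_.push_back(p); } // push it back

private:
    std::vector<char> storage_;
    std::vector<void*> free_;
};
```

Unlike the intrusive free-list variant, this costs one extra pointer per chunk for the vector, but the logic is hard to get wrong.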

For variable-length items, it's harder: you either have to search for available contiguous space or use some other approach. You might consider maintaining a separate map of all the free nodes ordered by block size, so you can lower_bound into the map, and if the next available node is, say, only 5% too big, return it instead of trying to find usable space of the exact size.
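That lookup might be sketched like this, with free blocks indexed by size in a multimap and the 5% slack figure from above (the function name is invented):

```cpp
#include <cstddef>
#include <map>

using FreeMap = std::multimap<std::size_t, void*>; // size -> free block

// Find the smallest free block >= `want`; accept it only if it wastes
// at most ~5% of the request, otherwise report no close fit.
inline void* take_close_fit(FreeMap& free_nodes, std::size_t want) {
    auto it = free_nodes.lower_bound(want);
    if (it == free_nodes.end()) return nullptr;      // nothing big enough
    if (it->first > want + want / 20) return nullptr; // more than ~5% too big
    void* p = it->second;
    free_nodes.erase(it);
    return p;
}
```

On a miss the caller would fall through to carving a block of the exact size from fresh space.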

My inclination for variable-sized items would be, if practical, to avoid holding direct pointers to data and instead keep handles. Each handle would be the index of a superblock plus the index of an item within that superblock. Each superblock would have an item list allocated top-down and items allocated bottom-up. Each item's allocation would be preceded by its length and by the index of the item it represents; one bit of the index indicates whether the item is 'pinned'.
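The handle layout described above might be encoded like this (names and field widths are invented for illustration):

```cpp
#include <cstdint>

// A handle: which superblock, plus an item index within it. The top
// bit of the item index is reserved as the "pinned" flag, so the
// compactor knows which items it must not move.
struct Handle {
    std::uint32_t superblock;
    std::uint32_t item;
};

constexpr std::uint32_t PINNED_BIT = 1u << 31;

inline bool is_pinned(std::uint32_t item)   { return (item & PINNED_BIT) != 0; }
inline std::uint32_t pin(std::uint32_t i)   { return i | PINNED_BIT; }
inline std::uint32_t unpin(std::uint32_t i) { return i & ~PINNED_BIT; }
```

Because callers only ever hold handles, the compactor is free to slide unpinned items around and just rewrite the item list.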

If an item fits after the last allocated item, simply allocate it there. If it would hit a pinned item, move the next-allocation mark past the pinned item, find the next higher pinned item, and try the allocation again. If the item would collide with the item list but there's enough free space somewhere, compact the block's contents (if one or more items are pinned, it may be better to use another superblock, if one is available). Depending on usage patterns, it may be desirable to start by compacting only the items added since the last collection; if that doesn't free enough space, then compact everything.

Of course, if you only have a few discrete item sizes, you can use simple fixed-size-chunk allocators.

I agree with Tim - use memory pools to avoid fragmentation.

However, you may be able to avoid some of the churn by storing pointers rather than objects in your vectors, perhaps with ptr_vector?
