Heap Memory and Slab allocation

问题

I'm confused regarding heap and free list. I have a few questions and I have my own understanding of how malloc works in C. Please correct me if I'm wrong.

Is the heap memory organized as a linked list (free list) of data blocks ?
Is there a difference between heap memory and free list ?

My understanding of storage allocation (open for improvement) :- When we call malloc, it allocates memory in the heap, and it does so by picking a data block of suitable size from the free list, right ?

When a certain block of memory is returned by malloc, it is removed from the free-list and the physical address of that block of memory is updated in the page table.

When the memory is free'd using free(), the data block is inserted back in the free-list, and possibly , to reduce fragmentation, conjoined with neighbouring block, and the present bit in the page table entry is cleared.

So the entire heap is a free-list(linked list of free blocks) + allocated data blocks .

Is that a comprehensive picture of storage allocation ?

EDIT : From Linux Kernel Development (Robert Love) Chapter on memory Management , Slab allocation

"A free list contains a block of available, already allocated, data structures. When code requires a new instance of a data structure, it can grab one of the structures off the free list rather than allocate the sufficient amount of memory and set it up for the data structure. Later, when the data structure is no longer needed, it is returned to the free list instead of deallocated. In this sense, the free list acts as an object cache, caching a frequently used type of object."

Free-list is mentioned as a "block of available, allocated data structure."

How is it allocated, when it is in a free-list ?
And how is returning a block of memory to free list _not_ the same as deallocating that block ?
How is slab allocation different from storage allocation

回答1:

malloc() isn't really related to the page table; it allocates virtual addresses, and the kernel is responsible for keeping track of where the pages are actually stored in physical RAM or on disk.

malloc() interacts with the kernel via the brk() system call, which asks the kernel to allocate more pages to the process, or releases pages back to the kernel. So there are really two levels of memory allocation:

The kernel allocates pages to a process, making those pages unavailable for use by other processes. From the kernel's standpoint, the pages can be located anywhere and their locations are tracked by the page table, but from the process's standpoint, it's a contiguous virtual address space. The "program break" that brk() manipulates is the boundary between addresses that the kernel will let you access (because they correspond to allocated pages) and addresses that will cause a segmentation fault if you try to access them.
malloc() allocates variable-sized portions of the program's data segment for use by the program. When there's not enough free space within the current size of the data segment, it uses brk() to get more pages from the kernel, making the data segment bigger. When it finds that some space at the end of the data segment is unused, it uses brk() to give the unused pages back to the kernel, making the data segment smaller.

Note that pages can be allocated to a process (by the kernel) even if the program running in that process isn't actually using the pages for anything. If you free() a block of memory that's located in the middle of your data segment, the implementation of free() can't use brk() to shrink the data segment because there are still other allocated blocks at higher addresses. So the pages remain allocated to your program from the kernel standpoint, even though they're "free space" from the malloc() standpoint.

Your description of how a free list works sounds right to me, though I'm no expert in how memory allocators are implemented. But the quote you posted from Robert Love sounds like it's talking about memory allocation within the Linux kernel, which is unrelated to memory allocation by malloc() within a userspace process. That kind of free list probably works differently.

回答2:

The first thing you want to do is differentiate between kernel and program allocation. Just like @Wyzard said, malloc uses brk (sbrk and sometimes mmap) to get more pages from the kernel . My understanding of malloc is nto very good, but it tracks what you might call an arena. It mediates access to memory and will make the proper system calls to allocate memory as necessary.

A free list is one way to manage memory. Calling mmap or brk every time you need more memory from the kernel is slow and inefficient. Both of these will require a context switch into kernel mode and will allocate data structures to track what process owns memory. A free list at user level is an optimization to not constantly ask for and return memory to the kernel. The goal of a user program is to do its work, not wait for the kernel to do its work.

Now to answer your other questions:

How is it allocated, when it is in a free-list?

Another way to think of a free-list is as a cache. A program will make requests and the kernel will try to fulfill them on-demand rather than all at once. However, when a program is done with a piece of memory, the fast path is not to return this to the kernel, but put it somewhere safe to be allocated again. In particular, a free-list can track the memory regions that an allocator can pull from, but do so in such a way (with memory protection) that the user code can't just co-opt it and start writing over all of it.

And how is returning a block of memory to free list not the same as deallocating that block?

If we assume that truly deallocating a block requires returning this to the kernel and having it update its internal page tables, then the difference really lies in what has control over the underlying physical page (or frame). Actually deallocating and returning the memory to the kernel means now the kernel can pull from these pages, whereas returning it to a user level free-list means some part of the program still controls that memory.

How is slab allocation different from storage allocation?

This is starting to get some a different area (that I'm not terribly familiar with). The slab allocator is one way the kernel manages memory allocation. From what I've seen, the slab tries to group allocations into different sizes and has a pool of pages for satisfying those requests. I believe common x86 architectures will allow allocations of continuous, physical memory in powers of two from 16 bytes to 4 MB (though I'm on a 64-bit machine). I believe there is some concept of a free-list at this level, as well as ways to efficient upgrade or downgrade allocation size to allow for varying system needs.

On the other hand, storage allocation sounds like making sure there is space on your hard disk. I actually haven't heard the term so I can only speculate.

回答3:

Here you are referring to two different allocators,

Buddy system allocator, used for allocating pages to Zone, uses the free_list to store free pages, allocate them , after freeing , if possible combine them back to one bigger contiguous page size of higher order.
Slab allocator which works on a already allocated, data structures like keme_cache,kmalloc calls.

refer Heap Memory in C Programming for heap.

For a c program in user space, we have heap memory in the call_stack where the malloc takes place. This is marked by _break which is advanced by sbrk() when more memory is needed.

In Linux kernel each process has task_struct which has its own stack and pointer to the list of pages used by it.

来源：https://stackoverflow.com/questions/10116033/heap-memory-and-slab-allocation

标签

Linux

memory-management

malloc

heap