How can I allocate memory on Linux without overcommitting, so that malloc actually returns NULL
if no memory is available and the process doesn\'t randomly cras
How can I allocate memory on Linux without overcommitting
That is a loaded question, or at least an incorrect one. The question is based on an incorrect assumption, which makes answering the stated question irrelevant at best, misleading at worst.
Memory overcommitment is a system-wide policy -- because it determines how much virtual memory is made available to processes --, and not something a process can decide for itself.
It is up to the system administrator to determine whether memory is overcommitted or not. In Linux, the policy is quite tunable (see e.g. /proc/sys/vm/overcommit_memory
in man 5 proc. There is nothing a process can do during
allocation that would affect the memory overcommit policy.
OP also seems interested in making their processes immune to the out-of-memory killer (OOM killer) in Linux. (OOM killer in Linux is a technique used to relieve memory pressure, by killing processes, and thus releasing their resources back to the system.)
This too is an incorrect approach, because the OOM killer is a heuristic process, whose purpose is not to "punish or kill badly behaving processes", but to keep the system operational. This facility is also quite tunable in Linux, and the system admin can even tune the likelihood of each process being killed in high memory pressure situations. Other than the amount of memory used by a process, it is not up to the process to affect whether the OOM killer will kill it during out-of-memory situations; it too is a policy issue managed by the system administrator, and not the processes themselves.
I assumed that the actual question the OP is trying to solve, is how to write Linux applications or services that can dynamically respond to memory pressure, other than just dying (due to SIGSEGV or by the OOM killer). The answer to this is you do not -- you let the system administrator worry about what is important to them, in the workload they have, instead --, unless your application or service is one that uses lots and lots of memory, and is therefore likely to unfairly killed during high memory pressure. (Especially if the dataset is sufficiently large to require enabling much larger amount of swap than would otherwise be enabled, causing a higher risk of a swap storm and late-but-too-strong OOM killer.)
The solution, or at least the approach that works, is to memory-lock the critical parts (or even the entire application/service, if it works on sensitive data that should not be swapped to disk), or to use a memory map with a dedicated backing file. (For the latter, here is an example I wrote in 2011, that manipulates a terabyte-sized data set.)
The OOM killer can still kill the process, and a SIGSEGV still occur (due to say an internal allocation by a library function that the kernel fails to provide RAM backing to), unless all of the application is locked to RAM, but at least the service/process is no longer unfairly targeted, just because it uses lots of memory.
It is possible to catch the SIGSEGV signal (that occurs when there is no memory available to back the virtual memory), but thus far I have not seen an use case that would warrant the code complexity and maintenance effort required.
In summary, the proper answer to the stated question is no, don't do that.
From the discussion in the comments, it appears calling
mlockall( MCL_CURRENT | MCL_FUTURE );
upon process start would satisfy the requirement for malloc()
to return NULL
when the system can not actually provide memory.
Per the Linux mlockall() man page:
mlockall() and munlockall()
mlockall() locks all pages mapped into the address space of the calling process. This includes the pages of the code, data and stack segment, as well as shared libraries, user space kernel data, shared memory, and memory-mapped files. All mapped pages are guaranteed to be resident in RAM when the call returns successfully; the pages are guaranteed to stay in RAM until later unlocked.
The flags argument is constructed as the bitwise OR of one or more of the following constants:
MCL_CURRENT Lock all pages which are currently mapped into the address space of the process. MCL_FUTURE Lock all pages which will become mapped into the address space of the process in the future. These could be, for instance, new pages required by a growing heap and stack as well as new memory-mapped files or shared memory regions. MCL_ONFAULT (since Linux 4.4) Used together with MCL_CURRENT, MCL_FUTURE, or both. Mark all current (with MCL_CURRENT) or future (with MCL_FUTURE) mappings to lock pages when they are faulted in. When used with MCL_CURRENT, all present pages are locked, but mlockall() will not fault in non-present pages. When used with MCL_FUTURE, all future mappings will be marked to lock pages when they are faulted in, but they will not be populated by the lock when the mapping is created. MCL_ONFAULT must be used with either MCL_CURRENT or MCL_FUTURE or both.
If MCL_FUTURE has been specified, then a later system call (e.g., mmap(2), sbrk(2), malloc(3)), may fail if it would cause the number of locked bytes to exceed the permitted maximum (see below). In the same circumstances, stack growth may likewise fail: the kernel will deny stack expansion and deliver a SIGSEGV signal to the process.
Note that using mlockall()
in this manner might have other, unexpected consequences. Linux has been developed assuming memory overcommit is available, so something as simple as calling fork()
after mlockall()
might run into issues.