Question
I'm trying to track down a memory leak, so I've done
import tracemalloc
tracemalloc.start()
<function call>
# copy pasted this from documentation
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('lineno')
print("[ Top 10 ]")
for stat in top_stats[:10]:
    print(stat)
This shows no major allocations: every allocation it reports is pretty small, while ps and pmap show 8+ GB of memory allocated (checked before and after running the command, and after running garbage collection). Furthermore, tracemalloc.get_traced_memory confirms that tracemalloc is not seeing many allocations. pympler does not see the allocations either.
Does anyone know when this could be the case? Some modules are using cython, could this cause issues for tracemalloc?
In pmap the allocation looks like:
0000000002183000 6492008 6491876 6491876 rw--- [ anon ]
Answer 1:
From the documentation on tracemalloc:
The tracemalloc module is a debug tool to trace memory blocks allocated by Python.
In other words, memory not allocated by the Python interpreter is not seen by tracemalloc. This includes anything that does not go through Python's own allocator functions (PyMem_Malloc, PyObject_Malloc and friends) at the C-API level: all standard libc malloc calls made by native libraries used via extensions, and extension code that calls malloc directly.
Whether that is the case here is impossible to tell for certain without code to reproduce. You can try running the native code part outside of python through, for example, valgrind, to detect memory leaks in the native code.
If there is Cython code calling malloc, it could be switched to PyMem_Malloc so that the allocations are traced.
Answer 2:
An addition to @danny's answer, because it is too long for a comment.
As explained in PEP 454, tracemalloc uses the functionality introduced in PEP 445 to track memory allocations.
Normally, one would have to use PyMem_RawMalloc instead of malloc in order to be able to use tracemalloc for a C extension. However, for quite some time now (public C API since Python 3.7) it has also been possible to call PyTraceMalloc_Track and PyTraceMalloc_Untrack from pymem.h in addition to malloc (instead of replacing malloc with PyMem_RawMalloc).
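To see the effect of PyTraceMalloc_Track without writing a C extension, the function can be called through ctypes.pythonapi. This is only a rough sketch under several assumptions: CPython 3.7 or later that exports the symbol, a POSIX system where ctypes can load the C library, and a made-up domain id (DOMAIN is hypothetical, any unsigned integer an extension picks for itself would do). A real extension would make these calls in C.

```python
import ctypes
import ctypes.util
import tracemalloc

SIZE = 10 * 1024 * 1024  # 10 MiB, arbitrary
DOMAIN = 12345           # hypothetical domain id chosen for this sketch

libc = ctypes.CDLL(ctypes.util.find_library("c"))
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.free.restype = None
libc.free.argtypes = [ctypes.c_void_p]

# int PyTraceMalloc_Track(unsigned int domain, uintptr_t ptr, size_t size)
track = ctypes.pythonapi.PyTraceMalloc_Track
track.restype = ctypes.c_int
track.argtypes = [ctypes.c_uint, ctypes.c_void_p, ctypes.c_size_t]

# int PyTraceMalloc_Untrack(unsigned int domain, uintptr_t ptr)
untrack = ctypes.pythonapi.PyTraceMalloc_Untrack
untrack.restype = ctypes.c_int
untrack.argtypes = [ctypes.c_uint, ctypes.c_void_p]

tracemalloc.start()
base = tracemalloc.get_traced_memory()[0]

ptr = libc.malloc(SIZE)                  # invisible to tracemalloc on its own
track_status = track(DOMAIN, ptr, SIZE)  # report the block manually
tracked_delta = tracemalloc.get_traced_memory()[0] - base

untrack(DOMAIN, ptr)                     # must be paired with the free
libc.free(ptr)
untracked_delta = tracemalloc.get_traced_memory()[0] - base

print(f"status={track_status} tracked={tracked_delta} after_free={untracked_delta}")
tracemalloc.stop()
```

A status of 0 means the block was recorded; the traced total grows by the reported size and drops back once the block is untracked.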
numpy, for example, pairs malloc with PyTraceMalloc_Track. In order to be able to wrap raw C pointers and take over their ownership, numpy uses malloc rather than the Python allocator, which is optimized for small objects (not the most crucial scenario for numpy), as can be seen here:
/*NUMPY_API
 * Allocates memory for array data.
 */
NPY_NO_EXPORT void *
PyDataMem_NEW(size_t size)
{
    void *result;

    result = malloc(size);
    if (_PyDataMem_eventhook != NULL) {
        NPY_ALLOW_C_API_DEF
        NPY_ALLOW_C_API
        if (_PyDataMem_eventhook != NULL) {
            (*_PyDataMem_eventhook)(NULL, result, size,
                                    _PyDataMem_eventhook_user_data);
        }
        NPY_DISABLE_C_API
    }
    PyTraceMalloc_Track(NPY_TRACE_DOMAIN, (npy_uintp)result, size);
    return result;
}
So basically, it is the responsibility of the C extension to report its memory allocations to the tracemalloc module, which in turn means that tracemalloc cannot really be trusted to register all memory allocations.
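One way to check whether this matters in practice is to compare what tracemalloc has traced with what the operating system reports for the process. The sketch below assumes a Unix system (the resource module is not available on Windows); the list of bytes objects is only a stand-in for the real workload, and peak_rss_bytes is a helper name made up for this example.

```python
import resource
import sys
import tracemalloc


def peak_rss_bytes():
    """Return the peak resident set size of this process in bytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    return peak if sys.platform == "darwin" else peak * 1024


tracemalloc.start()

# Stand-in for the real workload (the <function call> from the question).
data = [bytes(1000) for _ in range(10_000)]

traced_now, traced_peak = tracemalloc.get_traced_memory()
rss_peak = peak_rss_bytes()
untraced_gap = rss_peak - traced_peak

print(f"peak traced by tracemalloc: {traced_peak / 2**20:.1f} MiB")
print(f"peak RSS of the process:    {rss_peak / 2**20:.1f} MiB")
print(f"not explained by tracing:   {untraced_gap / 2**20:.1f} MiB")

tracemalloc.stop()
```

Some gap is always expected, since RSS also covers the interpreter itself, shared libraries and allocator overhead. A gap of several gigabytes, as in the question, would point towards native allocations that are never reported to tracemalloc.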
Source: https://stackoverflow.com/questions/50148554/when-would-the-python-tracemalloc-module-allocations-statistics-not-match-whats