Giampaolo Rodola: Detect memory leaks of C extensions with psutil and psleak

Memory leaks in Python are often straightforward to diagnose. Just look at RSS, track Python object counts, follow reference graphs. But leaks inside C extension modules are another story. Traditional memory metrics such as RSS and VMS frequently fail to reveal them because Python's memory allocator sits above the platform's native heap (see pymalloc). If something in an extension calls malloc() without a corresponding free(), that memory often won't show up where you expect it. You have a leak, and you don't know.

psutil 7.2.0 introduces two new APIs for C heap introspection, designed specifically to catch these kinds of native leaks. They give you a window directly into the underlying platform allocator (e.g. glibc's malloc), letting you track how much memory the C layer is actually consuming.

These C functions bypass Python entirely. They don't reflect Python object memory, arenas, pools, or anything managed by pymalloc. Instead, they examine the allocator that C extensions actually use. If your RSS is flat but your C heap usage climbs, you now have a way to see it.

Why native heap introspection matters

Many Python projects rely on C extensions: psutil, NumPy, pandas, PIL, lxml, psycopg, PyTorch, custom in-house modules, and so on. Even CPython itself implements many of its standard library modules in C. If any of these components mishandles memory at the C level, you get a leak that:

  • Doesn't show up in Python reference counts (sys.getrefcount).
  • Doesn't show up in the tracemalloc module.
  • Doesn't show up in Python's gc stats.
  • Often doesn't show up in RSS, VMS or USS due to allocator caching, especially for small objects. This can happen, for example, when you forget to Py_DECREF a Python object.

psutil's new functions solve this by inspecting platform-native allocator state, in a manner similar to Valgrind.

heap_info(): direct allocator statistics

heap_info() exposes the following metrics:

  • heap_used: total number of bytes currently allocated via malloc() (small allocations).
  • mmap_used: total number of bytes currently allocated via mmap() or via large malloc() allocations.
  • heap_count: (Windows only) number of private heaps created via HeapCreate().

Example:
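
A minimal sketch of such a call, assuming psutil >= 7.2.0 is installed and that the result exposes its fields as attributes. `native_heap_snapshot` is a hypothetical helper name, and the guard simply returns None where the API is missing:

```python
try:
    import psutil
except ImportError:  # psutil is a third-party package
    psutil = None


def native_heap_snapshot():
    """Return heap_info() fields as a dict, or None if the API is unavailable."""
    if psutil is None or not hasattr(psutil, "heap_info"):
        return None
    info = psutil.heap_info()
    snapshot = {"heap_used": info.heap_used, "mmap_used": info.mmap_used}
    # heap_count is described above as Windows only, so read it defensively.
    heap_count = getattr(info, "heap_count", None)
    if heap_count is not None:
        snapshot["heap_count"] = heap_count
    return snapshot


print(native_heap_snapshot())
```

The printed numbers depend on the platform, the allocator and whatever the interpreter has already allocated, so only differences between two snapshots are meaningful.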

Reference for what contributes to each field:

heap_trim(): returning unused heap memory

heap_trim() provides a cross-platform way to request that the underlying allocator free any unused memory it's holding in the heap (typically small malloc() allocations).

In practice, modern allocators rarely comply, so this is not a general-purpose memory-reduction tool and won't meaningfully shrink RSS in real programs. Its primary value is in leak detection tools.

Calling heap_trim() before taking measurements helps reduce allocator noise, giving you a cleaner baseline so that changes in heap_used come from the code you're testing, not from internal allocator caching or fragmentation.
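
As a rough illustration of that ordering (collect Python garbage, trim, then measure), here is a small sketch. It assumes psutil >= 7.2.0; `trimmed_heap_used` is a hypothetical helper name, and running `gc.collect()` first is my own addition rather than something the APIs require:

```python
import gc

try:
    import psutil
except ImportError:  # psutil is a third-party package
    psutil = None


def trimmed_heap_used():
    """Trim the native heap, then return heap_used (None if unsupported)."""
    if psutil is None:
        return None
    if not (hasattr(psutil, "heap_trim") and hasattr(psutil, "heap_info")):
        return None
    gc.collect()        # drop unreachable Python objects first (assumption)
    psutil.heap_trim()  # ask the allocator to release unused heap memory
    return psutil.heap_info().heap_used


baseline = trimmed_heap_used()
print("baseline heap_used:", baseline)
```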

Real-world use: finding a C extension leak

The workflow is simple:

  1. Take a baseline snapshot of the heap.
  2. Call the C extension hundreds of times.
  3. Take another snapshot.
  4. Compare.

If heap_used or mmap_used values increase consistently, you've found a native leak.

To reduce false positives, repeat the test multiple times, increasing the number of calls on each retry. This approach helps distinguish real leaks from random noise or transient allocations.
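
The workflow and the retry idea can be sketched in a few lines. This is an illustrative simplification, not psleak's actual implementation: the function names, the default numbers and the choice to sum heap_used and mmap_used into one figure are all assumptions. The snapshot function is pluggable so the logic can be exercised without psutil:

```python
import gc


def default_snapshot():
    """Native heap usage in bytes via psutil (assumes psutil >= 7.2.0)."""
    import psutil  # imported lazily so the sketch loads without psutil

    gc.collect()
    psutil.heap_trim()  # reduce allocator noise before measuring
    info = psutil.heap_info()
    return info.heap_used + info.mmap_used


def looks_like_leak(fun, snapshot=default_snapshot, calls=200, retries=5,
                    tolerance=0):
    """Return True if memory grows on every retry as the call count rises."""
    for attempt in range(1, retries + 1):
        before = snapshot()
        for _ in range(calls * attempt):  # more calls on each retry
            fun()
        after = snapshot()
        if after - before <= tolerance:
            # A single stable run rules out consistent growth.
            return False
    return True
```

A hypothetical usage would be `looks_like_leak(my_ext.some_function)`, where `my_ext` stands in for the extension under test. A non-zero `tolerance` (in bytes) may be needed on noisy allocators.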

A new tool: psleak

The strategy described above is exactly what I implemented in a new PyPI package, which I called psleak. It runs the target function repeatedly, trims the allocator before each run, and tracks differences across retries. Memory that grows consistently after several runs is flagged as a leak.

A minimal test suite looks like this:

If the function leaks memory, the test will fail with a descriptive exception:

psleak is now part of the psutil test suite, to make sure that the C code does not leak memory. All psutil APIs are tested (see test_memleaks.py), making it a de facto regression-testing tool.

It's worth noting that without inspecting heap metrics, missing calls such as Py_CLEAR and Py_DECREF often go unnoticed, because they don't affect RSS, VMS, or USS. I confirmed this experimentally by commenting them out. Monitoring the heap is therefore essential to reliably detect memory leaks in Python C extensions.

Under the hood

For those interested in how this is implemented:

  • Linux: uses glibc's mallinfo2() to report uordblks (heap allocations) and hblkhd (mmap-backed blocks).
  • Windows: enumerates heaps and aggregates HeapAlloc/VirtualAlloc usage.
  • macOS: uses malloc zone statistics.
  • BSD: uses jemalloc's arena and stats interfaces.
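
To make the Linux case concrete, the same two glibc counters can be read from Python with ctypes. This is only an illustration of what mallinfo2() reports, not psutil's code (psutil does this in C). The helper name is hypothetical, and the function returns None on non-Linux systems or where the C library does not export mallinfo2():

```python
import ctypes
import ctypes.util
import sys


class Mallinfo2(ctypes.Structure):
    # Field order follows `struct mallinfo2` in glibc's <malloc.h>.
    _fields_ = [
        (name, ctypes.c_size_t)
        for name in (
            "arena", "ordblks", "smblks", "hblks", "hblkhd",
            "usmblks", "fsmblks", "uordblks", "fordblks", "keepcost",
        )
    ]


def glibc_heap_stats():
    """Return (uordblks, hblkhd) from mallinfo2(), or None if unavailable."""
    if not sys.platform.startswith("linux"):
        return None
    try:
        libc = ctypes.CDLL(ctypes.util.find_library("c") or "libc.so.6")
        mallinfo2 = libc.mallinfo2  # missing on old glibc and on musl
    except (OSError, AttributeError):
        return None
    mallinfo2.restype = Mallinfo2
    mallinfo2.argtypes = []
    info = mallinfo2()
    return info.uordblks, info.hblkhd


print(glibc_heap_stats())
```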

Summary

psutil 7.2.0 fills a long-standing observability gap: native-level memory leaks in C extensions are now visible directly from Python. You now have a simple method to test C extensions for leaks. This turns psutil into not just a monitoring library, but a practical debugging tool for Python projects that rely on native C extension modules.

To make leak detection practical, I created psleak, a test-regression framework designed to integrate into Python unit tests.

https://gmpy.dev/blog/2025/psutil-heap-introspection-apis