Enhance ref-trace facilities

The current main branch does not compile when configured with --with-trace-refs.

During compilation, the compiler warns that the definition of PyObject is inconsistent across translation units (in the code that defines immortal objects).


This is one sign that the ref-trace code has not been maintained or tested for a long time.

As a result, the only debugging tool we currently have is sys.gettotalrefcount() in debug builds, which is far from enough. Memory leaks are common and hard to catch. In most cases, we don’t even know what is going on.
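To illustrate why a single total is not enough, here is a minimal sketch of how sys.gettotalrefcount() is typically used to spot a leak. The helper name total_refcount_delta and the warm-up call are my own assumptions, not part of CPython; the function returns None on non-debug builds, where sys.gettotalrefcount() does not exist.

```python
import sys


def total_refcount_delta(func, repeats=100):
    """Return how much the interpreter-wide ref count total changed over
    `repeats` calls of `func`, or None if the total is unavailable.

    Hypothetical helper for illustration only.
    """
    get_total = getattr(sys, "gettotalrefcount", None)
    if get_total is None:
        return None  # not a debug build: the counter does not exist

    func()  # warm-up call so one-time caches are not counted as a leak
    before = get_total()
    for _ in range(repeats):
        func()
    return get_total() - before


delta = total_refcount_delta(lambda: [1, 2, 3])
print("total ref count change:", delta)
```

A non-zero result says that something leaked, but not which object or which type, which is the gap this proposal is about.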


Even with this fixed, the ref trace can only report leaked references after the program has shut down, not while it is running. It therefore cannot be used as part of an integrated test.


I propose a new ref tracing facility that monitors every ref count change for each object.

Main facility

We will record something like a map
PyObject* → ref_count

Reused addresses are not a problem: the old object's ref count drops to 0 before the new object's ref count starts increasing from 0. Leaked objects, on the other hand, are never freed, so their addresses are never reused, and we can inspect them with a C debugger.

We should make sure the sum of ref_count equals sys.gettotalrefcount().
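A toy Python model of this map may make the bookkeeping clearer. It is only a sketch: the class name RefMap, the integer addresses, and the choice to drop an entry once its count reaches 0 are assumptions made for illustration, and the real facility would live in C inside the interpreter.

```python
class RefMap:
    """Toy model of the proposed map: object address -> current ref count."""

    def __init__(self):
        self.counts = {}

    def incref(self, addr):
        # A reused address simply starts again from 0.
        self.counts[addr] = self.counts.get(addr, 0) + 1

    def decref(self, addr):
        remaining = self.counts[addr] - 1
        if remaining == 0:
            del self.counts[addr]  # object died; its address may be reused
        else:
            self.counts[addr] = remaining

    def total(self):
        # Should match sys.gettotalrefcount() in the real facility.
        return sum(self.counts.values())


refs = RefMap()
refs.incref(0x1000)
refs.incref(0x1000)
refs.decref(0x1000)
refs.decref(0x1000)  # count reached 0: the first object is gone
refs.incref(0x1000)  # a new object allocated at the same address
refs.incref(0x2000)  # never released: this entry stays in the map
print(refs.counts, refs.total())
```

The entry for 0x2000 never leaves the map, which is what makes a leaked object easy to find afterwards.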

API

When requested, we will take a snapshot of this map. Later, that snapshot can be retrieved and compared with the current map.

This does consume a lot of extra memory and computing time but is affordable with modern computers.
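The comparison step could look roughly like the following sketch. The function name diff_snapshots and the plain dict representation are assumptions for illustration; the split into "living" and "died" entries mirrors the wording of the demo output in this thread.

```python
def diff_snapshots(old, new):
    """Compare two {address: ref_count} snapshots.

    Returns (living, died):
      living: addr -> (delta, old_count, new_count) for addresses present
              in `new` whose count differs from `old`
      died:   addr -> negative old_count for addresses missing from `new`
    """
    living = {}
    for addr, count in new.items():
        before = old.get(addr, 0)
        if count != before:
            living[addr] = (count - before, before, count)
    died = {addr: -count for addr, count in old.items() if addr not in new}
    return living, died


checkpoint = {0x10: 2, 0x20: 1, 0x30: 4}  # snapshot taken earlier
current = {0x10: 3, 0x30: 4, 0x40: 1}     # state of the map now
living, died = diff_snapshots(checkpoint, current)
total_change = sum(d for d, _, _ in living.values()) + sum(died.values())
print(living, died, total_change)
```

One limitation of comparing by address alone: if an object dies and a different object is allocated at the same address between two snapshots, the diff reports it as a change to a living object.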

If anyone is interested, my demo (for x64 Linux) is available at GitHub - sunmy2019/cpython at memory-tracking-demo

Try compiling cb.cpp with C++20 as a shared library, then link it (along with libstdc++) into python.
g++ -g -O3 -shared -fPIC -std=c++20 cb.cpp -o libcb.so


With my demo, I tackled real-world problems: gh-101859: Add caching of `types.GenericAlias` objects by sobolevn · Pull Request #103541 · python/cpython · GitHub

The output looks like this (a comparison of two checkpoints):

ref change of living 0x55ba28fb3940 (NoneType): 1,  10302 -> 10303
ref change of living 0x55ba290dc900 (int): -7,  1000001649 -> 1000001642
ref change of living 0x55ba290dc920 (int): 1,  1000002330 -> 1000002331
ref change of living 0x55ba2ada4ee0 (_ctypes.PyCPointerType): 4,  0 -> 4
ref change of living 0x55ba2b8ba010 (_ctypes.PyCPointerType): 4,  0 -> 4
ref change of living 0x7fc23b514050 (weakref.ReferenceType): 1,  0 -> 1
ref change of living 0x7fc23b517bd0 (weakref.ReferenceType): 1,  0 -> 1
ref change of living 0x7fc23b552ff0 (tuple): 1,  0 -> 1
ref change of living 0x7fc23b577b60 (StgDict): 1,  0 -> 1
ref change of living 0x7fc23bbaca10 (getset_descriptor): 1,  0 -> 1
ref change of living 0x7fc23bdc3a70 (tuple): 1,  0 -> 1
ref change of living 0x7fc23bf4e8d0 (getset_descriptor): 1,  0 -> 1
...
ref change of died   0x7fc23bb8c230 (list): -1
ref change of died   0x7fc23c015f30 (str): -1
ref change of died   0x7fc23bd74770 (getset_descriptor): -1
ref change of died   0x7fc23bd74ad0 (getset_descriptor): -1
ref change of died   0x7fc23bd74dd0 (builtin_function_or_method): -1
ref change of died   0x7fc23b5516d0 (tuple): -1
ref change of died   0x55ba2b8bb920 (_ctypes.PyCPointerType): -4
ref change of died   0x7fc23bb81a20 (weakref.ReferenceType): -1
ref change of died   0x7fc23b4f63f0 (StgDict): -1
ref change of died   0x7fc23bc027b0 (weakref.ReferenceType): -1
ref change of died   0x7fc23b576e40 (StgDict): -1
ref change of died   0x7fc23bba6440 (tuple): -1
...
ref change of type NoneType: 1
ref change of type int: 6
ref change of type str: -1
total ref change 6
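For anyone who wants to post-process such a report, here is a small sketch that sums the per-object lines by type. The regular expression is inferred from the excerpt above and is an assumption about the demo's output format, not a documented interface.

```python
import re
from collections import Counter

# Matches the per-object lines, e.g.
#   ref change of living 0x55ba290dc900 (int): -7,  1000001649 -> 1000001642
#   ref change of died   0x7fc23bb8c230 (list): -1
OBJECT_LINE = re.compile(
    r"ref change of (?:living|died)\s+0x[0-9a-f]+ \((?P<type>[^)]+)\): (?P<delta>-?\d+)"
)


def per_type_totals(lines):
    """Sum the ref count deltas of the per-object lines, grouped by type."""
    totals = Counter()
    for line in lines:
        match = OBJECT_LINE.match(line)
        if match:  # summary lines and "..." are skipped
            totals[match["type"]] += int(match["delta"])
    return totals


sample = [
    "ref change of living 0x55ba290dc900 (int): -7,  1000001649 -> 1000001642",
    "ref change of living 0x55ba290dc920 (int): 1,  1000002330 -> 1000002331",
    "ref change of died   0x7fc23bb8c230 (list): -1",
    "ref change of type int: 6",
    "...",
]
totals = per_type_totals(sample)
print(dict(totals))
```

Because the excerpt above is abbreviated with "...", totals computed from it will not match the per-type summary lines of the full report.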

This would be very helpful to have in main, thanks for adding it! I would encourage you to submit a PR.

:joy: Demo updated with a detailed build guide for Linux x64 now.

The demo currently relies on some hacks in the build system. If this is intended for main, I need help from someone who understands how to modify the autoconf files.