Optimizing Python memory allocation fragmentation

yylogo · November 18, 2024, 2:52pm

In our project, a freshly started Python process occupies about 270 MB of memory; after running continuously for three months, it occupies more than 800 MB.
After analysis, we found that a small amount of long-lived (old generation) memory was keeping entire arenas occupied, so those arenas could not be released.
I was wondering: could ideas borrowed from PGO and generational GC solve this problem?

The idea is to classify arenas by how often their memory is released, or by some other criterion.
Memory that is always reclaimed in the young generation would be allocated from one arena, and resident memory that always ends up in the old generation would be placed together in another arena.

Overall, it is a bit like adaptive specialization: use what was observed in earlier operations to optimize how later allocations are made.
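To make the proposal concrete, here is a toy simulation (not CPython's real allocator). It assumes a hypothetical arena that holds `ARENA_SLOTS` equally sized objects, and an oracle that already knows which objects are long-lived. It counts how many arenas are still pinned once every short-lived object has been freed, with and without segregation by lifetime.

```python
import random

ARENA_SLOTS = 64  # hypothetical: objects per arena in this toy model


def live_arenas(num_objects, long_lived_ratio, segregate, seed=0):
    """Count arenas that still hold an object after all short-lived ones die."""
    rng = random.Random(seed)
    # Decide each object's lifetime class up front (the "oracle").
    is_long_lived = [rng.random() < long_lived_ratio for _ in range(num_objects)]

    if segregate:
        # Long-lived objects are packed together, so only
        # ceil(long_count / ARENA_SLOTS) arenas stay alive.
        long_count = sum(is_long_lived)
        return -(-long_count // ARENA_SLOTS)

    # Mixed placement: objects fill arenas in allocation order, and any
    # arena containing at least one long-lived object stays pinned.
    pinned = set()
    for index, long_lived in enumerate(is_long_lived):
        if long_lived:
            pinned.add(index // ARENA_SLOTS)
    return len(pinned)


mixed = live_arenas(64_000, 0.02, segregate=False)
packed = live_arenas(64_000, 0.02, segregate=True)
print(f"mixed placement keeps {mixed} arenas, segregated keeps {packed}")
```

The gap between the two numbers is the fragmentation the proposal is trying to remove. The catch is that a real allocator has no oracle and has to guess the lifetime at allocation time.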

MegaIng · November 18, 2024, 3:51pm

I don’t believe that generational GC can be implemented without copying the objects, which is not possible in the current memory model of CPython.

The issue is that unless you can guarantee that no long-lived object ever lands in a short-lived arena, it won’t help with your issue. Over long enough time frames, some objects will keep the arena alive (if we can’t copy objects out, which we can’t).
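A rough back-of-the-envelope model of why a guarantee matters: assume each object placed in a "short-lived" arena independently turns out to be long-lived with some small probability `miss_rate`. The arena stays pinned if at least one of its objects was misplaced. The independence assumption and the numbers below are illustrative only, not measurements of CPython.

```python
def pinned_fraction(miss_rate, objects_per_arena):
    """Probability that an arena holds at least one misplaced long-lived object.

    Assumes misplacements are independent, which is a simplification.
    """
    return 1.0 - (1.0 - miss_rate) ** objects_per_arena


# Hypothetical numbers: 1 object in 1000 is misclassified, and an arena
# holds 4096 small objects.
print(f"{pinned_fraction(0.001, 4096):.2f}")
```

With these assumed numbers the result is about 0.98, so nearly every arena ends up pinned even though the lifetime prediction is right 99.9% of the time.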

You could try PyPy, which IIRC has a generational GC. Alternatively, restart your Python process regularly.

I think that the Faster CPython project has generational GC somewhere on the radar, but it requires quite foundational changes and it might take a few years for this to even be started.

barry-scott · November 18, 2024, 6:22pm

If your application allows it, restart the process when it is using more memory than you are happy with. If your application is structured to use multiple processes, then restarting the biggest one will not cause any downtime.
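A minimal sketch of such a check, assuming a Unix system where the standard `resource` module is available. Note that `ru_maxrss` is the peak resident set size rather than the current one, and it is reported in kilobytes on Linux but in bytes on macOS. The 500 MB limit and the function names are made up for illustration.

```python
import resource
import sys

LIMIT_BYTES = 500 * 1024 * 1024  # hypothetical threshold


def peak_rss_bytes():
    """Peak resident set size of this process, in bytes (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Linux reports kilobytes, macOS reports bytes.
    return peak if sys.platform == "darwin" else peak * 1024


def should_restart(limit_bytes=LIMIT_BYTES, current_bytes=None):
    """Return True once memory use exceeds the limit."""
    if current_bytes is None:
        current_bytes = peak_rss_bytes()
    return current_bytes > limit_bytes


if should_restart():
    print("over the limit: finish in-flight work, then exit and let the supervisor respawn")
```

How the restart itself happens depends on the deployment (systemd, a process supervisor, a parent process). For worker pools, `multiprocessing.Pool(maxtasksperchild=N)` recycles each worker after N tasks, which gives a similar effect without measuring memory.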

yylogo · November 19, 2024, 2:47am

Yes, my idea is to combine this with generational GC. I want to implement this feature myself, and I want to ask whether the idea is worth writing a new PEP.
My main goal is to reduce memory fragmentation. Similar to adaptive specialization, it would use experience from previous runs to replace bytecodes such as BuildMap with specialized variants such as BuildMap_OldGeneration.
The core idea is to choose the arena in which memory is allocated based on previous experience, thereby reducing fragmentation. This feature is different from generational GC: generational GC solves the problem of GC being too slow, whereas I want to solve the problem of old-generation memory occupying an arena.

From the moment of allocation, objects that are likely to end up in the old generation would be placed in the same arena.
Is this idea worth a new PEP?
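A toy sketch of the "learn from previous experience" part, written in plain Python rather than inside the interpreter. It keeps per-allocation-site survival statistics and picks a pool for the next allocation from that site. The class name, the thresholds, and the site labels are all hypothetical; nothing like this exists in CPython today.

```python
from collections import defaultdict


class SiteLifetimeProfile:
    """Toy per-allocation-site survival statistics (hypothetical)."""

    def __init__(self, old_threshold=0.5, min_samples=16):
        self.old_threshold = old_threshold
        self.min_samples = min_samples
        self.allocated = defaultdict(int)
        self.survived = defaultdict(int)

    def record(self, site, survived_young_collection):
        """Record whether an object allocated at `site` outlived a young collection."""
        self.allocated[site] += 1
        if survived_young_collection:
            self.survived[site] += 1

    def pool_for(self, site):
        """Pick the arena pool for the next allocation from `site`."""
        total = self.allocated.get(site, 0)
        if total < self.min_samples:
            return "young"  # not enough evidence yet, use the default pool
        ratio = self.survived.get(site, 0) / total
        return "old" if ratio >= self.old_threshold else "young"


profile = SiteLifetimeProfile()
for _ in range(100):
    profile.record("config.py:12", True)     # objects that stay resident
    profile.record("handler.py:40", False)   # per-request temporaries

print(profile.pool_for("config.py:12"), profile.pool_for("handler.py:40"))
```

A specialized bytecode such as BuildMap_OldGeneration would amount to baking the "old" answer into the call site once the statistics are stable. Any misprediction still leaves a long-lived object in a young arena, which is where the earlier objection applies.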

Alex-Wasowicz · November 19, 2024, 11:59am

I don’t think a PEP is the correct route; this is more of an implementation detail. It would be nice to have an official document describing the GC, but as far as a new feature goes, any Python implementation should already have automatic memory management. (And asking the user to relocate objects manually is not automatic.)

That being said, if you wanted to (just for example) merge the GCs of multiple processes into a single GC, then that would need a PEP (in my opinion).