Incremental gc is being reverted for 3.14, to be replaced with the older 3-gen collector, and the same will be true of 3.15:
Inc gc seemed to be primarily aimed at reducing long “stop the world” pauses, and has had some success at that. But it came at the cost of sometimes greatly increasing memory use, as trash cycles pile up waiting for the incremental collector to get to them.
I wrote a toy app to explore the limits (link to a public gist):
Toy cyclic gc code · GitHub
Don’t be put off by “toy app”. The difference between a toy and a real app in this kind of context is that the former can display pathological behaviors much faster, and without dozens of confounding factors muddying the picture.
The app depends on psutil, but not on Python 3 - it sticks to Python 2 features.
The high-order bit here is on the behavior of “gen0” collections. Under the old 3-gen collector, a gen0 collection collects the cycles created, and which became trash, since the last time gen0 ran. It’s aimed at reclaiming short-lived cyclic trash, which is common in many apps (if they create cycles at all).
The app is an infinite loop that creates a new cycle on each iteration, and after iteration 1000 also converts one to trash on each. So neither truly short- nor long-lived. There are never more than 1000 reachable cycles.
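For anyone who doesn't want to open the gist, the loop just described can be sketched roughly like this. This is not the gist's code: `Cycle`, `run`, `LIVE_LIMIT` and the `final_collect` flag are names made up for illustration, the loop is bounded instead of infinite, and it counts reclaimed cycles with a `__del__` finalizer instead of watching memory through psutil.

```python
import gc
from collections import deque

LIVE_LIMIT = 1000  # from the post: never more than 1000 reachable cycles


class Cycle(object):
    """A one-object reference cycle (hypothetical stand-in for the gist's cycles)."""

    def __init__(self, freed):
        self.me = self      # self-reference: only cyclic gc can reclaim this
        self.freed = freed  # shared one-element list used as a counter

    def __del__(self):
        self.freed[0] += 1  # runs when the collector reclaims the cycle


def run(iterations, final_collect=False):
    """Create one cycle per iteration; past LIVE_LIMIT, drop the oldest one.

    Returns (created, live, pending), where pending is the number of
    trash cycles that no collection has reclaimed yet.
    """
    freed = [0]
    live = deque()
    created = 0
    for created in range(1, iterations + 1):
        live.append(Cycle(freed))
        if created > LIVE_LIMIT:
            live.popleft()  # the oldest cycle becomes unreachable trash
    if final_collect:
        gc.collect()  # a full collection reclaims every trash cycle
    pending = created - len(live) - freed[0]
    return created, len(live), pending
```

Running `run(20000)` under different Python versions and comparing the `pending` value is one cheap way to see how much trash is still waiting at that point.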
Under Py 3.13 (older gc), it works as I expect. The gen0 threshold is 2000 (same as under 3.14), and around iteration 2000 a gen0 collection occurs. It reclaims 1000 trash cycles, and moves 1000 to gen1. Same story at iterations 4000, 6000, 8000, …: each time it cleans up half the trash created along the way, quickly and smoothly. The other trash is awaiting a gen1 collection.
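The arithmetic behind "reclaims 1000, moves 1000 to gen1" can be checked with a small model. This is only an idealised sketch, not CPython's collector: it assumes a gen0 pass fires exactly when 2000 cycles have been created since the previous pass, it ignores gen1 and gen2 collections entirely, and `simulate_gen0` is a made-up name.

```python
def simulate_gen0(iterations, threshold=2000, live_limit=1000):
    """Idealised gen0 model; returns a list of (iteration, reclaimed, promoted)."""
    young = set()    # ids of cycles created since the last gen0 pass
    dropped = set()  # ids of trash cycles that have not been reclaimed yet
    events = []
    for i in range(1, iterations + 1):
        young.add(i)                     # a new cycle is born in gen0
        if i > live_limit:
            dropped.add(i - live_limit)  # the oldest live cycle becomes trash
        if len(young) >= threshold:
            reclaimed = young & dropped  # young trash: gen0 can free it
            promoted = young - dropped   # young survivors: moved to gen1
            dropped -= reclaimed         # older trash stays, waiting for gen1
            events.append((i, len(reclaimed), len(promoted)))
            young = set()
    return events


for event in simulate_gen0(6000):
    print(event)
```

In this model, trash that was already promoted before it was dropped is out of gen0's reach, which is why only half of the trash created between passes gets reclaimed.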
The current (dev) docs for gc.set_threshold() seem to say that should still be the case (plus maybe more):
For each collection, all the objects in the young generation and some fraction of the old generation is collected
But that’s not what I’m seeing. Nothing is collected at 2000 iterations. And still not by 4000 iterations. Or 6000, 8000, … Nothing at all gets collected until about the 20 thousandth iteration. gc is invoked along the way (you can confirm by adding more output to the toy’s gc callback function), but it returns without collecting anything until iteration 20_000. Then it collects about 18_000 trash cycles.
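To see for yourself when gc runs and how much each run reclaims, a callback on `gc.callbacks` is enough. The sketch below is not the toy's own callback: `make_gc_logger` and `log` are hypothetical names, and it relies only on the documented `(phase, info)` callback signature with the `"generation"` and `"collected"` keys.

```python
import gc


def make_gc_logger(log):
    """Return a gc callback that appends (generation, collected) to log."""
    def on_gc(phase, info):
        if phase == "stop":  # called once a collection has finished
            log.append((info["generation"], info["collected"]))
    return on_gc


log = []
callback = make_gc_logger(log)
gc.callbacks.append(callback)
try:
    junk = []
    junk.append(junk)  # a list that refers to itself: a tiny cycle
    del junk           # now it is cyclic trash
    gc.collect()       # force a full collection so the callback fires
finally:
    gc.callbacks.remove(callback)

for generation, collected in log:
    print("generation %s collected %d objects" % (generation, collected))
```

Left installed for a whole run of the toy, a logger like this shows which invocations return with `collected` at 0.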
So it’s off to a bad start, which gets worse over time. But not without bound. It eventually (after about 750K iterations) reaches a “steadyish state”, always with over 90K trash cycles awaiting collection, but not more than 100K.
So that’s the first thing I’d think about: is there ever a “good reason” to skip gen0 performing its traditional function? I don’t understand the current “work to do” logic, and especially not how “the math” can end up making it negative(!) at times. But intuition says “work to do” should always include gen0.
