Reverting the incremental GC in Python 3.14 and 3.15

As requested by @hugovk, I started a new topic yesterday to discuss ideas for rehabilitating inc gc:

He wants to keep this topic focused on 3.14/3.15. So please continue this in the new topic (or create your own).

@nas already posted a patch there that does that (plus some more).

It’s nevertheless a major improvement (slashing peak memory use) on the stress tests we have so far. A promising start :smiling_face:.

@nas’s original 3-gen design was focused on timely collection of shorter-lived cycles, and I’m confident that moving, as you suggest, back closer to that will pay off. It’s the gen2 collections that caused “long pauses”.

3 Likes

As requested by this topic’s author, and acknowledged by participants, please take development talk for improving the incremental GC to Improving incremental gc.

The reverts are done for 3.14 and 3.15, thanks to @sergey-miryanov and @nas for doing those, and initial test results are looking good, thanks to @zanie and @EWDurbin for helping with those.

Release plan:

  • 3.14.5rc1 on Saturday 2026-05-02 (new)
  • 3.15.0b1 on Tuesday 2026-05-05 (already planned)
  • 3.14.5 final on Friday 2026-05-08 (new)
13 Likes

Update: this release is on hold pending a signing issue for the Windows build. Will update here when fixed, which might be in a day or two.

3 Likes

A post was merged into an existing topic: Improving incremental gc

@mkowal Please continue development discussion in Improving incremental gc.

1 Like

Out now! Please test!

6 Likes

This would be a hard breakage of what patch releases mean. I have worked on Linux distributions, and one of the things you learn with a lot of pain is never to regress in a maintenance release. If a change is good for 90% of users and bad for 10%, you still can’t do it in a patch release. You need to approach 1000:1 or so, because that is what creates the trust that lets users follow your patch releases every time without judging each one individually, which only a few people have the interest and time for.
If you subscribe to my logic, we are in a bad place: we have a hard regression for many users in a release and can’t just fix it, because the fix would be a regression for a significant fraction of users. (Again, 10% is significant.)

Here is what I would do (as a Linux distributor):

  • If we can ship both GCs in 3.14.5 with a startup option, the 3.14.0 behavior stays the default! For 3.15, we could of course change the default (or revert to just one GC, if maintenance burden is a concern).
  • If we cannot ship both, we must stay with the 3.14 inc GC for the 3.14 series. In that case, I would consider an early 3.15 release. Very early, actually. Maybe 3.15 becomes what 3.14.5 would be with very few things on top (we may have a few no-brainers). This would be visibly admitting that we have made a blunder. So be it.
  • If none of that is workable: we’d need a massive information campaign, telling everyone that we have a 3.14.5 that is really a 3.15 … Linux distros will hate us, because they are caught between a rock and a hard place: breaking their users (by changing the GC) or sticking to 3.14.4 with self-backported patches.

Just my 0.02€.

4 Likes

That ship sailed three days ago: Python 3.14.5 was released on May 10.

Then it’s a good thing that Python predates SemVer and so we don’t follow it. :wink:

We have also been doing releases for decades and in coordination with Linux distros.

It’s a bugfix in a bugfix release, so I disagree it can’t be fixed.

I think you might be misinterpreting what the incremental GC did; it just shortened GC pauses at the apparent expense of worse memory pressure for some people. The generational GC is battle-tested, safe, and what people are used to. This isn’t like we removed a module from the stdlib or something. Most people won’t notice, and those that will asked for this change because of the problems they have run into with the incremental GC. There is no semantic change, just a perf shift which we make no guarantees about anyway.

We had an RC to make sure the change worked for people, it’s already gone out (as Stefan pointed out), and both RC and the final point release were published as widely as possible.

15 Likes

Well, let’s hope not too many people have come to rely on the new inc GC or adjusted to it then … The release notes fail to inform about possible downsides.

1 Like

If someone is somehow relying on the incremental GC I would be interested in finding out how they did that.

8 Likes

Try talking to people who write games in Python. Long gc pauses can kill their “frame rate”, and lead to jerky motion. That may be as close to apps that want “real time” guarantees as Python uses get.

I made a pitch before that the news “should have” spelled out that people who fell into relying on inc gc could find that their workarounds backfire in 3.14.5. These include things like setting threshold0 to a “very low” value to force much more frequent inc collections, and/or calling gc.collect() to force full collections at times the app doesn’t much care. Those are effective workarounds people stumbled into when fighting OOM regressions due to inc gc.
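For illustration, the two workarounds mentioned above can be sketched with the standard `gc` module. The threshold value of 100 and the `on_idle_moment` hook are made up for this sketch and are not taken from any real app.

```python
import gc

# Save the current (threshold0, threshold1, threshold2) so it can be restored.
original = gc.get_threshold()

# Workaround 1: a "very low" threshold0, so collections are triggered
# after far fewer container allocations. 100 is an arbitrary value.
gc.set_threshold(100, original[1], original[2])
lowered = gc.get_threshold()

# Workaround 2: force a full collection at a time the app doesn't much
# care about a pause. `on_idle_moment` is a hypothetical hook name.
def on_idle_moment():
    return gc.collect()  # returns the number of unreachable objects found

found = on_idle_moment()

# Put the original thresholds back.
gc.set_threshold(*original)
```

Settings like these were tuned against the incremental collector, so they are worth re-measuring after upgrading to 3.14.5.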

Alas, just plain disabling cyclic gc at the start could make things even worse. Cycles abound in, e.g., “frameworks”, graph applications, and GUI systems.

All changes have downsides too, but we followed the tradition here of only noting the upsides of the change in our top-level news.

Which is, alas, just as true of introducing inc gc to begin with. We guarantee nothing about gc behavior except that objects won’t be deleted until they’re unreachable. It’s all “quality of implementation” stuff - pragmatics. Whether a person calls inc gc or old 3-gen gc “a bug” has no dispositive basis. Each works better for some apps, but worse for others. So do you want jerky animation, or run out of RAM? The answer for me is obvious, but because I don’t do animations in Python :wink:. My own apps don’t much care either way.

11 Likes

I would like to follow up on Tim Peters’ comment.

At my company we use Python for game backend development, specifically as a scripting language for a server metagame engine, i.e. handling logins, game accounts, matchmaking, arenas, etc.

Due to our specifics, we have a number of Python objects that live for a long time (hours, days, weeks). We don’t operate in an HTTP-server-like request-response manner, but rather we have actors that operate inside the cluster, and these actors are long-lived (e.g. actor for a connected player, actor for a running arena, etc.).

These long-lived actors affect the Python garbage collector, and in return the garbage collector heavily affects the overall performance of our game cluster. Max GC pauses are so significant that they became the main factor in how we choose the number of entities we can process on a single node.

In our tests, incremental GC showed a significant performance improvement, which is why we were very saddened by the decision to revert it.

On the chart below you can see average GC pauses with incremental GC (green) and without it after the revert (yellow).

The max GC time comparison is even more extreme: with the old GC, pauses can reach up to 3 seconds, while with the new incremental GC they are unnoticeable.

(I am sorry I have to use an external site for image, I am not allowed to upload here)

With the old generational GC we can handle up to 10 000 actors on a single node; with the incremental GC we easily increased this number up to 30 000 actors. Also, we didn’t encounter any memory leaks in our scenarios.

We would really appreciate having some sort of switch to re-enable the incremental GC. If there is something we can help with, please let me know.

31 Likes

Thank you for the real-world feedback!

It sounds like you’re running on substantial dedicated machines, and available memory isn’t a problem. In that case, there’s scant downside to inc gc.

Because it can let dead cycles go unreclaimed for a longer time, they can pile up and increase max RAM use. That’s the downside to inc gc, which can kill programs dead in memory-constrained environments. So it would be interesting to know how max RAM use in your app is affected.

Also interesting to know whether you have significant trash in cycles. The easiest quick way to check is to look at the return value from a gc.collect() call. If it’s 0, running cyclic gc at all was a waste of time: it found nothing to collect.
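As a small sketch of that check, the snippet below builds one reference cycle on purpose and looks at what `gc.collect()` reports. The `Node` class is a made-up example; in a real app you would only make the `gc.collect()` call and look at its return value.

```python
import gc

class Node:
    """Hypothetical object that can point at another Node."""
    def __init__(self):
        self.other = None

gc.collect()  # clear out any older garbage so the next count is only ours

a, b = Node(), Node()
a.other, b.other = b, a  # a <-> b form a reference cycle
del a, b                 # now unreachable; refcounts alone can't free them

found = gc.collect()     # number of unreachable objects found
print("unreachable objects found:", found)
```

The exact count depends on how the interpreter stores instance attributes, so the useful signal is whether the value is zero or not.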

A number of my own apps create no cycles. I start them with gc.disable(), and so spend no time in cyclic gc (inc gc or the old way). Some of these have run for months with max RAM staying steady.
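A minimal sketch of why that works on CPython: with cyclic gc disabled, reference counting still frees acyclic objects immediately. The `Payload` class is a made-up stand-in, and the immediate-free behavior is a CPython implementation detail rather than a language guarantee.

```python
import gc
import weakref

class Payload:
    """Hypothetical acyclic object."""

gc.disable()                  # no automatic cyclic collections from here on
disabled = not gc.isenabled()

obj = Payload()
ref = weakref.ref(obj)        # lets us observe when obj is freed
del obj                       # last reference gone

# On CPython the object is freed right away by reference counting,
# even though cyclic gc is off.
freed_without_gc = ref() is None

gc.enable()                   # restore automatic collection for this demo
```

If the app does create cycles, this approach lets them accumulate until `gc.collect()` is called explicitly.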

Best guess from your graphs is that you have few trash cycles, but no way to know from here. It’s not trash cycles that slow gc, but the total number of container objects (objects that point to other objects). They all have to be crawled over to detect whether cyclic trash exists. Traversing the entire object graph multiple times is expensive, and more so the more objects and edges in the object graph.
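To make the point about container objects concrete, here is a rough sketch that counts gc-tracked objects with `gc.get_objects()` and times a full collection before and after creating many long-lived containers that are not trash. The object count of 200,000 is arbitrary, and timings will vary by machine and Python version.

```python
import gc
import time

def time_full_collection():
    """Run a full collection; return (objects found, seconds taken)."""
    start = time.perf_counter()
    found = gc.collect()
    return found, time.perf_counter() - start

gc.collect()                              # start from a clean slate
baseline_tracked = len(gc.get_objects())  # containers tracked right now
found_before, secs_before = time_full_collection()

# Long-lived, acyclic containers: none of these is trash, but each one
# still has to be visited when the collector looks for cycles.
live = [[i] for i in range(200_000)]

tracked = len(gc.get_objects())
found_after, secs_after = time_full_collection()

print(f"tracked containers: {baseline_tracked} -> {tracked}")
print(f"full collection:    {secs_before:.6f}s -> {secs_after:.6f}s")
print(f"trash found:        {found_before} -> {found_after}")
```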

There’s a different topic for discussing ways inc gc could be improved:

Looks promising to me, but it’s too late to change what’s already reverted in 3.14, and the sense I get from the Powers that Be is that it’s also too late to make the gc method switchable for the upcoming 3.15.

The switch to inc gc was made to begin with in the absence of a PEP, or major battle testing, so “once bitten, twice shy” seems to apply.

None of which solves your current problems, alas. Best I can suggest is staying with 3.14.4 (the last version with inc gc) until a better approach is implemented.

5 Likes