I’m proposing to add the PYTHON_GC_THRESHOLD environment variable to more easily control the “aggressiveness” of the cyclic GC. The default for that setting, for the non-incremental GC, is 700. That is quite aggressive in that the young generation is collected frequently. I suspect many programs would work reliably and gain some performance by setting that threshold higher. Itamar did some internal testing at Meta and found that 14,000 seems like a better threshold, but it causes some services to crash with an OOM error.
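For reference, here is a rough Python-level sketch of what the variable would amount to; the actual PR applies it in C during interpreter startup, so this is only illustrative:

```python
import gc
import os

# Illustrative only: the real change reads the variable at interpreter
# startup in C. The current default thresholds are (700, 10, 10).
value = os.environ.get("PYTHON_GC_THRESHOLD")
if value is not None:
    _, threshold1, threshold2 = gc.get_threshold()
    gc.set_threshold(int(value), threshold1, threshold2)
```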
This kind of “tuning knob” is quite low level, and I think it could be better to have a higher level one. Say, something like PYTHON_GC_STRATEGY, with possible values (a rough sketch of how these might map to thresholds follows the list):
aggressive: set when you don’t care so much about spending extra time in the GC and want low peak memory use and prompt cleanup of resources like sockets and file descriptors. This would be similar to what we have as the default settings now.
throughput: set when minimizing execution time is most important. You don’t care so much about peak memory use or prompt resource cleanup. This would be similar to the “server” setting that some other languages have. I don’t like “server” as a name since sometimes I’m very memory constrained on containers or VMs and I would want the “aggressive” setting.
latency: set when minimizing GC pause times is the most important objective. E.g. it could be used by games that want to deliver a consistent frame rate.
balanced: this could become the default setting, a kind of mixture of all of the above. I think it would be less aggressive than what’s in 3.12 (better throughput, slightly higher peak memory usage).
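To make that concrete, a hypothetical mapping from these strategy names onto the existing generational thresholds might look like the following; the names come from the list above, but the helper function and the numbers are invented purely for illustration:

```python
import gc

# Invented numbers for illustration; real presets would need measurement.
_STRATEGIES = {
    "aggressive": (700, 10, 10),    # roughly today's defaults
    "throughput": (14000, 10, 10),  # collect the young generation much less often
    "latency": (2000, 5, 5),        # more frequent but smaller collections
    "balanced": (5000, 10, 10),     # somewhere in between
}

def set_strategy(name):
    gc.set_threshold(*_STRATEGIES[name])
```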
I like the idea in the context of how GC works in 3.10 & 3.12 (and maybe also 3.13?), but how applicable is it going to be with upcoming changes in the GC (like adaptive thresholds, incremental GC, etc.)?
If these high level strategies are still meaningful with the coming changes in GC, that sounds like a good idea.
I tend to be hesitant about environment-variable-controlled knobs. From my experience, they tend to cause surprise in different subprocess scenarios (whether via multiprocessing, subprocesses using sys.executable, and subprocesses launching other flavors of child Python applications). These surprises appear in both directions (e.g. “I expected the child application to inherit the parent’s GC strategy but it didn’t”, and “I didn’t expect the child application to inherit the parent’s GC strategy but it did”).
If it’s also possible to expose this via a -X flag, that could help make the semantics more explicit (still leads to surprises, but fewer).
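To illustrate the inheritance point: an environment variable is passed on to child interpreters by default, while a -X option is not, so the two mechanisms surprise people in different ways. The variable name below is the one proposed in this thread:

```python
import os
import subprocess
import sys

# The child interpreter inherits the parent's environment by default,
# so it prints "14000" even though it was never configured explicitly.
os.environ["PYTHON_GC_THRESHOLD"] = "14000"
subprocess.run([sys.executable, "-c",
                "import os; print(os.environ.get('PYTHON_GC_THRESHOLD'))"])
```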
Yes, it would be an -X flag as well as a function like gc.set_strategy(). All those ways to set it have different pros and cons. I didn’t do an -X option for my PYTHON_GC_THRESHOLD PR because I wanted to keep the code as simple as possible in case it can go into the 3.13 RC. If it does get merged, I would add an -X option to the ‘main’ version as well. And it should become part of the config struct, so you can set it if you are embedding Python.
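A hypothetical sketch of how those different ways of selecting a strategy could combine at startup is below. sys._xoptions (CPython’s dict of parsed -X options) is real; the option and variable names, the set of values, and the precedence shown (command line over environment) are assumptions, although that precedence matches how most existing -X options with an environment-variable twin behave:

```python
import os
import sys

# Assumed names and precedence, for illustration only.
def resolve_startup_strategy(default="balanced"):
    xopt = sys._xoptions.get("gc_strategy")      # e.g. -X gc_strategy=throughput
    env = os.environ.get("PYTHON_GC_STRATEGY")   # e.g. PYTHON_GC_STRATEGY=latency
    return xopt or env or default
```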
If these high level strategies are still meaningful with the coming changes in GC, that sounds like a good idea.
The idea of a higher level tuning knob is that it can still do something sensible if the GC logic is changed significantly in an upcoming release. I guess in theory it could be useful to other Python implementations like PyPy as well. Some more research is needed to decide on a small set of useful strategy types. Joannah Nanjekye could likely help us since her PhD research was related to this area; I chatted a little with her about it at the core sprint.
It would make sense to look at what other garbage-collected language implementations do (e.g. Java, C#) and learn from that. My impression is that their options are fairly low level and only have meaning for a specific GC implementation. Having a hundred different knobs that no one knows how to set is not good. Having one knob that doesn’t give enough control is not great either. It will be a bit challenging to find the balance.
The idea seems similar to the modern trend of operating systems offering high level “performance” vs “power consumption” setting combinations, without having to fiddle with all the individual power management settings.
It’s probably also OK if some of the strategies have no effect on different GC designs relative to their default settings.
On the bikeshed painting front, perhaps the non-default strategies should all start with “min-” to emphasise that they’re minimising usage of a resource?
“min-memory” (aka “aggressive”): avoids excessive memory use at the expense of potentially increased runtime overhead and potentially long GC pauses
“min-overhead” (aka “throughput”): avoids excessive runtime overhead at the expense of potentially increased peak and steady state memory usage, and without specifically constraining the duration of any particular GC run
“min-latency” (aka “latency”): avoids excessive GC pauses at the expense of potentially increased runtime overhead and potentially increased memory usage
The “min-” prefixes would also leave the door open to offering 7 pretuned presets rather than 4 (omit the prefix to split the difference between the minimised settings for that resource and the default balanced settings) if that ever seemed sufficiently useful to be worth doing.
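Spelled out, that would give a preset list along these lines (names purely illustrative, following the scheme above):

```python
# A "min-" variant that strongly favours one resource, an unprefixed variant
# that splits the difference with "balanced", plus the balanced default: 7 total.
PRESETS = [
    "min-memory",   "memory",
    "min-overhead", "overhead",
    "min-latency",  "latency",
    "balanced",
]
```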
Perhaps the word “strategy” is not the best since that implies how it’s done rather than the goal or objective. For power management settings, I’ve seen the word “plan” or “mode”. Some other possible options: bias, preference, pref, priority.
If we use one of those words, I think your “min” prefixes read better. That said, we are not minimizing those things as much as possible; we are biasing the tuning to favour them over other, contradictory objectives. BTW, I don’t like the “aggressive” label but couldn’t think of something better.
TL;DR: I support exposing such a configuration.
I have written something about the usefulness of threshold tuning, with a subset of experiments, in this paper: https://dl.acm.org/doi/abs/10.5555/3566055.3566071
That’s a valid point - while I was reading those names with an inferred “as far as is reasonable” on the end, that inference was based on already knowing what I meant. Maybe low-memory/low-overhead/low-latency?
As far as potential option names go, --gc-preset or --gc-tune might be clearer than “--gc-strategy”.
I suspect the option will be obscure enough that the exact names won’t matter much in practice - anyone motivated enough to go looking for the option in the first place is unlikely to be put off by the finer details.