I think the issue with Fraction is that it uses ABCMeta (abstract base class). The abstract base class machinery uses a cached lookup for subtype checks. This is an area where sharing the type between sub-interpreters would be problematic and would require making the caching interpreter/thread local. This is not work we have done yet.
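To illustrate the caching behaviour (a sketch; `Rational` and `MyFraction` here are made-up stand-ins, and the exact cache internals are a CPython implementation detail):

```python
from abc import ABCMeta

# Sketch of why sharing an ABC across interpreters is tricky:
# isinstance()/issubclass() checks against an ABC populate a per-class
# cache (mutable state in CPython), so even "read-only" type checks
# mutate state hanging off the shared class object.

class Rational(metaclass=ABCMeta):
    pass

class MyFraction:
    pass

Rational.register(MyFraction)  # virtual subclass, no inheritance

# The first check walks the registry; the result is then cached on the
# ABC, so a later check can skip the walk. That cache write is the
# cross-interpreter hazard described above.
assert issubclass(MyFraction, Rational)
assert isinstance(MyFraction(), Rational)
```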
We will add a section about this. We couldn't see how to make this work in a safe way with sub-interpreters.
We have been looking at your example in the discussion on Gauging interest in arbitrary object immortalization and shared object proxies.
I am a bit unsure about two things:
- Are calls to `SharedObjectProxy`s asynchronous or synchronous? If synchronous, is there a possibility of deadlock?
- Are the parameters passed to a method on a `SharedObjectProxy` also proxied? If not, can that lead to sharing?
@ZeroIntensity should I post these questions on the other discussion (not sure on the etiquette of overlapping conversations on DPO)?
You can post on the other thread, it's easier to keep track of things. To answer your questions:
- They're synchronized based on the GIL, because all it does is switch to the interpreter. On FT, it's sort of asynchronous through the per-object locking. I don't think there's any chance of deadlock.
- Parameters can use the same sharing mechanisms as everything else with subinterpreters. So, things like strings can be shared directly, and only objects that have no other option will be put in a proxy. As of now, my POC implementation always shares, but that's just because I was lazy.
If you work with a proxy approach, then sharing classes across interpreters may not be needed at all:
When retrieving data in another interpreter, it could make use of the other interpreter's version of the same class.
Note that having interpreter-local classes is the current behavior both for subinterpreters and for multiprocessing shared data (using `pickle` and passing byte-serialized data around), and no one is surprised or harmed by that.
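The "each side uses its own copy of the class" behaviour follows from how pickle works: it serializes instances by qualified name plus state, never by sending the class object itself. A minimal sketch (`Point` is an illustrative class, not from the thread):

```python
import pickle

# pickle records the class's *name* ("module.Point") and the instance
# state; the receiving interpreter/process re-imports that name and
# reconstructs the object with its own copy of the class.

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

payload = pickle.dumps(Point(1, 2))  # bytes: class name + state
clone = pickle.loads(payload)        # looks Point up again by name

assert (clone.x, clone.y) == (1, 2)
```

In a single process the name lookup finds the very same class, but across interpreters or processes each side resolves the name against its own modules, so class objects are never shared.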
This problem with abc caching preventing deep freeze points to a more fundamental issue with the proposal: deep-freezing has a large blast radius; it introduces action-at-a-distance. Deep-freezing affects pure Python code more than free-threading or subinterpreters themselves do: it changes the behavior of pure Python objects.
Real-life Python object graphs in non-trivial programs (the ones that would benefit the most from race-free data sharing) are unlikely to be as clean as the examples in the PEP suggest. The existence of metaclasses, __dunders__, etc. in the object graph means that drawing a line between freezable and non-freezable parts of pure Python, given how real object graphs look in the scenarios where subinterpreters are likely to be beneficial, will be a lot of work. Notably, the standard library has not been designed with freezing in mind, and it may well be impossible to make most of the stdlib work with freezing.
For now, we can limit deep freezing to just a small, predetermined number of classes (such as exact list, dict, set, tuple), whose deep-frozen behavior can remain fully under our control. Or, we don't need deep freezing to be so strict. To quote the PEP:
A strict interpretation of deep immutability does not permit an immutable object to reference a mutable object. This model is both easy to explain and understand, and an object's immutability can be "trusted": it is not possible for an immutable object to change through some nested mutable state [1]. At the same time it limits the utility of freezing, as many Python objects contain types outside of the standard library defined in C, which must opt in to immutability before they can be frozen.
It isn't likely that we can forgo all mutable states altogether. Some mutable states will still live in our object graphs, injected by the interpreter or stdlib, and that's probably fine: we have per-object locks for that (though we want to avoid locking as much as possible). We "trust" the interpreter and stdlib not to mess with them in ways that break desirable behavior of user objects (this does not mean that interpreter/stdlib have to always support freezing, just that they behave in responsible ways with mutable states along the consenting adults principle).
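As a concrete illustration of the strict model quoted above, here is a toy freezability check. This is not PEP 795's actual API, just a sketch of "only a predetermined allow-list of exact builtin types may be frozen":

```python
# Toy sketch (not PEP 795's API): a "strict" freezability check that
# only accepts a small, predetermined set of exact builtin types.
ALLOWED = (int, float, str, bytes, bool, type(None))

def check_freezable(obj, _seen=None):
    _seen = set() if _seen is None else _seen
    if id(obj) in _seen:          # tolerate cycles
        return
    _seen.add(id(obj))
    t = type(obj)                 # exact type, subclasses don't qualify
    if t in ALLOWED:
        return
    if t is dict:
        for k, v in obj.items():
            check_freezable(k, _seen)
            check_freezable(v, _seen)
        return
    if t in (tuple, list, set, frozenset):
        for item in obj:
            check_freezable(item, _seen)
        return
    # Anything else -- user classes, metaclasses, dunder-laden objects --
    # is where the line-drawing problem described above begins.
    raise TypeError(f"not freezable: {t.__name__}")

check_freezable([1, "two", {"three": (3,)}])  # clean graph: accepted
try:
    check_freezable([object()])               # user-defined instance: rejected
except TypeError:
    pass
```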
For a realistic assessment of this proposal, we need a freezing model that
- extracts the most benefit (performance or otherwise) out of freezing (we need ways to measure such real-world benefit),
- has a clear model to work with (preferably one leading to a future specification),
- fits well with the existing designs of [C]Python,
- allows the separation of responsibility between users, stdlib contributors and third-party library authors,
- doesn't place too much burden of maintenance and evolution on the above parties.
These same basic points are valid for all proposed features, but especially in the case of deep freezing where the blast radius in Python-land is large.
Sorry, I wasn't clear enough! If we want to be able to safely share immutable objects by reference between subinterpreters (which is one of our goals), then we need to fortify the Python interpreter against data races, just like what is happening in free-threaded Python. Our hope here has been to enable, with subinterpreters, the same kind of shared-memory parallelism that free-threaded Python supports.
I hope that makes more sense. I agree that the world won't end, but it also prevents us from doing something that would be very useful and efficient for subinterpreters.
All that is happening in free-threaded Python is that core data structures are being altered to do their own locking, rather than using the GIL. The free-threaded changes have no user-visible effect; they are purely internal to the object implementation. As far as I can see, that's nothing like what is being proposed here, which does have a user API that, when used, changes object semantics fundamentally.
I repeat - why? Why isn't it sufficient to trust people to use data structures correctly? I accept that having something that doesn't require you to ensure safety yourself when using it is nice, but it's far from necessary. That's my point - this proposal seems to be based on unfounded claims that it's not possible to write data-race-free code without deep immutability. And therefore, that significant disadvantages are justified because the benefits are so significant.
I'd really like the convenience of immutable data structures if I'm writing concurrent code. But I'm not willing to pay for that convenience by having to write all of my code defensively, because I can't be sure that I won't be passed an object that (contrary to its declared/inferred type) could fail if a mutating method is called on it.
Iâve been wondering the same thing. I was also looking through PEP 734 which proposes subinterpreters in the stdlib, the section on Interpreter Isolation says
CPython's interpreters are intended to be strictly isolated from each other. That means interpreters never share objects (except in very specific cases with immortal, immutable builtin objects). Each interpreter has its own modules (`sys.modules`), classes, functions, and variables. Even where two interpreters define the same class, each will have its own copy.
So it seems the motivation might be primarily to address this restriction. However, the PEP 795 draft doesn't really make this explicit; the most direct reference is possibly in Motivation - Immutable Objects can be Freely Shared…
Python's Global Interpreter Lock (GIL) mitigates many data race issues, but as Python evolves towards improved multi-threading and parallel execution (e.g., subinterpreters and the free-threaded Python efforts), data races on shared mutable objects become a more pressing concern.
The Python API is slated for 3.14, so I don't know what restrictions the runtime places on data sharing between subinterpreters. As subinterpreters have been in the C API since 1.5, it's possibly a historical decision that I'm unaware of the context on.
The 3.14 `concurrent.interpreters` docs say
By default, most objects are copied with `pickle` when they are passed to another interpreter.
And
There is a small number of Python types that actually share mutable data between interpreters:
So I think this PEP makes more sense through the lens of subinterpreters and less sense from the perspective of free threading.
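The copy-by-default, nothing-shared behaviour quoted above can be sketched with the 3.14 API (API names per PEP 734 and the 3.14 docs; guarded so the snippet is a no-op on older builds):

```python
# Sketch of the Python 3.14 concurrent.interpreters API.
try:
    from concurrent import interpreters  # new in Python 3.14
except ImportError:
    interpreters = None  # pre-3.14: only the C API exposes subinterpreters

if interpreters is not None:
    interp = interpreters.create()
    # The code string runs in the other interpreter, which has its own
    # sys.modules, its own copy of every class, and its own GIL; objects
    # crossing the boundary are pickled/copied by default.
    interp.exec("from fractions import Fraction; x = Fraction(1, 2)")
    interp.close()
```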
This is mostly true. In free-threading, there is a lot of heavy lifting going on behind the scenes to ensure that the Python interpreter will not crash or corrupt itself if a Python program is poorly synchronised. However, if you want to make use of the multi-threading, you have to follow certain protocols, like proper synchronisation, or your programs might silently compute a bogus result, crash, etc. Multithreaded programming with mutable state changes how you need to program.
Immutable objects on the other hand "behave the same" regardless of whether they are used in a single thread or shared across multiple threads. So from that perspective, there is less of an API after you have frozen the object.
I hope this makes sense. I am not trying to argue that one is better or worse than the other; my point is that multithreaded programming always comes with a cost, and it is a question of where you want that cost to appear.
Ah, this is a great question. The answer is that this would be like removing the GIL without adding all of the amazing stuff that the free-threaded Python people have added. The subinterpreters build keeps the GIL and achieves parallelism by letting each subinterpreter have its own GIL, and furthermore each subinterpreter believes that it is running in complete isolation so that its actions will never race with any action carried out in another subinterpreter.
Sharing objects across subinterpreters breaks this isolation, which subinterpreters require for soundness. If the objects are immutable, it is easy to maintain the invariants that the subinterpreters rely on simply by making reference count manipulations on shared objects atomic. But if the shared objects can be mutated, then we also need something like the per-object lock that free-threaded Python uses to ensure that poorly synchronised programs don't accidentally blow up the interpreter or worse.
Oh, no, absolutely not! I think this is a mistake on our part in the way we tried to divide the big idea that we presented at the Language Summit into different PEPs. The immutability stuff was the easiest bit to carve out that made sense on its own (to us). The reason why trust is not sufficient is the one that I pointed out above: the same reason why Python simply isn't removing the GIL and trusting that all Python programs are correctly synchronised (more or less).
It is possible to extend the optional type system to capture things like immutability, but we would like to punt on it for now to limit the scope of this PEP.
Hi Marc, see my response to Paul, which tries to unpack this. You are right in your answer. The isolation between subinterpreters remains the same.
In the case of subinterpreters we enable object sharing (as you point out) and that is clearly already possible in free-threaded Python. Still, as also pointed out by Paul who clearly has reservations, having immutability is very convenient when writing concurrent code. So we think that immutable objects are useful also in free-threaded Python.
I assume you mean the isolation would be relaxed for frozen objects? (Yes from your third comment)
Agreed. I think part of the friction with this proposal is that it's quite heavy-handed if you're just considering traditional threading, because those same data races were probably there before; they're now just more likely to happen with free threading. Not to dismiss that, but it's unlikely to truly be the introduction of a bug, I guess.
I disagree with a lot of that. CUDA is the classic example: IIRC some of their matrix multiplication functions aren't exactly IEEE 754 compliant and only promise accuracy to within a couple of bits of precision, because the GPU threading can change the order of operations and float math is a whole thing. In that case, even if you're doing everything perfectly, you can get different results each time you run a program (I've had to fix this).
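The underlying effect is that floating-point addition is not associative, so any reordering of work (across threads, GPU blocks, or reduction trees) can change the result even in a "correct" program. A minimal worked example:

```python
# The same three additions, grouped differently, give different answers:
# reordering across threads can therefore change results run to run.
a, b, c = 1e16, -1e16, 1.0

left = (a + b) + c    # 0.0 + 1.0
right = a + (b + c)   # 1.0 vanishes when added to -1e16 first
                      # (it is below the spacing of floats near 1e16)

assert left == 1.0
assert right == 0.0
assert left != right
```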
I would argue multithreaded programming changes how you need to program. Sure, avoid mutable state as much as possible but it will always be there and will always bite you in the ass if it can.
Ah, yes, indeed. Thanks for making that point clear!
Maybe, but also if Python programmers start to embrace threading in a big way, we are going to see more of these bugs. For example, if you import a library that uses threads internally, it can be very hard to know whether an object you get from that library is safe to access directly or if you need to figure out what is the appropriate lock to hold before access. And now we are back to the problem of virality or blast radius that this PEP is getting pushback for: in order to reason about which objects you can safely access, you are going to have to understand the object graphs and how objects are reachable across threads, and if you get something wrong, your program could just silently compute slightly wrong results.
I don't think you disagree with me? You are simply clarifying (and if so, I agree with you) that just because you get synchronisation right, it does not mean that your programs won't compute a bogus result.
Did I get that right?
I think that you emphasize that the mutable state is the problem and your proposal reduces that impact. My point is threading is hard.
Got it! I will say that mutable state is part of what makes threading hard, but not the only thing.
E.g. there are already tools in the stdlib for creating "immutable" classes: @dataclass(frozen=True) is probably good enough for most use cases to pass an instance between threads. I keep hoping to get a better frozen dictionary than creating a dictionary and immediately wrapping it in types.MappingProxyType, but one can dream.
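For reference, here is what those two stdlib tools look like in practice (`Config` is an illustrative class; note the proxy is a read-only view, not a true frozen dict):

```python
from dataclasses import dataclass, FrozenInstanceError
from types import MappingProxyType

# A frozen dataclass rejects attribute assignment after construction.
@dataclass(frozen=True)
class Config:
    name: str
    retries: int

cfg = Config("api", 3)
try:
    cfg.retries = 5
except FrozenInstanceError:
    pass  # mutation is blocked

# MappingProxyType is only a read-only *view*: the view can't be
# written through, but whoever holds the backing dict can still mutate
# it -- which is why it falls short of a real frozen dictionary.
backing = {"a": 1}
ro = MappingProxyType(backing)
try:
    ro["a"] = 2
except TypeError:
    pass          # the view itself is immutable
backing["a"] = 2  # ...but the underlying dict is not
assert ro["a"] == 2
```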
I can only speak from my experience.
I used to maintain a threaded library used by broadcasters, we had a function for the user to give us data and a function for the user to get the result. Everything else was hidden from the user.
I suspect numpy/pandas/scipy etc. make some level of use of threading; there's also RAPIDS, which we use now, and it's a similar input/output model. If you mess with the internals, you've got to know what you're doing, so reading of manuals is required. I don't know how much this will change with free threading.
I think the PEP should clarify the benefit of sharing types/modules/functions between sub-interpreters as the primary benefit, and highlight that the current solution is "everything gets pickled or duplicated"; that would definitely reduce the friction. It's unclear how much use this will see in free threading. I've not tried the free-threading builds; it took a while for numpy and such to get a compatible build. concurrent.interpreters is slated for 3.14, so again probably not something usable yet. I'm definitely more in favour in hindsight. I probably wouldn't use it much unless I'm using subinterpreters, but only time will tell.
On a related note, do you know of any open source projects that make heavy use of subinterpreters yet? I know it's only available in the C API unless you're on a pre-release. Most of my work is performance stuff of all kinds, SIMD/multithreading/just-go-faster, so I'm interested.