Gauging interest in arbitrary object immortalization and shared object proxies

Hi,

Over the past several months, I’ve been playing around with a bit of a hobby/research project for immortalizing mutable, arbitrary objects. For a while, it was a solution in search of a problem, but I think there might be some actual use cases now.

My implementation can be found here. You can try it out:

import sys

obj = object()
sys._immortalize(obj)
assert sys._is_immortal(obj)

The cool part is that an immortal object is not leaked; it is deallocated at the end of the interpreter’s lifetime using a virtual garbage collection mechanism. I won’t go into too many details, but there’s some hacking of the allocator to ensure that object structures remain valid while executing destructors. As such, an immortalized object must be allocated using CPython’s object or memory domain (PyObject_Malloc and PyMem_Malloc). This contract already exists for the free-threaded build, and I don’t think it would be too hard a sell to establish it for the GIL-ful build as well. I’ve never seen code in the wild that allocates an object with the system malloc or anything like that.
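
For example, here’s a rough sketch of how this behaves in practice (using the same experimental sys._immortalize / sys._is_immortal hooks as above, which are not in stock CPython):

import sys

# An ordinary, mutable object.
cache = {"hits": 0}
sys._immortalize(cache)
assert sys._is_immortal(cache)

# Immortalization doesn't freeze the object; it stays mutable.
cache["hits"] += 1

# Dropping every visible reference does not deallocate it. The virtual
# GC reclaims it at interpreter shutdown instead, so nothing is leaked.
del cache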

The main case I developed it for (apart from fun) is object sharing between subinterpreters. My design uses an immortal object proxy that wraps any Python object; every method on the proxy switches to the correct interpreter, calls the wrapped object there, and returns the result wrapped in a new object proxy. This works on both FT and non-FT builds, because if you have an attached thread state for an interpreter, you are allowed to call any object in it. The tradeoff is that neither the proxy nor the wrapped object can be deallocated for the lifetime of the interpreter. To visualize, here’s some pseudo-code:

class SharedObjectProxy:
    def __init__(self, wrapped):
        self.interp = PyInterpreterState_Get()
        self.wrapped = wrapped
        sys._immortalize(self)

    # An example using __call__().
    # This pattern is used for every dunder method on the type.
    def __call__(self, *args, **kwargs):
        tstate = None
        new_tstate = None
        if self.interp != PyInterpreterState_Get():
            # Called from a different interpreter: create a thread state
            # for the owning interpreter and switch to it.
            new_tstate = PyThreadState_New(self.interp)
            tstate = PyThreadState_Swap(new_tstate)

        # Run the call in the owning interpreter and wrap the result so
        # it is usable from any interpreter too.
        result = SharedObjectProxy(self.wrapped.__call__(*args, **kwargs))
        if tstate is not None:
            # Switch back to the caller's interpreter and discard the
            # temporary thread state.
            PyThreadState_Clear(new_tstate)
            PyThreadState_Swap(tstate)
            PyThreadState_Delete(new_tstate)

        return result

Using this design, an instance of SharedObjectProxy is usable from any interpreter, which is really helpful for sharing objects that cannot (easily) be serialized. I designed a proof of concept for this a while back, but I think I need to update it to work with 3.15. There’s definitely room for improvement here, but conceptually, is this something people would like to see in the standard library, especially with the recent acceptance of PEP 734?

I’m not aware of other places where immortalization would be helpful outside of subinterpreters, but that’s the other reason why I’m opening this thread: are there other cases where safe immortalization would be useful? Or are there other plans for the core that would be hurt by arbitrary object immortalization (AOI)?

7 Likes

Is it safe to share mutable objects between multiple interpreters? How do you ensure that these objects remain consistent? Especially if each interpreter has its own GIL.

Using SharedObjectProxy, all operations on an object are executed in the interpreter to which the object belongs. How does it scale with the number of threads? All operations are serialized if they need to switch to a different interpreter, no?

This is certainly a valuable type of proxy (e.g. most GUIs rely on only accessing their controls/widgets from the UI thread), so I think it totally belongs in our toolbox.

The permanent immortalization behaviour can probably be dealt with by making them explicitly closeable (which leaves the proxy alive but not the underlying object), and it ought to be possible to safely reference count the proxy (not a normal refcount, but its own count) so that the underlying object can be freed more eagerly once its owners have closed it. Obviously the idea is that you don’t have many proxies floating around, and they should tend to be long-lived, which is why it doesn’t have to be “cheap”.
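
Very roughly, something like this (names here are purely illustrative, not a proposed API):

class CloseableProxy:
    def __init__(self, wrapped):
        self.wrapped = wrapped
        self._owners = 1   # proxy-level owner count, separate from the refcount

    def acquire(self):
        # Another owner (e.g. another interpreter) takes a stake in the proxy.
        self._owners += 1
        return self

    def close(self):
        self._owners -= 1
        if self._owners == 0:
            # The proxy object itself stays alive (immortal); only the
            # reference to the underlying object is dropped early.
            self.wrapped = None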

Though I’m still inclined to think that we’re better off with an object that can be transferred between interpreters and reconstructed on the other side. E.g. it might contain a (strong) reference to the original interpreter and original object, as well as enough object shape information (a list of attributes) to be able to call back into the original interpreter. No immortalization required, though we would definitely run into reference leaks in any case where a subinterpreter doesn’t shut down properly (or outlives the original interpreter).
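
Sketched very loosely (all names illustrative, and the interpreter reference is reduced to a plain id here):

from dataclasses import dataclass

@dataclass
class TransferableRef:
    interp_id: int     # identifies the owning interpreter (a real version
                       # would hold a strong reference to it)
    obj_id: int        # identity of the original object over there
    attributes: tuple  # object "shape": attribute names to expose locally

def describe(obj, interp_id):
    # Build a record that can be sent to another interpreter and used to
    # reconstruct a stub that calls back into the original one.
    return TransferableRef(interp_id, id(obj), tuple(dir(obj)))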

The main place it’s helpful is embedding and initialization, including for embedding in python.c, because it allows the host app to allocate objects. They may be statically allocated in read-only memory, which improves startup time. They can be used to e.g. allocate strings once and put them in sys for Python code to use directly, rather than copying everything.

The original intent was never to freeze arbitrary objects, and certainly not once the runtime is up and running. It’s meant for freezing objects defined before the runtime starts, so that they can be used while the runtime is running, and then freed after the runtime has completed. IOW, for embedders.

The value to subinterpreters is somewhat related, but mainly because the same people were interested in both cases and saw how it applied to both.

1 Like

Yeah, it’s all serialized by the interpreter. On the default build it’s handled by the GIL, and on the free-threaded build it’s handled by per-object locking and the rest of its internal synchronization machinery.

The scaling is probably similar to what would happen if you shared the object between threads normally, so it won’t be great. But efficiency isn’t the purpose; it’s supposed to act as a last resort when no other methods of transferring between interpreters work.

Me too, I definitely don’t think an object proxy should be the default method of transferring. For things like strings (or methods that return strings on the proxy), we can serialize and deserialize as usual.

I think options like this are great to have, so long as it doesn’t become a sort of convenient nuisance that keeps people from exploring and using the better (non last resort) options. I don’t have a way to judge if this would or not, but I think even if it does, it should just be a sign that we haven’t made the “ideal” convenient enough yet.

3 Likes

The Instagram web server has been using immortalization for years (we had flavors of this implemented as far back as Cinder 3.8, if not earlier versions).

It has been extremely valuable to immortalize the entire heap before forking web workers, in a pre-fork web server architecture.

See this 2023 article for more details: Introducing Immortal Objects for Python - Engineering at Meta
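
Very roughly, the pre-fork pattern looks like this (gc.freeze() is the related stock-CPython mechanism; the heap-wide immortalization call is only sketched in a comment, since stock CPython doesn’t expose one):

import gc
import os

def warm_up():
    # Import frameworks, load config, populate caches, etc.
    pass

def serve_requests():
    # Worker loop; shares the warmed-up heap with the parent via copy-on-write.
    pass

warm_up()
gc.freeze()            # keep the warmed-up heap out of future collections
# immortalize_heap()   # hypothetical here: pin refcounts so shared pages are
#                      # never dirtied by refcount updates after forking

for _ in range(4):     # spawn a few workers
    if os.fork() == 0:
        serve_requests()
        os._exit(0)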

10 Likes

My feelings about this proposal are similar, I think. If people are going to use subinterpreters, it’s useful to have a fallback option like this for when you need to work with an object that cannot (easily) be shared in a better way (although from the explanation I’m having trouble understanding how you ultimately get non-proxy results back).
But as far as the overall topic of subinterpreters goes, I think it’s more important to work on ways of truly sharing more objects across interpreters without serialize-and-copy.