API for Python 3.13 prevents use of 3rd party GC allocators

Thrameos · September 3, 2024, 4:06am

Changes to the object memory model in Python 3.13 have made it increasingly difficult to use the alloc slot. As currently implemented, it will be almost impossible to implement the JPype object model. JPype is a language bridge between Java and Python. As part of its implementation it must derive a base class that is used as a mixin for a number of Python types such as PyLong, PyFloat, and PyException. As these types all have different memory layouts, this would normally be prohibited.
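The layout restriction can be reproduced from pure Python. The following is a minimal sketch, not JPype code: `JavaSlot` and `try_mix` are hypothetical names, and the single `__slots__` entry stands in for the per-instance storage JPype needs.

```python
class JavaSlot:
    """Hypothetical mixin that needs one extra field per instance."""
    __slots__ = ("java_ref",)


def try_mix(base):
    """Try to combine the mixin with a builtin base.

    Returns the error message if CPython rejects the combination,
    or None if the class could be created.
    """
    try:
        type("Mixed", (JavaSlot, base), {})
    except TypeError as exc:
        return str(exc)
    return None


# Each of these builtin bases has its own instance layout, so a mixin
# that also adds storage cannot be combined with any of them.
results = {base.__name__: try_mix(base) for base in (int, float, Exception)}
for name, message in results.items():
    print(name, "->", message)
```

Each combination is rejected with a `TypeError` because both bases want to extend the instance layout in incompatible ways.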

To deal with this requirement, JPype was copying the formulas found in Python for similar types. Before Python 3.11 it simply kept a copy of the Python allocator which added 8 bytes to the end of all structures requiring Python types. As the 6 or so mixin types that JPype creates are all final and closed to expansion due to a metaclass, there was no way to conflict with the extra memory. Starting in Python 3.11 those functions and structures became harder to access, with the allocator disappearing entirely in 3.12. JPype could still function by manually creating a phony type with the required allocation, then type shifting to the required layout after allocation.
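The size arithmetic behind that approach can be sketched in Python. This is an illustration under assumptions, not JPype's implementation: it assumes the standard size follows CPython's `_PyObject_VAR_SIZE` rule (basicsize plus nitems times itemsize, rounded up to the pointer size), and the helper names are hypothetical. Only the 8 extra bytes come from the description above.

```python
import struct

# Pointer size of the running interpreter (8 on typical 64-bit builds).
POINTER_SIZE = struct.calcsize("P")

# Extra bytes appended after the standard layout, as described above.
EXTRA_BYTES = 8


def round_up(nbytes, align):
    """Round nbytes up to the next multiple of align (a power of two)."""
    return (nbytes + align - 1) & ~(align - 1)


def standard_size(tp, nitems=0):
    """Size of the standard layout, assuming the _PyObject_VAR_SIZE rule."""
    return round_up(tp.__basicsize__ + nitems * tp.__itemsize__, POINTER_SIZE)


def extra_slot_offset(tp, nitems=0):
    """Hypothetical helper: the extra slot starts right after the standard layout."""
    return standard_size(tp, nitems)


def total_allocation(tp, nitems=0):
    """Hypothetical helper: bytes to request so the extra slot fits."""
    return standard_size(tp, nitems) + EXTRA_BYTES


for tp in (float, int, BaseException):
    print(tp.__name__, extra_slot_offset(tp), total_allocation(tp))
```

Because the slot offset is computed from the type and item count alone, the extra memory can be located in constant time without changing `basicsize`.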

However, in Python 3.13, Python has made the memory model entirely inaccessible, basically rendering the alloc and finalize slots useless to third parties. Nominally, a new API in PEP 697 was supposed to allow third-party class-specific memory to be allocated, but unfortunately that doesn’t work for the JPype use case because the memory it allocates is within the basicsize of the object and thus creates exactly the same conflicts. I proposed an alternative to PEP 697 and demonstrated it in a patch in which the Python allocator can place class-managed space in front of the object, but due to other time commitments I have never had time to write the required PEP for this solution to be considered.

Instead, the Python 3.13 memory model is now entirely closed. The only way to create GC objects is to call PyType_GenericNew or one of the private methods, and this call has no provision for allocating any space other than that of the standard layout. In addition, the memory both before and after the object is now filled with private objects. The front contains managed pointers for the dict and weak pointer (which were once added to basicsize on class creation); the area after the object now contains an object of unknown size for inline dictionaries if the object has a certain base size and an item size of 0. The GC_Link method is inaccessible and the type system caches the incoming type, meaning that one can’t even polymorph the object as in 3.12 to create an object with a non-standard layout. In addition, the job of the allocator has expanded beyond simply linking the GC, initializing the object type pointer, and setting up the reference count, as it now also sets up the inline dictionary. In my opinion an allocator should do as little as possible (i.e. allocate memory and blank it), but this version adds even more to that already complex path. As far as I am aware, most of these changes were internal to the Python team. To try to document the changes I constructed a schema diagram of the current Python object model.
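For reference, the publicly visible part of a type's layout can be inspected from Python. This is only a sketch for comparison: `Plain`, `Slotted`, and `layout` are illustrative names, and the printed values differ between Python versions and builds. These attributes report the standard fields only; they do not give the sizes of the private regions described above.

```python
class Plain:
    """Ordinary class; instances get a dict and weakref support."""


class Slotted:
    """Class with one explicit slot and no instance dict."""
    __slots__ = ("a",)


def layout(tp):
    """Collect the layout fields that a type exposes publicly."""
    return {
        "basicsize": tp.__basicsize__,
        "itemsize": tp.__itemsize__,
        "dictoffset": tp.__dictoffset__,
        "weakrefoffset": tp.__weakrefoffset__,
    }


for tp in (object, int, float, BaseException, Plain, Slotted):
    print(f"{tp.__name__:>14}: {layout(tp)}")
```

Running this on 3.10, 3.12, and 3.13 side by side is one way to see which fields moved out of the visible layout between versions.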

As the sizes of all these special entries are not accessible and the mechanisms for linking the memory allocation to the GC are hidden, the allocator/finalizer slots are effectively Python internal privates, and thus no one can implement the same level of concrete classes that Python itself does. One can argue that 3rd party developers shouldn’t be calling private Python functions, though as there are zero examples of how to properly implement GC managed objects and no published API, a 3rd party developer has no choice when trying to overcome a basic limitation of Python such as no true multiclass.

I would recommend that Python 3.13 reform its memory model before being released. One or both of the following solutions would alleviate the situation.

1. There should be some sort of public methods such that a 3rd party library is not locked entirely out of creating GC managed objects. Python should stop automatically adding private objects of unknown size after the standard model, which has always been accessible to 3rd party libraries like JPype. Doing so makes it exceptionally difficult to implement multiclass or language bindings. In addition, there should be enough publicly accessible methods to allocate a GC managed space with pre and size fields. As knowing how big the standard object needs to be is required, there needs to be a public method to get the required pre and size for the object, allowing future Python versions to add to or change the model as they see fit. (Example API: PyType_Alloc(type, pre, post), PyType_Presize(type), PyType_Size(type, nitems), PyType_Init(obj), all of which leave the GC model behind the scenes.) Python internals should use these same methods so that everyone implementing concrete classes is on the same playing field and the API object layout stops shifting in unpredictable ways.
2. PEP 697 should be expanded beyond its limited use case to move its allocation from within basicsize to managed space before the object. There is already a proof of concept that demonstrates this.
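To make the first proposal concrete, here is a small Python model of how the example API might compose. None of these functions exist in CPython; the snake_case names mirror the proposed PyType_Presize, PyType_Size, and PyType_Alloc, and the `TypeModel` class, the alignment, the sizes, and the reading of `post` as "all bytes from the object start onward" are assumptions made purely for illustration.

```python
from dataclasses import dataclass

ALIGN = 8  # assumed alignment for this model


def round_up(nbytes, align=ALIGN):
    """Round nbytes up to the next multiple of align (a power of two)."""
    return (nbytes + align - 1) & ~(align - 1)


@dataclass(frozen=True)
class TypeModel:
    """Stand-in for a type object; all sizes are made-up numbers."""
    presize: int    # interpreter-private bytes placed before the object
    basicsize: int  # standard object layout
    itemsize: int   # per-item size for variable-sized objects


def type_presize(tp):
    """Model of the proposed PyType_Presize(type)."""
    return round_up(tp.presize)


def type_size(tp, nitems):
    """Model of the proposed PyType_Size(type, nitems)."""
    return round_up(tp.basicsize + nitems * tp.itemsize)


def type_alloc(tp, pre, post):
    """Model of the proposed PyType_Alloc(type, pre, post).

    Returns a zeroed block and the offset at which the object starts.
    """
    block = bytearray(pre + post)
    return block, pre


# A third-party library asks for the required sizes, adds its own
# extra bytes after the standard layout, and never sees the internals.
model = TypeModel(presize=16, basicsize=24, itemsize=0)
extra = 8
pre = type_presize(model)
size = type_size(model, 0)
block, obj_offset = type_alloc(model, pre, size + extra)
extra_offset = obj_offset + size
print(len(block), obj_offset, extra_offset)
```

The point of the model is that the caller only combines sizes reported by the runtime with its own requirements, so the runtime stays free to change what lives in the pre-header.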

Both solutions have use cases. The first is more flexible, especially if the memory requirements for the object do not fit Python's simple single item size model. JPype, which needs immediate O(1) access to its extra memory and has closed classes, is more like the first case. The second is more tailored to objects that require fixed slot space and want Python to manage it for them, and is a stepping stone to the grand unified object model as it allows for true multiple inheritance without conflicts. If neither solution is implemented, then JPype will need to use the most horrible hacks imaginable: monkey patching the basicsize and turning off the inline dict function after the type is allocated.

Thank you for your consideration.

thomas · September 3, 2024, 10:44am

Setting aside the rest of the points brought up for the moment, I will say that this is practically impossible. Python 3.13 has been past beta 1 for nearly four months. We’ve had the first release candidate. The final release candidate is scheduled for today. We cannot release Python 3.13 on schedule while considering these fundamental changes. I don’t think we could release it this year with these changes. The options are to release Python 3.13 as-is, or not release Python 3.13.

Thrameos · September 3, 2024, 2:15pm

I would hope that number 1 could be completed in less than a year given all the mechanics are present. The inline dict is a fairly catastrophic change to the object system, hence it triggers only when it thinks the object follows a specific model. With the exception of the inline dictionary location, fixing the API amounts to exposing one static and 2 macros. I would estimate the work to address this as 2 to 8 hours.

As for the timeliness of my reporting the issue, let's do a quick recap. This issue was raised in December of last year in comments on PEP 697, hence my patch to implement solution 2. PEP 697 was a response to my objections in 3.11 when I first raised the issue. Given that the issue of hiding symbols and rearranging the GC object model had been reported at the start of the Python 3.13 cycle, I did not expect that it would be in this state. I was asked to submit a full PEP for my patch to even be considered, but the changes I have issue with have not been discussed, according to encukou. This is the third major version in which my concerns have not been fully addressed (though PEP 697 was a good attempt). Had the latest changes been discussed and tagged to me, I would have made my objections then. Sadly my work life has had me working 60-80 hours a week since last December, and my employment contract requires that all tools used at work be completed at work under billable time. Since my users reported the issue, it took me four weeks of documenting/code auditing to figure out the details of the memory model to even report the issue, then an additional month of work in my project to find any workable solution (though that was because I only had 2 hours a week to contribute to my project). Others have had plenty of time to come up with alternatives since I tagged it in the Python 3.12 release. I am reraising the issue here at the request of vstinner. If this doesn’t get addressed and my horrible hack to make my project function fails, I simply can’t afford to support Python 3.13 due to lack of time.

pitrou · September 3, 2024, 4:04pm

Just for the record, the “base class which will be used as a mixin for a number of Python types such as PyLong, PyFloat, and PyException” is only used for object instances created by JPype, right?

Thrameos · September 3, 2024, 10:56pm

Yes. For each type in Java, JPype extends the CPython concrete type that is its equivalent in Python. Thus Java boolean, short, int, long, java.lang.Integer, java.lang.Short, and java.lang.Long all inherit from PyLong. Java float and double, along with their boxed types, extend PyFloat. All exception types extend the Python exception type. The class tree for all types is given by Java, meaning that if something inherits from java.lang.Object and also inherits from an exception, the type tree will request JObject (base class Python object) and JException (base class Python exception) to be mixed together, which is of course impossible without extending the memory layout like JPype has done. I would do the same with Python String if there were an API for lazy allocation of the data portion of the String.

Under the Python typing system this type of mixin is permitted, but only if none of the classes add to either the basicsize or the itemsize. Hence JPype has, since Python 3.5, always simply added its memory to the space after the standard Python object. Attempting to add it into the Python system has always failed. PEP 697 was meant to address the issue but ended up missing our use case because it altered basicsize, which meant the mixin failed for the three Python types we extend from.
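The permitted case can be sketched in Python. The class names here are hypothetical and are not JPype's actual classes; the mixin carries behavior only and uses empty `__slots__`, so it adds nothing to basicsize or itemsize.

```python
class JavaMarker:
    """Hypothetical mixin with behavior only and no per-instance storage."""
    __slots__ = ()

    def is_java(self):
        return True


# Because the mixin adds no storage, it combines with bases that
# each have their own, mutually incompatible, instance layouts.
class JInt(JavaMarker, int):
    __slots__ = ()


class JFloat(JavaMarker, float):
    __slots__ = ()


class JError(JavaMarker, Exception):
    __slots__ = ()


print(JInt(5) + 1, JFloat(1.5) * 2, JError("boom").is_java())
print(JInt.__basicsize__ == int.__basicsize__)
```

The limitation is the one described above: as soon as the mixin needs even one field of its own, it changes basicsize and these class definitions are rejected, which is why the extra memory had to live outside the standard layout.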

JPype of course is a very special use case as it is reflecting an object tree inherited from another language. In most cases the user has some level of freedom to design their object tree. It is only in language bindings for Java or C# that something like this happens.

ncoghlan · September 4, 2024, 3:04am

Is declaring Py_BUILD_CORE or otherwise including CPython internal headers to get access to the relevant internals a potential workaround for 3.13?

Or are the required symbols entirely unavailable for dynamic linking, so playing header games would just result in link failures instead of compilation errors?

Thrameos · September 4, 2024, 7:11am

The symbols are unavailable. I tried a number of ways to access them, either by including headers or by trying to coerce available symbols/references to invoke them. All paths had side effects, other than adding to the basicsize, calling the allocator, and then reducing the size. But that solution means the structure is damaged for any accesses in between. It also fails on a number of assertions. Thus my conclusion is that the alloc slot is now effectively private, as you can’t build any GC allocated space outside of Python internals.