Here’s a proposal for the C API that’s been cooking for… um, a long time.
Still rough around some of the edges, but I hit a self-imposed deadline to get it out, so here it is.
(Please ignore formatting, this is reST rendered as Markdown.)
Motivation
Python’s C API currently contains two extendable structs used to carry
information, notably:
- PyType_Spec
- PyModuleDef
Each has a family of C API functions that use the structure as input,
creating a Python object from it. (Each family works as a single function,
with optional arguments that got added over time.) These are:
- PyType_From* functions for PyType_Spec
- PyModule_FromDef* for PyModuleDef
Separating “input” structures from runtime objects allows the internal
structure of the object to stay opaque (in both the API and the ABI),
allowing future CPython versions (or even alternative implementations) to
change the details.
Both structures contain a slots field – an array of
tagged unions (https://en.wikipedia.org/wiki/Tagged_union),
which allows for future expansion.
(In practice, these are void pointers tagged with an int ID.)
This spec aims to update these structures, in particular to allow:
- making backward and forward compatibility easier to maintain (for
  example, avoiding the need for new functions with additional arguments)
- improved type safety, and avoiding what’s technically undefined behaviour
Replacing ModuleDef
The more immediate motivation for this proposal is that
- the PyObject memory layout differs between regular and free-threading builds,
- the PyModuleDef struct is usually allocated statically (using
  PyModuleDef_HEAD_INIT), and PyModuleDef is a Python object.
If we want to introduce an ABI subset usable with both builds, we need to change the main mechanism to create modules.
Let’s design it as best as we currently can, so it can last at least as long as PyModuleDef_Slot.
Example
This proposal adds API to create classes and modules from arrays of slots,
which can be specified as C literals using macros, like this::

    static PySlot myClass_slots[] = {
        PySlot_STATIC(tp_name, "mymod.MyClass"),
        PySlot_SIZE(tp_extra_basicsize, sizeof(struct myClass)),
        PySlot_FUNC(tp_repr, myClass_repr),
        PySlot_INT64(tp_flags, Py_TPFLAGS_DEFAULT | Py_TPFLAGS_MANAGED_DICT),
        PySlot_END,
    };
    ...
    PyObject *MyClass = PyType_FromSlots(myClass_slots, -1);

The macros simplify hand-written literals.
For more complex use cases, like compatibility between several Python versions,
or templated/auto-generated slot arrays, as well as for non-C users of the
C API, the slot struct definitions can be written out.
For example, if the transition from tp_getattr to tp_getattro
was happening now and the user wanted to support CPython with and without
tp_getattro, they could add a HAS_FALLBACK flag::

    static PySlot myClass_slots[] = {
        ...
        {   // only used if supported
            .sl_id=Py_tp_getattro,
            .sl_flags=PySlot_HAS_FALLBACK,
            .sl_func=myClass_getattro,
        },
        {   // only used if the slot above is not supported
            .sl_id=Py_tp_getattr,
            .sl_func=myClass_old_getattr,
        },
        PySlot_END,
    };

Rationale
Here we explain the design decisions in this proposal.
Using slots
The main alternative to slots is using a versioned struct
for input.
There are two variants of such a design:
- A large struct with fields for all info. As we can see with
  PyTypeObject, most of such a struct tends to be NULLs in practice.
  As more fields become obsolete, either the wastage grows, or we introduce
  new struct layouts (while keeping compatibility with the old ones for a while).
- A small struct with only the info necessary for initial creation, with other
  info added afterwards (with dedicated function calls, or Python-level
  setattr). This design:
  - makes it cumbersome to add/obsolete/adjust the required info (for example,
    in PEP 697 I gave meaning to negative values of an existing field; adding
    a new field would be cleaner in similar situations);
  - increases the number of API calls between an extension and the interpreter.
We believe that a “batch” API for type/module creation makes sense,
even if it partially duplicates an API to modify “live” objects.
Using slots only
The structs PyType_Spec and PyModuleDef have explicit fields
in addition to a slots array. These include:
- Required information – the names: PyType_Spec.name and PyModuleDef.m_name.
  This proposal adds name slots, and makes them required.
- Non-pointers (basicsize, flags) – originally, slots were intended to
  only contain function pointers; they now contain data pointers as well as
  integers or flags. This proposal uses a union to handle types cleanly.
- Items added before the slots mechanism (PyModuleDef.m_slots itself was
  repurposed from m_reload, which was always NULL; m_traverse or
  m_methods predate it).
We can do without these fields, and have only an array of slots.
A wrapper class around the array would complicate the design.
Also, if fields in such a class ever become obsolete,
they’d need their own deprecation mechanism.
Nested slot tables
The array of slots can reference another array of slots, which is treated
as if it was merged into its “parent”, recursively.
This complicates slot handling inside the interpreter, but allows:
- Mixing dynamically allocated (or stack-allocated) slots with static ones.
  This solves the issue that led to the PyType_From* family of
  functions expanding with values that typically can’t be static
  (i.e. it’s often a symbol from another DLL, which can’t be static
  data on Windows).
- Sharing a subset of the slots to implement functionality
  common to several classes/modules.
- Easily including some slots conditionally, e.g. based on the Python version.
Nested “legacy” slot tables
Similarly to nested arrays of PySlot, we also propose supporting
arrays of “legacy” slots (PyType_Slot and PyModuleDef_Slot) in
the “new” slots, and vice versa.
This way, users can reuse code they have already written, without rewriting/reformatting
it, and only use the “new” slots if they need any new features.
Fixed-width integers
This proposal uses fixed-width integers, uint16_t, for slot IDs and flags.
With the C int type, using more than 16 bits would not be portable,
but it would silently work on common platforms. We can carefully avoid values
over UINT16_MAX, but we’d still waste 16 bits on common platforms.
With these defined as uint16_t, it seems natural to use fixed-width
integers for everything except pointers and sizes.
The proposal does not use bit-fields and enums, whose memory representation is
compiler-dependent, causing issues when using the API from languages other
than C.
The structure is laid out assuming that a type’s alignment matches its size.
Memory layout
On common 64-bit platforms, we can keep the size of the new struct the same
as the existing PyType_Slot and PyModuleDef_Slot. (The existing
structs waste 6 out of 16 bytes due to int portability and padding;
this proposal puts those bits to use for new features.)
On 32-bit platforms, this proposal calls for the same layout as on 64-bit,
doubling the size compared to the existing structs (from 8 bytes to 16).
For “configuration” data that’s usually static, it should be OK.
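The layout claim can be checked with compile-time assertions. The sketch below is a stand-in, not the real definition: DemoSlot mirrors the struct given in the Specification section, ptrdiff_t approximates Py_ssize_t so that Python.h is not needed, and DemoLegacySlot only imitates the "int ID plus pointer" shape of the existing slot structs. It assumes a C11 compiler on a common 32-bit or 64-bit platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the proposed struct (see the Specification section). */
typedef struct DemoSlot {
    uint16_t sl_id;
    uint16_t sl_flags;
    union {
        uint32_t sl_array_size;
    };
    union {
        void *sl_ptr;
        void (*sl_func)(void);
        ptrdiff_t sl_size;      /* approximates Py_ssize_t */
        int64_t sl_int64;
        uint64_t sl_uint64;
    };
} DemoSlot;

/* Imitates the shape of the existing slot structs: int ID, then pointer. */
typedef struct DemoLegacySlot {
    int slot;
    void *value;
} DemoLegacySlot;

/* 2 + 2 + 4 bytes of header, then an 8-byte data union: 16 bytes total. */
static_assert(sizeof(DemoSlot) == 16, "new slot should be 16 bytes");
static_assert(offsetof(DemoSlot, sl_flags) == 2, "flags follow the ID");
static_assert(offsetof(DemoSlot, sl_array_size) == 4, "32-bit union at 4");
static_assert(offsetof(DemoSlot, sl_ptr) == 8, "data union at 8");

/* The legacy shape is never larger than the new one on common platforms. */
static_assert(sizeof(DemoLegacySlot) <= sizeof(DemoSlot),
              "legacy slot fits in the new size");
```

If any assertion fails on some platform, that is a sign the real definition would need explicit padding there, which is presumably part of why the actual definition is expected to be more complex.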
Single ID space
Currently, the numeric values of module and type slots overlap:
- Py_bf_getbuffer == Py_mod_create == 1
- Py_bf_releasebuffer == Py_mod_exec == 2
- Py_mp_ass_subscript == Py_mod_multiple_interpreters == 3
- Py_mp_length == Py_mod_gil == 4
This proposal uses a single sequence for both, so future slots avoid this
overlap. This is to:
- Avoid accidentally using type slots for modules, and vice versa
- Allow external libraries or checkers to determine a slot’s meaning
(and type) based on the ID.
The 4 existing overlaps mean we don’t reach these goals right now,
but we can gradually migrate to new numeric IDs in a way that’s transparent
to the user.
The main disadvantage is that any internal lookup tables will be bigger
(if we use separate ones for types & modules, they’ll contain blanks),
or harder to manage (if they’re merged).
Supporting both NULL-terminated arrays and explicit sizes
In C, it is natural to write array literals terminated by a NULL/zero element.
An alternative is accepting non-terminated arrays with an associated element
count. Notably, this allows:
- Treating a pointer as an array of a single element
- Specifying an arbitrary subset of a larger array, without copying memory
- Better performance if an extra loop is needed to count the elements
This proposal allows arrays-with-a-count as an alternative to zero-terminated
ones. The API for them is meant more for code generators than for people
hand-writing C literals.
Specification
The following functions will be added::

    PyObject *PyType_FromSlots(PySlot *slots, Py_ssize_t n_slots);
    PyObject *PyModule_FromSlots(PySlot *slots, Py_ssize_t n_slots);
    PyObject *PyModuleDef_FromSlots(PySlot *slots, Py_ssize_t n_slots);

The first two create the corresponding
Python object from the given array of slots.
PyModuleDef_FromSlots creates a ModuleDef that describes multi-phase
initialization, to be returned from a PyInit_* function. (Its result
will be an internal subclass of PyModuleDef.)
The n_slots argument may be -1, which means slots is zero-terminated.
Otherwise, it gives the size of the array.
(In this case, zero entries inside the array are invalid.)
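As a rough illustration of the n_slots rule, the helper below normalizes both calling conventions to a plain count. It is a sketch under assumptions: DemoSlot is a simplified stand-in, demo_count_slots is a hypothetical helper rather than proposed API, ptrdiff_t stands in for Py_ssize_t, and a "zero entry" is taken to mean ID 0 with no flags set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the proposed slot struct. */
typedef struct DemoSlot {
    uint16_t sl_id;
    uint16_t sl_flags;
    void *sl_ptr;
} DemoSlot;

/* Assumption: an entry with ID 0 and no flags is the "zero entry". */
static int
demo_is_zero_entry(const DemoSlot *slot)
{
    return slot->sl_id == 0 && slot->sl_flags == 0;
}

/* Return the number of entries to process, or -1 for invalid input.
 *   n_slots == -1: count entries before the zero terminator.
 *   n_slots >=  0: trust the count, but reject zero entries inside it. */
static ptrdiff_t
demo_count_slots(const DemoSlot *slots, ptrdiff_t n_slots)
{
    assert(slots != NULL);
    if (n_slots == -1) {
        ptrdiff_t n = 0;
        while (!demo_is_zero_entry(&slots[n])) {
            n++;
        }
        return n;
    }
    if (n_slots < 0) {
        return -1;
    }
    for (ptrdiff_t i = 0; i < n_slots; i++) {
        if (demo_is_zero_entry(&slots[i])) {
            return -1;
        }
    }
    return n_slots;
}
```

The explicit-count form is what allows passing a pointer to one element, or a sub-range of a larger array, without copying anything.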
Slot structure
The PySlot structure will be defined as follows::

    typedef struct PySlot {
        uint16_t sl_id;
        uint16_t sl_flags;
        union {
            uint32_t sl_array_size;
        };
        union {
            void *sl_ptr;
            void (*sl_func)(void);
            Py_ssize_t sl_size;
            int64_t sl_int64;
            uint64_t sl_uint64;
        };
    } PySlot;

(The actual definition will be more complex, mainly for C/C++ compiler
compatibility.)
- sl_id: A slot number, identifying what the slot does.
- sl_flags: Flags, defined below.
- A 32-bit union, whose meaning depends on sl_flags. This specification
  defines only one option:
  - sl_array_size: explicit array size; see Py_SLOT_SIZED_ARRAY below
- A union with the data, whose type depends on the slot.
General slot semantics
When slots are passed to a function that applies them, the function will not
modify the slot array, nor any data it points to (recursively).
After the function is done, the user is allowed to modify or deallocate the
array, and any data it points to (recursively), unless it’s explicitly marked
as “static” (see Py_SLOT_STATIC below).
This means the interpreter typically needs to make a copy of all data
in the struct, including char * text.
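The copy-unless-static rule might look roughly like this for a name string. This is only a sketch: DemoSlot, DemoName, the helper names, and the DEMO_FLAG_STATIC bit value are all hypothetical, and a real implementation would use the interpreter's allocators rather than malloc.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_FLAG_STATIC 0x1u   /* hypothetical bit value */

/* Simplified stand-in for the proposed slot struct. */
typedef struct DemoSlot {
    uint16_t sl_id;
    uint16_t sl_flags;
    void *sl_ptr;
} DemoSlot;

/* A name kept by the consumer, with ownership recorded. */
typedef struct DemoName {
    const char *text;
    int owned;          /* nonzero if text was copied and must be freed */
} DemoName;

/* Keep the string from a slot: borrow it if marked static, else copy it,
 * since the caller may change or free the original after the call.
 * Returns 0 on success, -1 if the copy could not be allocated. */
static int
demo_keep_name(const DemoSlot *slot, DemoName *out)
{
    assert(slot != NULL && slot->sl_ptr != NULL && out != NULL);
    const char *src = slot->sl_ptr;
    if (slot->sl_flags & DEMO_FLAG_STATIC) {
        out->text = src;
        out->owned = 0;
        return 0;
    }
    size_t len = strlen(src) + 1;
    char *copy = malloc(len);
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, src, len);
    out->text = copy;
    out->owned = 1;
    return 0;
}

/* Release a kept name, freeing it only if it was copied. */
static void
demo_release_name(DemoName *name)
{
    if (name->owned) {
        free((void *)name->text);
    }
    name->text = NULL;
    name->owned = 0;
}
```

The static flag therefore acts as a promise from the caller that lets the consumer skip the allocation.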
Flags
sl_flags may set the following bits. Unassigned bits must be set to zero.
- PySlot_OPTIONAL: If the slot ID is unknown, the interpreter should
  ignore the slot entirely. (For example, if nb_matrix_multiply was being
  added to CPython now, your type could use this.)
- PySlot_STATIC: The contents of the slot (and all data it points to,
  recursively) are statically allocated. Thus, the interpreter does not need
  to copy the information.
  Implied for function pointers.
- PySlot_SIZED_ARRAY: sl_ptr points to an array, whose size is given
  in sl_array_size. Without this flag, arrays are zero-terminated
  (as with the existing Py_tp_members, for example).
  Must not be used for numbers or function pointers.
- PySlot_SKIP_IF_NULL: Skip this slot if its data is NULL/zero. Intended
  for templated or auto-generated slots.
  (The check will be done before type-specific handling, so all fields of
  the data union (sl_ptr, sl_int64, …) must be zeroed.)
- PySlot_HAS_FALLBACK: If the slot ID is unknown, the interpreter will
  ignore the slot. If it’s known, it should ignore subsequent slots up to
  (and including) the first one without HAS_FALLBACK.
  Effectively, consecutive slots with the HAS_FALLBACK flag, plus the first
  non-HAS_FALLBACK slot after them, form a “block” where the interpreter
  will only consider the first slot in the block that it understands.
  If the entire block is to be optional, it should end with a Py_slot_end
  with the OPTIONAL flag.
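The OPTIONAL and HAS_FALLBACK rules can be read as a small selection algorithm. The sketch below is one possible reading, not the reference behaviour: the flag bit values, the DEMO_* slot IDs, the DemoSlot struct, and the "known ID" predicates are all invented for illustration, and SKIP_IF_NULL is left out to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values and slot IDs, chosen only for this sketch. */
enum { DEMO_OPTIONAL = 0x1, DEMO_HAS_FALLBACK = 0x2 };
enum { DEMO_end = 0, DEMO_name = 1, DEMO_getattr = 20, DEMO_getattro = 21 };

typedef struct DemoSlot {
    uint16_t sl_id;
    uint16_t sl_flags;
} DemoSlot;

typedef int (*demo_known_fn)(uint16_t id);

/* Store the indices of the slots that would be applied in chosen[].
 * Returns how many were chosen, or -1 on an unknown required slot
 * or when chosen[] is too small. */
static ptrdiff_t
demo_select_slots(const DemoSlot *slots, demo_known_fn is_known,
                  size_t *chosen, size_t capacity)
{
    assert(slots != NULL && is_known != NULL && chosen != NULL);
    size_t n = 0;
    int skipping = 0;   /* inside a fallback block that already matched */
    for (size_t i = 0; ; i++) {
        const DemoSlot *s = &slots[i];
        if (s->sl_id == DEMO_end && s->sl_flags == 0) {
            break;                      /* plain terminator */
        }
        if (skipping) {
            /* The block ends with the first slot lacking HAS_FALLBACK;
             * that slot is skipped too. */
            if (!(s->sl_flags & DEMO_HAS_FALLBACK)) {
                skipping = 0;
            }
            continue;
        }
        if (s->sl_id == DEMO_end) {
            continue;                   /* flagged end slot: a no-op */
        }
        if (!is_known(s->sl_id)) {
            if (s->sl_flags & (DEMO_OPTIONAL | DEMO_HAS_FALLBACK)) {
                continue;               /* allowed to be unknown */
            }
            return -1;
        }
        if (n >= capacity) {
            return -1;
        }
        chosen[n++] = i;
        if (s->sl_flags & DEMO_HAS_FALLBACK) {
            skipping = 1;               /* ignore the rest of the block */
        }
    }
    return (ptrdiff_t)n;
}

/* Two imaginary interpreters: one knows getattro, one does not. */
static int
demo_known_new(uint16_t id)
{
    return id == DEMO_name || id == DEMO_getattr || id == DEMO_getattro;
}

static int
demo_known_old(uint16_t id)
{
    return id == DEMO_name || id == DEMO_getattr;
}
```

Given a name slot followed by a getattro slot with the fallback flag and a plain getattr slot, the "new" predicate selects the name and getattro entries, while the "old" one selects the name and getattr entries.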
Convenience macros
The following macros will be added to the API::

    #define PySlot_DATA(NAME, VALUE) \
        {.sl_id=Py_ ## NAME, .sl_ptr=VALUE}
    #define PySlot_FUNC(NAME, VALUE) \
        {.sl_id=Py_ ## NAME, .sl_func=VALUE}
    #define PySlot_SIZE(NAME, VALUE) \
        {.sl_id=Py_ ## NAME, .sl_size=VALUE}
    #define PySlot_INT64(NAME, VALUE) \
        {.sl_id=Py_ ## NAME, .sl_int64=VALUE}
    #define PySlot_UINT64(NAME, VALUE) \
        {.sl_id=Py_ ## NAME, .sl_uint64=VALUE}
    #define PySlot_STATIC(NAME, VALUE) \
        {.sl_id=Py_ ## NAME, .sl_flags=Py_SLOT_STATIC, .sl_ptr=VALUE}
    #define PySlot_END {.sl_id=0}

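To see what such macros expand to, here is a self-contained imitation that compiles without Python.h. It is a sketch with invented names: the Demo_* IDs and their values, DEMO_FLAG_STATIC, and the DemoSlot_* macros are hypothetical, and it assumes the intent is to paste a prefix onto NAME so that tp_name selects the tp_name slot ID. It relies on C11 anonymous unions and designated initializers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical slot IDs and flag value, for illustration only. */
enum { Demo_tp_name = 101, Demo_tp_repr = 102, Demo_tp_flags = 103 };
#define DEMO_FLAG_STATIC 0x1u

/* Stand-in for the proposed struct; ptrdiff_t approximates Py_ssize_t. */
typedef struct DemoSlot {
    uint16_t sl_id;
    uint16_t sl_flags;
    union {
        uint32_t sl_array_size;
    };
    union {
        void *sl_ptr;
        void (*sl_func)(void);
        ptrdiff_t sl_size;
        int64_t sl_int64;
        uint64_t sl_uint64;
    };
} DemoSlot;

/* Token pasting turns the short NAME into the full ID identifier;
 * the designator picks the union member matching the value's type. */
#define DemoSlot_FUNC(NAME, VALUE) \
    {.sl_id = Demo_ ## NAME, .sl_func = (VALUE)}
#define DemoSlot_INT64(NAME, VALUE) \
    {.sl_id = Demo_ ## NAME, .sl_int64 = (VALUE)}
#define DemoSlot_STATIC(NAME, VALUE) \
    {.sl_id = Demo_ ## NAME, .sl_flags = DEMO_FLAG_STATIC, .sl_ptr = (VALUE)}
#define DemoSlot_END {.sl_id = 0}

/* Placeholder with the generic function-pointer signature. */
static void
demo_repr(void)
{
}

static DemoSlot demo_slots[] = {
    DemoSlot_STATIC(tp_name, "mymod.MyClass"),
    DemoSlot_FUNC(tp_repr, demo_repr),
    DemoSlot_INT64(tp_flags, 1 << 4),
    DemoSlot_END,
};
```

Members not named in an initializer are zeroed, so DemoSlot_END yields an all-zero entry and the other entries get zero flags unless the macro sets them.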
New slot IDs
The following new slot IDs will be added:
- Py_slot_end (just a new name for zero)
  - With sl_flags=0, marks the end of a zero-terminated slots array.
  - With sl_flags=Py_SLOT_OPTIONAL, this slot is ignored.
- Py_slot_subslots: array of PySlot structures, treated as if
  they appeared at this point in the array. (XXX: HAS_FALLBACK blocks can’t span subslots?)
- Py_tp_slots: array of “legacy” PyType_Slot structures.
- Py_mod_slots: array of “legacy” PyModuleDef_Slot structures.
New slots will be added to cover existing members of PyType_Spec and
PyModuleDef:
- Py_tp_name (mandatory for type creation)
- Py_tp_basicsize (of type Py_ssize_t!)
- Py_tp_extra_basicsize (equivalent to setting PyType_Spec.basicsize to -extra_basicsize)
- Py_tp_itemsize
- Py_tp_flags
- Py_mod_name (mandatory for module creation)
- Py_mod_doc
- Py_mod_size
- Py_mod_methods
- Py_mod_traverse
- Py_mod_clear
- Py_mod_free
New slots will have unique numbers (that is, Py_slot_*, Py_tp_*
and Py_mod_* won’t share IDs).
Slots numbered 1 through 4
(Py_bf_getbuffer … Py_mp_length and Py_mod_create … Py_mod_gil)
will get new (larger) numbers.
The old numbers will remain as aliases, and will be used when compiling for
Python versions below 3.14.
Backwards Compatibility
Yes!
Forward compatibility too please!
Security Implications
None known
How to Teach This
Rewrite the “Extending and Embedding” tutorial to use this.
Reference Implementation
Not yet.
Rejected Ideas
Stuff that could be neat but is out of scope for this proposal:
- PyType_ApplySlots
- slots for adding constants (à la PyModule_AddIntConstant)
- module slot to create a type and add it to a module
Open Issues
Add yours.
Copyright
The usual.