PEP 793 – PyModExport: A new entry point for C extension modules

Ah no please, don’t take it further than I intended :slight_smile: It was meant as self-control, not literally, but then again these are strange times to be online, so I’ll make it more neutral from now on.

No we do see the benefits, and it’s something we want. I am not familiar with this forum but we don’t mind a bit of push.

It just needs to happen without too much disruption and back-and-forth. The maintenance work to keep things compatible across the product (Arch × compilers × PyVersion) is already massive for us, and we are hoping that there is consensus at every level. The consensus would be more beneficial to you folks than it is to us.

As an organization with Steering councils, and whatnot, I take no pride in reminding that train has sailed :wink: So expectations are high and rightly so in my opinion.

But like I said no harm done, from SciPy’s perspective we would be happy to implement stable API and its implications on C extension modules.

I’ve updated PEP 793 based on the discussions:

  • Remove the hook’s spec argument (see C API WG discussions)
  • Add const to arguments we don’t change (as requested by Antoine)
  • Show token use in the example (multiple people thought it’d help you find a module object, rather than check you have the right one)
  • Add rejected idea about changing PyModuleDef to no longer be a PyObject.

About the rejected idea: I tried implementing it, and I think it’s a hard-to-support hack. But a part of my attempt now lives in PyModuleDef_Init, as a sanity check that we can remove at any time.

So far I have:

  • Scientific Python maintainers:

    • NumPy (Matti Picus on the mailing list): If I were still developing PyPy, I would groan at the need to support yet another new C-API with its attendant corner cases, but as a NumPy developer I will just go with the flow and do whatever is recommended by the CPython core team. Module initialization is not a prime pain point of the NumPy project right now, and I hope it will not become so due to interpreter changes.

    • SciPy (Ilhan Polat here: asked on their discourse): “From SciPy point of view, this is not much of a design decision we need to make, just as the multi-phase initialization discussion lead to, if you folks say jump, we’ll jump since we typically don’t know the implications (yet) and so far we have been copy pasting what is provided to us.”

    • pandas: I did not ask yet

    • PyArrow: Antoine Pitrou: defers to Cython

  • Other core developers outside the C API WG

    • see Stefan Behnel, Antoine Pitrou.
    • at the core dev sprint we asked core devs (unfortunately I don’t have exact quotes):
      • @eric.snow (who’s been involved in the affected area of the import machinery lately): We should do this cleanup regardless of the free-threading benefits.
      • @colesbury the free-threader: No strong opinion between this and changing PyModuleDef (what I now list as a rejected idea in PEP 793)
  • Binding generator maintainers

If I may summarize (though I may be biased): compared to either Stable ABI or free-threading, this API isn’t the hard part :‍)

I might add that Steve – the dissenting voice in the C API WG – is writing his own PEP that currently requires PEP 793.

And will continue to require it, because it’s a good design (and better with the latest changes).

My PEP also schedules it to be added at the same time as we have an entire long-term stable ABI for users to migrate to, so that users can migrate all at once, with absolutely no pressure to migrate now even though it doesn’t buy them anything (and also without putting pressure on us to maintain yet another way to load modules).

I’ll add my late $2c with my NumPy, SciPy and PyWavelets hat on (aligned with what Matti and Ilhan said): we can pretty easily adapt to the proposed changes, and are happy to do so given the benefits. I also don’t have a preference between the two competing proposals - some disruption seems fine, as long as we get to the end goal here.

Your efforts on Stable ABI improvements are much appreciated by the way. Partly prompted by this discussion thread we had some comments on this on the SciPy issue tracker as well. As a result I wrote up a summary at Tracking issue: Stable ABI support · Issue #23791 · scipy/scipy · GitHub. Surprisingly (to me at least) we don’t seem that far off from being able to use the Stable ABI in SciPy, which would be a massive win for maintainability.

On behalf of the Steering Council, I’m pleased to announce that we have decided to accept PEP 793!

Thank you @encukou for the thorough work on this PEP and for engaging extensively with the community, the C API Working Group, and extension maintainers throughout the process.

Congratulations!

The Python Steering Council

Thank you!

A PR is up for review at #140556.

I’m in the middle of implementing PEP 793 in PyO3 and have hit a wrinkle.

According to PEP 793 (peps.python.org), support for the inittab was deferred.

However, PyO3 uses and tests the inittab, so PyO3 would ideally like to have a function that allows appending modules to the inittab using just an array of slots or something along those lines.

Here’s my draft PyO3 PR, which has some additional context and shows how I’m using the new APIs: Use PyModExport and PyABIInfo APIs in pymodule implementation by ngoldbaum · Pull Request #5753 · PyO3/pyo3 · GitHub

I don’t think this should be a huge blocker for you (although I do think the inittab functions are worth adding).

It’s fine to export both the old and new init functions - for normal module import the new functions will get used preferentially, but you can leave the old ones around for inittab usage.
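
To illustrate the lookup order described above, here is a minimal, self-contained C sketch. It is a simplified model and not CPython's import code: `hook_kind`, `has_symbol`, `pick_hook`, and the sample symbol tables are hypothetical names invented for this example, and a real loader resolves symbols from a shared library rather than from an array of strings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Which entry point the modelled loader would call (hypothetical names). */
typedef enum { HOOK_NONE, HOOK_EXPORT, HOOK_INIT } hook_kind;

/* Sample "exported symbol" tables standing in for a shared library. */
static const char *const both_hooks[] = {"PyInit_spam", "PyModExport_spam"};
static const char *const init_only[] = {"PyInit_spam"};

/* Return 1 if `wanted` is among the n exported symbol names. */
static int has_symbol(const char *const *symbols, size_t n, const char *wanted)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(symbols[i], wanted) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Prefer the new export hook; fall back to the legacy init function. */
static hook_kind pick_hook(const char *const *symbols, size_t n,
                           const char *modname)
{
    char buf[256];
    assert(modname != NULL);
    snprintf(buf, sizeof buf, "PyModExport_%s", modname);
    if (has_symbol(symbols, n, buf)) {
        return HOOK_EXPORT;
    }
    snprintf(buf, sizeof buf, "PyInit_%s", modname);
    if (has_symbol(symbols, n, buf)) {
        return HOOK_INIT;
    }
    return HOOK_NONE;
}
```

With both hooks exported, `pick_hook(both_hooks, 2, "spam")` selects the export hook, while the `PyInit_spam` symbol stays available for callers (such as the inittab) that still take a legacy init function.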

The inittab is mainly used for embedding Python and the Stable ABI is usually a bit pointless there (since you’re building and linking to an exact interpreter).

It’s possible that there’s some Rust-specific stuff I’m missing from the above.


Long-term this should be supported though: if the new interface is the preferred one and new features will only go into it, then those features should be usable with the inittab.

Inittab API is in the Possible Future Directions section of the PEP. I plan to get to it.

“It’s fine to export both the old and new init functions - for normal module import the new functions will get used preferentially, but you can leave the old ones around for inittab usage.”

This almost works, but I’m hitting an issue that maybe wasn’t foreseen?

In a PyO3 test that uses the inittab, I hit a SystemError inside CPython:

---- src/lib.rs - doc_test::guide_pfr_calling_existing_code_md (line 625) stdout ----
Test executable failed (exit status: 1).

stderr:
Error: PyErr { type: <class 'SystemError'>, value: SystemError('module foo: Py_mod_name used with PyModuleDef'), traceback: Some("Traceback (most recent call last):\n  File \"<string>\", line 1, in <module>\n") }

This comes from here:

I think maybe that check should be relaxed, and only object if e.g. original_def->m_name is non-NULL when Py_mod_name is defined.

@encukou does that seem reasonable?

If that’s not possible, I have a workaround in PyO3. If I define two sets of slots - one for PyModExport and one for PyModuleDef, then it works. But it’s kind of ugly. I guess we can clean this up though if a future Python version adds an API for adding modules to the inittab without using the legacy module initialization hook. See this commit if you’re curious what the workaround looks like.

I’d rather look into adding a slots-based inittab API, and keeping that check. It looks like the inittab could use other improvements as well.

Py_mod_name is currently optional. As a workaround, you should be able to leave it out.

Another workaround I’ve been doing in multiple-ABI experiments is organizing the array to put the “new” slots first, and giving m_slots an offset so that it points where the “old” slots start. It’s rather cumbersome with C initializer syntax, but you might have better luck.
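
A rough sketch of that array layout, in plain C so it compiles without the CPython headers. Everything here is a stand-in: `demo_slot`, the `SLOT_*` values, `NEW_ONLY_COUNT`, and the helper functions are hypothetical names made up for illustration, not the real `PyModuleDef_Slot` or slot IDs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for slot IDs; the numeric values are made up. */
enum { SLOT_END = 0, SLOT_EXEC = 1, SLOT_NAME = 2, SLOT_DOC = 3 };

/* Hypothetical stand-in for a slot entry (ID plus value). */
typedef struct {
    int slot;
    const void *value;
} demo_slot;

/* One array: slots only the new entry point accepts come first,
 * followed by the slots that both entry points accept. */
static const demo_slot all_slots[] = {
    {SLOT_NAME, "spam"},
    {SLOT_DOC, "An example module"},
    /* the "old" slots start here, at index NEW_ONLY_COUNT */
    {SLOT_EXEC, NULL},
    {SLOT_END, NULL},
};
enum { NEW_ONLY_COUNT = 2 };

/* What the new export hook would return: the whole array. */
static const demo_slot *new_style_slots(void)
{
    return all_slots;
}

/* What the legacy def's m_slots would point to: the shared tail. */
static const demo_slot *old_style_slots(void)
{
    return all_slots + NEW_ONLY_COUNT;
}

/* Count entries before the terminator. */
static size_t count_slots(const demo_slot *s)
{
    size_t n = 0;
    assert(s != NULL);
    while (s[n].slot != SLOT_END) {
        n++;
    }
    return n;
}

/* Return 1 if slot `id` appears before the terminator. */
static int has_slot(const demo_slot *s, int id)
{
    assert(s != NULL);
    for (; s->slot != SLOT_END; s++) {
        if (s->slot == id) {
            return 1;
        }
    }
    return 0;
}
```

The legacy view never sees the name and doc entries, so nothing is duplicated. The cost is that `NEW_ONLY_COUNT` has to be kept in sync with the array by hand.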

It seems to be awkward in Rust as well. How much value is this check providing? I assume it’s trying to catch people using the API wrong. If COPY_COMMON_SLOT just skipped these slots it would make the API more friendly to use. At least it seems to me.

I’m proposing a solution in PEP 820. Sadly it’d be more work for you (sorry!), but it’d make things easier in the long term.
With that proposal, you’d have two sets of slots, but without duplication: the “new” slots array can include/reference the “common” ones. In C:

static PySlot myModule_slots[] = {
   PySlot_STATIC(Py_mod_name, "..."),
   PySlot_STATIC(Py_mod_doc, "..."),
   PySlot_STATIC(Py_mod_slots, my_slots_with_no_name_or_doc),
   PySlot_END,
};

If PEP 820 doesn’t make it, then it’ll make sense to file off the edges in this API.


You could ask the same about the Py_mod_name slot itself. The import machinery takes the name from a spec. The slot is there to make the module definition self-describing, and for future uses (making a module without a full spec), or for other (non-CPython) tools.

If there’s any value in including the name, I think there’s value in making sure it’s specified unambiguously.

I think you made a mistake, and I might not have reconstructed the suggestion you meant to make.

If the user provided a PyModuleDef (called original_def here), then its original_def->m_name should always be non-NULL. In your code it’s set to name.as_ptr():

        let ffi_def = UnsafeCell::new(ffi::PyModuleDef {
            m_name: name.as_ptr(),
            m_doc: doc.as_ptr(),
            // TODO: would be slightly nicer to use `[T]::as_mut_ptr()` here,
            // but that requires mut ptr deref on MSRV.
            m_slots: slots_with_no_name_or_doc.0.get() as _,
            ..INIT
        });

Did you mean to always skip the check? That’d be possible, but I’d rather avoid it: it means CPython needs to choose which source of info has priority.[1]


  1. or it doesn’t choose, and it becomes an implementation detail that eventually ossifies to an undocumented quirk we still need to preserve “forever” ↩︎

I don’t think I ever committed the version of PyO3 where I triggered the SystemError.

If you take a look at this branch, you should be able to trigger the SystemError with:

cargo test --test test_append_to_inittab

You’ll need to have a build of Python 3.15 in your PATH. I see output like this:

     Running tests/test_append_to_inittab.rs (target/debug/deps/test_append_to_inittab-88429a9d987dcc25)

running 1 test
Traceback (most recent call last):
  File "<string>", line 2, in <module>
SystemError: module module_fn_with_functions: Py_mod_name used with PyModuleDef
test test_module_append_to_inittab ... FAILED

In this branch, m_name and m_doc are only set for builds targeting Python 3.14 and older, but I still hit the SystemError.

I’ll try to take a look at PEP 820 to see if it helps for Rust. I see it uses macros to accomplish what it’s doing, and those are sometimes tricky to wrap.

Your suggestion was to “only object if e.g. original_def->m_name is non-NULL when Py_mod_name is defined”. But if (original_def->m_name) (your suggestion) and if (original_def) (the current CPython code) should be equivalent.
Did you mean to always skip this check?

I don’t think I’m set up properly. I get a bunch of warnings and a linker error :‍(

Yeah, for struct initialization. They’re pure syntax sugar; feel free to omit them if Rust’s syntax is ergonomic enough.

I don’t understand this. How are they equivalent? From my reading of moduleobject.c, original_def is only NULL if execution enters the code path using the new FromSlotsAndSpec API.

In my case, I’m using the legacy PyInit hook, which is getting called by PyImport_AppendInittab, and original_def is not NULL, but original_def->m_name is NULL, along with m_doc.
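
The gap between the two conditions can be modelled in a few lines of C. This is a simplified sketch and not the code in moduleobject.c: `demo_def`, `rejects_current`, and `rejects_relaxed` are hypothetical names, and `has_name_slot` stands for "a Py_mod_name entry was found in the slots array".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two PyModuleDef fields that matter here. */
typedef struct {
    const char *m_name;
    const char *m_doc;
} demo_def;

/* Model of the check as described in this thread: a name slot is an
 * error whenever any def is present. Returns 1 when it would raise. */
static int rejects_current(const demo_def *original_def, int has_name_slot)
{
    return has_name_slot && original_def != NULL;
}

/* Model of the relaxed check suggested earlier: only object when the
 * def also carries its own name. Returns 1 when it would raise. */
static int rejects_relaxed(const demo_def *original_def, int has_name_slot)
{
    return has_name_slot && original_def != NULL
           && original_def->m_name != NULL;
}

/* The three situations discussed in the thread. */
static const demo_def def_with_name = {"spam", "docs"};
static const demo_def def_without_name = {NULL, NULL};
```

The two predicates agree when there is no def (the slots-only path) and when the def has a name. They differ only for a def whose `m_name` is NULL, which is the inittab case described above.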

You should be able to reproduce the SystemError I’m seeing in a pure C extension that defines a PyModuleDef with m_name and m_doc set to NULL and with a slots array that contains Py_mod_name and Py_mod_doc entries, and a PyInit module initialization hook function, and then calling PyImport_AppendInittab, passing it the module initialization function.

Ah, I see. The confusion was on my side. Sorry for that!
Will send a PR tomorrow.

I started with allowing m_name=NULL, but allowing invalid state tends to gum up the machine.
Allowing PyModuleDef fields (not just name, but e.g. state size) to not match the resulting module breaks a few pieces of code I could find. More importantly, I’m not sure I could find all of them in CPython, let alone any other user that comes in contact with PyModuleDef.

So, let’s keep the PyModuleDef correct: the relevant slots may appear, but they must have the same value as the corresponding def field.
Does that work for you?
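
As a rough illustration of that rule, here is a small C model. It is a sketch under assumptions and not the code from the PR: `demo_def` and `name_slot_ok` are hypothetical names, and comparing string contents with `strcmp` (rather than comparing pointers) is an assumption about what "the same value" means.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the PyModuleDef field that matters here. */
typedef struct {
    const char *m_name;
} demo_def;

/* Sample defs used to exercise the rule. */
static const demo_def def_spam = {"spam"};
static const demo_def def_unnamed = {NULL};

/* Return 1 if the combination is acceptable under the modelled rule,
 * 0 if it would be rejected. `slot_name` is NULL when the slots array
 * has no name entry. */
static int name_slot_ok(const demo_def *def, const char *slot_name)
{
    if (slot_name == NULL) {
        return 1;   /* no name slot: nothing to cross-check */
    }
    if (def == NULL) {
        return 1;   /* slots-only module: the slot is the only source */
    }
    if (def->m_name == NULL) {
        return 0;   /* assumption: a def without a name cannot match */
    }
    /* assumption: "same value" means equal string contents */
    return strcmp(def->m_name, slot_name) == 0;
}
```

Under this model a def may repeat its name in a slot, but it can never disagree with it, so there is no question of which source takes priority.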

I sent this as PR #144340.
