Feedback on abstract for pyconfr? Universal Python extensions

I’m going to be controversial and comment on the abstract rather than the idea :)

Personally I think the abstract is reasonable. It’s obviously a talk where you aim to bring people round to your point of view, and I think the abstract is clear about what the topic is and what your aims are. The talk itself is where you’ll have to do the convincing…

I’m not sure that all the people you’d have to convince would necessarily be at PyConFR (but I don’t know that with absolute certainty). So it’s possible you may not have the right audience.

As others have pointed out, your aim here is to inspire other people to take action. So there’s clearly a high barrier to cross in terms of how convincing you’d need to be.

But personally I’d come to see the talk if I were attending PyConFR (I’m not, I’m afraid), and that’s mostly what the abstract has to do.

Let’s say that yes, I’d like to try to get some funding to bring the Python ecosystem to a better state, with a JIT-friendly API and universal wheels for most of its pillars. I think that, considering the potential impact of such a project, we could get corporate and institutional support.

But to get funding, we need (i) an official statement from the community that this would indeed be useful and (ii) a clear, practical technical strategy to reach our objectives, thought through by a group of competent people specialized in the CPython C API, Python implementations, tools creating extensions (like Cython) and package implementations. It would be a plus if the conclusions of such a group were validated by the Steering Council.

Then I (and actually many other people) could find the money.

Actually, I’m a bit surprised that the question of funding is such a big issue.

I mean, such a plan would actually cost a fairly limited amount of money. For example, with 2 million euros, one could already do a lot. Compared to the impact of the project and the usage of Python in public institutions and private companies, this is quite a small amount of money. I guess this is the order of magnitude of the total spent on Matlab licenses for French universities each year. This is less than one ERC grant.

I think we should worry less about money and more about how to build a strong project, with a technically realistic plan for success.

Really, these arguments (“Who is going to do the work?”, “You are asking other people to do the work”, “You should do the work yourself”) are a bit strange. Of course, I understand them, but it is also very sane to try to build projects, evaluate their feasibility and their impact, and then potentially try to find funding for them.

That’s because (as far as I am aware) those arguments don’t exist. The challenge here isn’t in the “good ideas are 10% inspiration” part of the problem, it’s in the “good ideas are 90% perspiration” part, which is why folks keep bringing up the question of funding.

The best write up I know of regarding the problems with the C API and potential ways to improve it would be @vstinner’s Python C API site: https://pythoncapi.readthedocs.io/

If you’re not already familiar with Victor’s activities, he’s one of the most active core developers in working towards making the stable ABI more readily usable. I’m not clear on how many of those contributions are on Red Hat work time vs his own personal time, but browsing the PEP index for Victor’s name is a decent way to find previous and proposed improvements related to this topic. @encukou’s name is another good one to search for. (While I’ve assisted with some of their related projects, my name isn’t a good one to search with for this purpose, since I’ve worked on even more unrelated PEP topic areas than Victor or Petr).

Finding a financial sponsor (beyond Red Hat, to the extent that Victor and Petr spent time on these problems on RH work time) that understands the nature of the work needed would indeed be genuinely valuable, as it isn’t a question of a single simple solution to one key technical problem, it’s a question of approaching individual third party projects that use the C API, getting to know the contributors, understanding why they’re not already using the stable ABI, and if they’re amenable to making the change, providing assistance to actually do the work, and then get it reviewed, merged, and released. This may sometimes involve going back to CPython to advocate for C API enhancements if genuine technical barriers to migration are discovered for particular projects.

If Oracle set things up believing they were facing a technical programming problem, and only discovered they were facing a social change management problem (with technical concerns mixed in) after they had already started, then I’m not surprised they decided the issue couldn’t be resolved the way they were trying to solve it.

Motivating multiple largely independent volunteer communities to collectively solve a problem is a very different task from directing a department of paid employees to do so, since every new interaction with a new community becomes an exercise in diplomacy without access to the lever of “We’re paying you for your time and can direct you to spend it as we wish”.

So I guess my key point is that the CPython core devs aren’t the people you have to convince. We disagree on the urgency of the problem, but we don’t really disagree on its existence or the nature of it. Trying to convince us thus feels like mistargeted effort (outside the potential question of an explicit statement from the PSF and/or the SC that this is an area where the PSF would be willing to help steward dedicated funding), since some of the core devs are already the most active people in doing the work to address the issue.

Edit: it belatedly occurred to me that the C API Working Group’s charter, together with its initial API review and catalogue of known problems is effectively the explicit acknowledgement you’re looking for.

Thanks for this interesting message.

It really depends: convinced of what? About an HPy-based solution and a universal ABI, I’m pretty sure that many CPython core devs are not convinced at all. I tend to think that most don’t know the benefits that HPy could bring and consider it a side project related only to PyPy and GraalPy, so not very interesting for most Python users. Of course, others know HPy, in particular @vstinner and @encukou, whom you mentioned! However, HPy is quite far from their remarkable work on the CPython C API. I mean that HPy does not solve the general issue of the CPython C API. The goal of HPy is not to replace the CPython C API.

It is interesting to see that the principle of JIT-friendly universal extensions is not mentioned in the PEP produced by the C API working group.

It’s not too surprising since the scope of this group is the evaluation and the evolution of the CPython C API. In particular, alternative C APIs for Python are out of scope.

Similarly, the idea that a lot of issues could be fixed or avoided through an external project has not been evaluated by the C API working group.

Unfortunately, this is not the case. The idea of JIT-friendly universal extensions with a debug mode, which Python interpreters should natively support, has not been discussed. Nor has the technical and social feasibility of implementing this feature.

I see three possibilities regarding this feature:

  • (most probable) No universal extensions. “Evolutionary” changes to the CPython C API, in particular improvements to the limited API. Numpy (and similar projects) might finally be able to use the limited API in a few years. Some packages might even ship specific wheels for alternative interpreters.

  • Universal extensions through the limited API. Is it possible? What is needed?

  • Universal extensions with a healthy and used HPy (and Cython/Pythran/PyO3 backends using HPy).

As far as I know, these different paths, the associated tasks, and their technical and financial feasibility have not been publicly thought through and evaluated.

This is where I find it hardest to follow your arguments. At times you seem to be arguing for “a universal API” and at other times you seem to be insisting that the solution must be HPy.

The core devs are convinced that a universal API is a good thing but it’s hard to evolve from the current C API to that goal. It’s being worked on, but it’s slow, and no-one yet knows what form the final API will take. On the other hand, the core devs are open to the possibility that HPy might be a viable universal API, but they are not willing to bet everything on that particular solution.

Are you only interested in HPy as the official universal API? If so, then you’re confusing the issue by talking in generalities.

I don’t think it’s reasonable to expect the core devs to support multiple C APIs. Supporting just one is plenty, thanks. So evolving the C API to be something more like HPy is definitely in scope, but adding HPy (or any other alternative API) as a supported API is out of scope, not for technical reasons but simply because of resourcing constraints.

Again, you don’t seem to be making it clear what you want. Do you genuinely want the core devs to support two C APIs for CPython? Because I think if that’s what you’re asking for, the response is pretty obviously going to be “sorry, we don’t have the resources for that, additional APIs will need to be externally developed and supported”. Which is the model HPy was built around, and it appears that even supporting the one API, HPy, is more work than the project can sustain. So I don’t see why you think the core devs have that many resources to commit to supporting HPy.

I don’t think I ever wrote anything about “a universal API”. I wrote about a JIT-friendly universal ABI, which is a very interesting and useful idea introduced by HPy, but which I guess could be implemented through an evolved limited API (with ideas taken from HPy).

My sentence “About an HPy-based solution and a universal ABI, I’m pretty sure that many CPython core devs are not convinced at all.” was a response to @ncoghlan, who wrote “So I guess my key point is that the CPython core devs aren’t the people you have to convince”. Anyway, this does not seem very important.

The subject is complicated, but I think I’m starting to be less unclear. Thanks for giving me the opportunity to try to be clearer.

My point is that it is important for Python (the language) to have, in a few years, an official JIT-friendly universal ABI (like the HPy universal ABI), which of course CPython would support natively.

As for the details of how we should get there (in particular with HPy or with an improved limited API), I think this should be studied by a dedicated working group (gathering in particular CPython, HPy, PyPy, GraalPy, Cython and Numpy people). With a clear and serious plan, I think we will then be able to find funding and support, since this feature would radically change the experience of Python users, package maintainers and developers of Python interpreters.

It seems to me that the most natural and simple way to get the desired result (an official JIT-friendly universal ABI natively supported by CPython) would be a healthy and widely used HPy. CPython would just need to include a bit of HPy code to be able to natively import HPy universal extensions. So in this case CPython would not need to “support two C APIs”.

However, I don’t know enough to say which would be better, so it really makes sense for a dedicated working group to study this and give its conclusions. In any case, we need this step to get funding.

Sorry, my bad. I don’t really understand what you mean by a “JIT-friendly universal ABI” and as a result I muddled the terminology. Assume I meant “JIT-friendly universal ABI”.

OK, given that you’ve been explicit, please clarify what a “JIT-friendly universal ABI” is. Specifically, what about the existing stable ABI (which I think is “universal”, but without knowing what you mean by that term I can’t be sure) isn’t “JIT-friendly”, and why would you think the core devs aren’t interested in making it JIT-friendly if it isn’t, given that CPython has a JIT?

The rest of your post seems fine to me. Good luck in setting up such a working group.

Final bullet point in the working group’s charter:

Alternative C API designs are very much in scope for the working group.

The current active discussion is around how to build a next generation ABI into the reference interpreter without breaking abi3, as that capability is going to be critical to having a CPython runtime that maintains support for its current concrete ABI while also gaining support for a new more abstract one.

I’d personally assume most of the core devs don’t even think about the problem (there are plenty that are only interested in the Python APIs and leave the C API to those of us that are interested in it). Fortunately, as long as the C API working group are in favour of a proposal, they have sufficient collective influence to be able to make things happen (note that Petr and Victor are two of the founding WG members).

Framing the problem as helping the WG to formulate their pitch to everyone else (whether within CPython or within third party extension module projects) would likely be a decent way to go, though.

Edit to cover some additional specific points:

Yes, it has. That’s why pythoncapi-compat (python/pythoncapi-compat on GitHub), which can be used to write a C extension supporting a wide range of Python versions with a single code base, exists under the main CPython GitHub org rather than being a third party project. HPy remains independent, but that’s because the bits that are more tightly coupled to CPython can live in the compatibility library rather than directly in HPy, not because of any major objections to the way HPy works.

I didn’t say it needed to support two APIs, I said it needed to support two ABIs. Packaging tools need to be able to determine when a runtime should accept abstract ABI wheels, and the runtime needs to be able to import extension modules using the abstract ABI.

Having a way to expose just the API elements needed to support the abstract ABI is then a build time convenience to help ensure an extension module isn’t inadvertently relying on any exported symbols that aren’t part of the abstract ABI.
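To make the packaging side of this concrete, here is a minimal sketch of how a tool might decide whether a runtime should accept a wheel based on its ABI tag. The wheel filename layout (name, version, optional build tag, then python, abi and platform tags) is the standard one, but the `abstract1` tag, the function names and the set of supported ABIs are all assumptions made up for illustration; no such tag has been standardized, and the python tag is deliberately ignored here.

```python
from typing import NamedTuple

# Hypothetical ABI tag for the "abstract ABI"; not a real, standardized tag.
ABSTRACT_ABI_TAG = "abstract1"


class WheelTags(NamedTuple):
    python: str
    abi: str
    platform: str


def parse_wheel_tags(filename: str) -> WheelTags:
    """Extract the python/abi/platform tags from a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # Layout: name-version[-build]-python-abi-platform
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    return WheelTags(*parts[-3:])


def runtime_accepts(tags: WheelTags, supported_abis: set[str], platform: str) -> bool:
    """Return True if the runtime supports one of the wheel's ABI and platform tags.

    Simplification: the python tag is not checked at all.
    """
    # Tags may be compressed sets separated by dots, e.g. "cp312.cp313".
    abi_ok = any(abi in supported_abis for abi in tags.abi.split("."))
    platform_ok = any(p in (platform, "any") for p in tags.platform.split("."))
    return abi_ok and platform_ok


if __name__ == "__main__":
    wheel = f"demo-1.0-py3-{ABSTRACT_ABI_TAG}-linux_x86_64.whl"
    tags = parse_wheel_tags(wheel)
    # A runtime that only knows its concrete ABIs would reject the wheel...
    print(runtime_accepts(tags, {"cp313", "abi3"}, "linux_x86_64"))
    # ...while one that also advertises the abstract ABI would accept it.
    print(runtime_accepts(tags, {"cp313", "abi3", ABSTRACT_ABI_TAG}, "linux_x86_64"))
```

The point of the sketch is only that "supporting two ABIs" shows up on the packaging side as one more entry in the set of ABI tags a runtime advertises; the import machinery for such modules is a separate piece of work.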

I’ll admit to sharing @pf_moore’s confusion about what it means for an ABI to be “JIT friendly”, though. The question of ABI compatibility with moving GCs has definitely come up, but I can’t recall any concerns specific to JIT’ed interpreters (unless it’s just “No function pointers in the API”, which is a variation on the moving GC concern)

Great! The “new abstract ABI” is very interesting.

A new abstract ABI could be a huge step for Python! The project to have it and to change the ecosystem with this new feature should be funded and supported.

It seems to me that a plan like the one I tried to describe in post #26 of this thread (by paugier) could be useful.

A dedicated working group should describe why it is useful, what is planned, how the different projects (new CPython abstract ABI, HPy, …) are related, what has to be done at different levels of the ecosystem, etc. It will then be much easier for people to understand the general project, to gather funding and support, and more generally to know what they can do to help. A wider WG with people from different projects seems interesting.

The most important aspects are the ones you cited: in particular, not assuming any implementation details, not even reference counting, which gives compatibility with moving GCs (so, I guess, handles on the API side).
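As a toy illustration of why handles help with moving GCs, here is a small Python model (not real C API or HPy code). Extensions only ever hold opaque integer handles, so the runtime is free to relocate objects as long as it updates its own handle table. The class and method names are hypothetical; `dup` and `close` only loosely echo the handle discipline of handle-based APIs.

```python
class HandleTable:
    """Toy runtime: objects live at 'addresses', extensions only see handles."""

    def __init__(self):
        self._heap = {}       # address -> object
        self._handles = {}    # handle -> address
        self._next_addr = 0
        self._next_handle = 1

    def new(self, value) -> int:
        """Allocate an object and return an opaque handle to it."""
        addr = self._next_addr
        self._next_addr += 1
        self._heap[addr] = value
        return self._open(addr)

    def _open(self, addr: int) -> int:
        handle = self._next_handle
        self._next_handle += 1
        self._handles[handle] = addr
        return handle

    def dup(self, handle: int) -> int:
        """Return a second, independent handle to the same object."""
        return self._open(self._handles[handle])

    def close(self, handle: int) -> None:
        """Release a handle; the object is collectable once no handle is left."""
        del self._handles[handle]

    def deref(self, handle: int):
        return self._heap[self._handles[handle]]

    def compact(self) -> None:
        """Simulate a moving GC: relocate live objects, drop unreachable ones."""
        relocation = {}
        new_heap = {}
        for old_addr in sorted(set(self._handles.values())):
            new_addr = self._next_addr
            self._next_addr += 1
            new_heap[new_addr] = self._heap[old_addr]
            relocation[old_addr] = new_addr
        self._heap = new_heap
        # Only the runtime's table changes; handles held by extensions stay valid.
        self._handles = {h: relocation[a] for h, a in self._handles.items()}


if __name__ == "__main__":
    table = HandleTable()
    kept = table.new("kept")
    dropped = table.new("dropped")
    table.close(dropped)
    table.compact()
    print(table.deref(kept))
```

With raw object pointers, the relocation in `compact` would invalidate every address an extension had stored; with handles, nothing on the extension side needs to change, and liveness is tracked through open handles rather than through a reference count field the extension can see.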

Another very interesting aspect for performance is the possibility of avoiding boxing/unboxing of small objects (see for example “typed functions” as described in https://dl.acm.org/doi/abs/10.1145/3652588.3663316). Of course, one needs a quite advanced Python JIT to be able to use this!

I agree that the qualifier “JIT friendly” is not perfectly exact technically (or rather, it depends on what one means by “JIT”), but I think it is very useful for communication. It points to a problem of the current C API that has bad practical consequences for Python users (it is very unfriendly to existing efficient Python JITs), at a time when an efficient JIT is also expected for CPython. People who want a more efficient CPython should support a JIT-friendly, universal and abstract ABI.

The immediate motivation for sorting out an updated stable ABI is the free-threaded builds (see the discussion “Stable ABI/Limited API for free-threaded builds”).

That isn’t a transition to a full handle API, but it does involve making object layouts opaque, which is at least friendlier to having the API pointer values represent object handles instead of object instances.