Leaky abstractions and CPython performance

Background

Historically, CPython has exposed most if not all of the structures definitions used by the interpreter/runtime as public structures rather than opaque structures which should have been implementation details. It was with recent efforts to organize the header files in three parts:

  • Limited C API
  • Public C API
  • Internal C API

This post is all about Public C API and not about Limited C API & stable ABI.

Motivation

This post is mostly inspired by lazily compute line numbers and efficient implementation of integers . Specifically, modifying the structures of builtin objects for better performance or reduced memory usage.

Taking lazily compute line numbers as example of this, I proposed to change the implementation of the exception handling to defer the creation of traceback objects for most exceptions and make it lazy more details. The PR defers the computations of the line numbers of the traceback objects from frame objects. This change alone provides a 25% speed up when handling exceptions. There will be more speedup once we implement the whole idea of lazy tracebacks.

There were concerns about backwards compatibility as with this change, the tb_lineno of traceback objects will be computed lazily. The issue is that the PyTracebackObject structure is exposed in the public header file so can be regarded as a public structure.

However I have some objections to this

  • The structure is not documented at all. There is no mention of this structure in the official documentation
  • The structure is not used as an argument to any of the C API functions.
  • There are no comments in the header file about what the C fields mean or represent.

@markshannon proposed to change the name of the C level field, providing the C API function and document this is in porting to 3.12. I agree with this and seems the best way forward, however I created this post to get agreement on what is the rules on modifying undocumented structures. There are discussions around changing the long object implementation to reduce memory usage and use a tagged pointer or two structures etc but when that would happen there would be same concerns about backwards compatibility so it seemed worth discussing and standardizing to me.

So here is my proposal about modifying undocumented structures inspired by @markshannon proposal.

  • First document the change in the what’s new entry and provide code snippets for the change.
  • If the field is widely used, then consider adding a C API function for it else provide a Python level equivalent code in what’s new entry.
  • If we do end up providing C API function, if possible provide a compatibility shim for it in pythoncapi-compat to ease the code bases supporting old Python versions.
  • Change the C level field name.

In this particular case a 25% or more performance improvement in exception handling which is a core feature of the language feels significant to me and seems worth it.

Consider sharing your thoughts on this. Thanks!

2 Likes

Please also request a PEP 387 (Backwards Compatibility Policy) exception from the Steering Council. It might be a straightforward one to grant, but please don’t set a bad precedent by skipping that step. Consistency on determining things like what’s “widely used” is a big reason we have a SC.

For the proposal: Another option is adding API for all fields (and constructor, perhaps?), and making the struct fully internal, so users can switch at once.
Adding API for all fields and then only changing+renaming one would limit breakage now and make future breaking changes easier.

Okay, since we are requesting SC for a grant then moving traceback objects to internal API and providing API functions sounds better. I’ll create a GitHub issue and check with Cython and others.

IMO, it shouldn’t matter whether a structure is documented or not. What matters is how much real-world stuff breaks when you change something, and kind of transition plan you can come up with.

In the case of PyTracebackObject, I know jinja2 used to mess with the internal fields directly – though I think they stopped recently, so it’s probably fine here? But if we’re talking about our general policy for making changes, I think “does this break jinja2?” is a much more important question than “is this structure layout documented?”.

4 Likes

Jinja2 sets tb.tb_next instead of messing with ctypes, and uses CodeType.replace instead of building new code objects. So we’re only using documented Python interfaces now.

Jinja2 and Werkzeug both have code that inspects and changes tracebacks for debugging purposes. It’s easy to run our test suite if you want to test a change that might affect those interfaces.

I don’t want to hold up any improvements up on our account. As long as there’s a deprecation warning and I can figure out code to continue to support both versions of something until we drop the old one, I’m fine with Python removing things.

2 Likes

I don’t think we’re ready to define a standard policy for these kind of changes yet. I do agree with Nathaniel that “what breaks in practice” is more important than “was it documented”, though I assume jinja2 is just a random example that Nathaniel happened to think of, not literally the thing to check for. :slight_smile: A slightly less random example might be Cython, which does a lot of C-level hackery for performance, and is a fundamental dependency of most of the Scientific Python ecosystem.

I agree that PEP 387 is important, as is the ABI (if anything is in the ABI it must not change). I’m not sure whether you are really planning to request an exemption from the SC about the full C-level definition of traceback objects – if you do, what kind of documentation do you plan to provide to back up your request?

Lastly – can you edit your messages to make the hyperlinks to GH issues more readable? The default (?) text you pasted from GitHub has a lot of boilerplate besides the issue number and title, and that makes the messages harder to read.

1 Like

Sadly that’s discord trying to improve what people do and automatically replacing link text with page title fetched from the resource. I don’t know if that can be overriden, or if we must avoid inline links in this forum.

3 Likes

This is getting off-topic, but let me try this:

Here is a link to GH-95238

It seems to work with explicit markup (i.e., [text](link)).

2 Likes

I already said that before creating a issue on steering council I will contact Cython which may be affected on the GitHub issue to measure the impact. Also if the proposal is accepted and Cython is found to be affected I will fix it ASAP, I take the responsibility of fixing it.

Regarding the standard policy, IIRC there was a proposal about semi stable APIs but there doesn’t seems to be much progress on it otherwise I would have moved it to the “semi stable” API. As for today, no such thing exists so moving to the internal API and providing C API functions seems the best way forward.

To avoid confusion I already provided in background that this does not affects stable ABI so it would be better to avoid it in this discussion as it is irrelevant.

Do you see any issues with that? My next plan is to avoid materializing frame objects for tracebacks and that would require making tb_frame field lazy so getting an exemption for the whole structure seems better. Regarding documentation, the struct would be documented as opaque and we will provide getter function for the fields. I will properly document this in porting to 3.12. Feel free to share any other ideas you have.

Done.

I still recommend that we should try hard to keep our code backwards compatible with Cython whenever possible or feasible. Projects often ship generated C files in their source distribution and do not auto-generated new C files by default. This behavior is recommends by Cython:

It is strongly recommended that you distribute the generated .c files as well as your Cython sources, so that users can install your module without needing to have Cython available.

It is also recommended that Cython compilation not be enabled by default in the version you distribute. Even if the user has Cython installed, he/she probably doesn’t want to use it just to install your module. Also, the installed version may not be the same one you used, and may not compile your sources correctly.

That means most users won’t get the fix unless a project releases a new version with updated C files. Or users jump through additional hoops to install Cython and force each project to re-generate the files. The Cython docs do not recommend a single option or env var to force re-generation. Projects typically have different approaches… (yeah, it’s annoying).

1 Like

Just to repeat what I said on the Github issue - I don’t think the tb_line change should affect Cython so please don’t spend too long considering the implications of breaking Cython (specifically) with it.

Making tb_frame lazy would require an update to Cython (although it looks like a fairly small one - 2 lines). I’d be very surprised if you got to 3.12 without needing an update to Cython somewhere so it probably isn’t worth worrying too much about easily fixed stuff.

I recall Victor Stinner had tool for scanning top PyPI projects (or something similar?) to try to who uses what APIs, so that might be worth a look? (Obviously there’s a whole world of closed-source stuff that it doesn’t cover, but it’s a useful first pass)

I think the next step is for Kumar to draft something about the C-level traceback structure to the SC. @kumaraditya303 I’d be happy to help.

From the SC notes that were just posted in draft (PR) form on the SC repo it seems they approved PEP 689 (unstable API) back in May but are waiting for some edits before it is labeled as accepted. So we could propose to move the traceback struct into the unstable API.

2 Likes

There’s been a bunch of discussion about it on the SC. @encukou was/did bring it back to python-dev to discuss naming.

I thought I’ve deferred PEP 689 for now, but I see it’s part of a bigger PR that’s stuck. I opened #2769 with a bit of explanation.
Basically, the edits the SC requested sound trivial, but point to an issue I think should be solved first. Opening that conversation is on my TODO list… for a while now.

Okay, I assume that means “unstable APIs” are out of question now as the PEP is being deferred (for now).

Thanks for your help!

The PEP is being deferred for now so moving the structure to internal API seems the only way possible.

Okay, well making the details of the traceback at the C level internal doesn’t sound too bad.

To me, adding accessors & hiding the struct itself sounds like a good thing to do, even if/when we have the unstable API tier. Opting in to unstable API would require extensions to changes their code anyway, switching from member access to functions isn’t much worse.

Well, adding getter (and setter) functions doesn’t automatically make an API stable (consider the API for creating new code objects). And it adds complexity to the implementation (sometimes also to the caller) so it isn’t automatically what I would try first.

However, getters are essential when something must be computed lazily, as is the case here.

In the end, API evolution will always be a tricky thing in C.

Disclaimer: this comment doesn’t reflect my opinion or advice on how we should proceed neither a core dev nor as a steering council member.

As a datapoint: seems that it will be tensorflow. blender and many other projects uses the field directly. Although I assume it will be not too hard to port in many of these cases. In any case, this seems that is going to break a lot of projects, even if the fix is easy. A 5-minute search on GitHub shows a bunch of other projects that seem to use it as well and will be affected (including the ones I mentioned):