PEP 689 -- Unstable C API tier

Hello,
I’ve updated PEP 689.

Compared to the previous version, it adds a PyUnstable_ prefix to all API in the new unstable tier. I still think we could do without it (marking entire files with Py_USING_UNSTABLE_API), but it’s better than the underscore, which I believe should be reserved for private API.
Since functions will now need to be renamed when they’re moved from one stability tier to another, there’s more focus on not breaking pre-existing names until necessary (i.e. until an incompatible change is made).

What are your thoughts?

There’s one open question: with the PyUnstable_ prefix, the opt-in macro (Py_USING_UNSTABLE_API) might not be necessary.

Full text of the PEP
Abstract
========

Some functions and types of the C-API are designated *unstable*,
meaning that they will not change in patch (bugfix/security) releases,
but may change between minor releases (e.g. between 3.11 and 3.12) without
deprecation warnings.

Any C API with a leading underscore is designated *internal*, meaning that it
may change or disappear without any notice.


Motivation & Rationale
======================

Unstable C API tier
-------------------

The Python C-API is currently divided into `three stability tiers <https://devguide.python.org/developer-workflow/c-api/index.html>`__:

- Limited API, with high compatibility expectations
- Public API, which follows the :pep:`backwards compatibility policy
  <387>`, and requires deprecation warnings before changes
- Internal (private) API, which can change at any time.

Tools requiring access to CPython internals (e.g. advanced
debuggers and JIT compilers) are often built for minor series releases
of CPython, and assume that the C-API internals used do not change
in patch releases. To support these tools, we need a tier between the
Public and Private C-API, with guarantees on stability throughout
the minor-series release: the proposed *Unstable tier*.

Some functions, like ``PyCode_New()``, are documented as unstable
(“Calling [it] directly can bind you to a precise Python version”),
and also often change in practice.
The unstable tier should make their status obvious even to people who don't
read the docs carefully enough, making them hard to use accidentally.


Reserving leading underscores for Private API
---------------------------------------------

Currently, CPython developers don't agree on the exact meaning of a leading
underscore in API names.
It is used to mean two different things:

- API that may change between minor releases, as in the Unstable tier proposed
  here (e.g. functions introduced in :pep:`523`).
- API that is *private* and should not be used outside of CPython at all
  (e.g. because it may change without notice, or it relies on undocumented
  assumptions that non-CPython code cannot guarantee).

The unclear meaning makes the underscore less useful than it could be.
If it only marked *private* API, CPython developers could change underscored
functions, or remove unused ones, without researching how they're
documented or used outside CPython.

With the introduction of an unstable tier, we can clarify the meaning
of the leading underscore, eventually making it OK to skip that research.


Not breaking code unnecessarily
-------------------------------

This PEP specifies that API should be renamed so that the
public/unstable/internal stability tier is expressed in function names.
Whenever this happens, the old name should continue to be available until
an incompatible change is made (i.e. until call sites need to be updated
anyway).
In other words, just changing tiers shouldn't break users' code.


Specification
=============

The C API is divided by stability expectations into `three “sections” <https://devguide.python.org/developer-workflow/c-api/index.html>`__
(internal, public, and limited).
We'll now call these *stability tiers*, or *tiers* for short.

An *Unstable tier* will be added.

APIs (functions, types, etc.) in this tier will be named with the ``PyUnstable_``
prefix, with no leading underscore.

Internally, they will be declared in headers in a new directory,
``Include/unstable/``.
Users should include ``Python.h`` rather than using these headers directly.

These APIs will only be declared when the
``Py_USING_UNSTABLE_API`` macro is defined.
CPython will define the macro for building CPython itself
(``Py_BUILD_CORE``).

Several rules for dealing with the unstable tier will be introduced:

-  Unstable API should have no backwards-incompatible
   changes across patch releases, but may change or be removed in minor
   releases (3.x.0, including Alpha and Beta releases of 3.x.0).
   Such changes must be documented and mentioned in the What's New document.

-  Backwards-incompatible changes to these APIs should be made so that
   code that uses them will need to be updated to compile with
   the new version (e.g. arguments should be added/removed, or a function should
   be renamed, but the semantic meaning of an argument should not change).

-  To move an API from the public tier to the unstable tier, it should be
   exposed under the new ``PyUnstable_*`` name and the definition should be
   guarded with ``Py_USING_UNSTABLE_API``.

   The old name should be deprecated (e.g. with ``Py_DEPRECATED``), but
   continue to be available until an incompatible change is made to the API.
   Per Python's backwards compatibility policy (:pep:`387`), this deprecation
   needs to last *at least* two releases (without an SC exception).
   But it can also last indefinitely -- for example, if :pep:`590`'s
   :pep:`“provisional” <590#finalizing-the-api>`
   ``_PyObject_Vectorcall`` was added today, it would be initially named
   ``PyUnstable_Object_Vectorcall`` and there would be no plan to ever remove
   this name.

   In the following cases, an incompatible change (and thus removing the
   deprecated name) is allowed without an SC exception, as if the function was
   already part of the Unstable tier:

   -  Any API introduced before Python 3.12 that is *documented* to be less
      stable than default.
   -  Any API introduced before Python 3.12 that was named with a leading
      underscore.

   For examples, see the :ref:`initial unstable API <pep689-initial-list>`
   specified in this PEP.

-  To move an *internal* API to the unstable tier, it should be
   exposed under the new ``PyUnstable_*`` name and the definition should be
   guarded with ``Py_USING_UNSTABLE_API``.

   If the old name is documented, or widely used externally,
   it should continue to be available until an
   incompatible change is made (and call sites need to be updated).
   It should start raising deprecation warnings.

-  To move an API from the unstable tier to the public tier, it should be
   exposed without the ``PyUnstable_*`` prefix.

   The old name should remain available, possibly without requiring
   ``Py_USING_UNSTABLE_API``, until the first incompatible change is made
   or the API is removed.

-  Adding new unstable API *for existing features* is allowed even after
   the feature freeze, up until the first Release Candidate.
   Consensus on Core Development Discourse or ``capi-sig`` is needed in the
   Beta period.

These rules will be documented in the `devguide <https://devguide.python.org/developer-workflow/c-api/index.html>`__,
and `user documentation <https://docs.python.org/3/c-api/stable.html>`__
will be updated accordingly.
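
The tier rules can be summarised as a small decision function. This is a simplified model for illustration only: the names ``Tier``, ``Release`` and ``incompatible_change_allowed`` are hypothetical, SC exceptions are ignored, and treating the Limited API as never changing incompatibly is an assumption rather than something this PEP states.

```python
from enum import Enum


class Tier(Enum):
    LIMITED = "limited"
    PUBLIC = "public"
    UNSTABLE = "unstable"
    INTERNAL = "internal"


class Release(Enum):
    PATCH = "patch"  # bugfix/security release, e.g. 3.12.1
    MINOR = "minor"  # 3.x.0, including its alpha and beta releases


def incompatible_change_allowed(tier, release, deprecated_first=False):
    """Simplified reading of the rules: may an API in *tier* break in *release*?"""
    if tier is Tier.INTERNAL:
        # Allowed at any time, although changes in patch releases and
        # release candidates should only be made if absolutely necessary.
        return True
    if tier is Tier.UNSTABLE:
        # No deprecation period needed, but never in a patch release.
        return release is Release.MINOR
    if tier is Tier.PUBLIC:
        # PEP 387: only after a deprecation period.
        return release is Release.MINOR and deprecated_first
    # Limited API: modelled here as never changing incompatibly (assumption).
    return False


print(incompatible_change_allowed(Tier.UNSTABLE, Release.MINOR))
print(incompatible_change_allowed(Tier.UNSTABLE, Release.PATCH))
```

The model deliberately leaves out the renaming rules: moving an API between tiers is not itself an incompatible change, because the old name stays available.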

Reference docs for C API named ``PyUnstable_*`` will automatically show
notes with links to the unstable tier documentation.


Leading underscore
------------------

C API named with a leading underscore, as well as API only available with
``Py_BUILD_CORE``, will be considered *internal*.
This means:

-  It may change or be removed *without notice* in minor
   releases (3.x.0, including Alpha and Beta releases of 3.x.0).
   API changes in patch releases or Release Candidates should only be done if
   absolutely necessary.

-  It should be documented in source comments or Devguide only, not in the
   public documentation.

-  API introduced before Python 3.12 that is documented or widely used
   externally should be moved to the Unstable tier as explained above.

   This might happen long after this PEP is accepted.
   Consequently, for a few years core devs should do some research before
   changing underscored API, especially if it doesn't need ``Py_BUILD_CORE``.

Users of the C API are encouraged to search their codebase for ``_Py`` and
``_PY`` identifier prefixes, and treat any hits as issues to be eventually
fixed -- either by switching to an existing alternative, or by opening
a CPython issue to request exposing public API for their use case,
and eventually switching to that.
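
A minimal sketch of such a search, assuming the C sources are available as text. The helper name ``find_private_names`` and the sample snippet (including ``_PY_EXAMPLE_FLAG``) are hypothetical. Because the match is purely textual, names mentioned in comments and strings are reported too.

```python
import re

# Identifiers that start with _Py or _PY. The lookbehind keeps names such as
# ``my_PyThing`` from being reported.
PRIVATE_NAME = re.compile(r"(?<![A-Za-z0-9_])_P[yY][A-Za-z0-9_]*")


def find_private_names(source: str) -> list[tuple[int, str]]:
    """Return (line number, identifier) pairs for every hit in *source*."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in PRIVATE_NAME.finditer(line):
            hits.append((lineno, match.group()))
    return hits


SAMPLE = """\
PyObject *res = _PyObject_Vectorcall(func, args, nargs, NULL);
Py_ssize_t idx = _PyEval_RequestCodeExtraIndex(free_extra);
PyObject *obj = _PyObject_CAST(op);  /* _PY_EXAMPLE_FLAG is hypothetical */
PyObject *ok = PyObject_Repr(obj);
"""

for lineno, name in find_private_names(SAMPLE):
    print(f"line {lineno}: {name}")
```

Each hit is a candidate for review rather than a confirmed problem: some underscored names are implementation details of public macros.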


.. _pep689-initial-list:

Initial unstable API
--------------------

The following API will be moved to the Unstable tier in the initial
implementation as a proof of concept.

Code object constructors:

- ``PyUnstable_Code_New()`` (renamed from ``PyCode_New``)
- ``PyUnstable_Code_NewWithPosOnlyArgs()`` (renamed from ``PyCode_NewWithPosOnlyArgs``)

Frame evaluation API (:pep:`523`):

- ``PyUnstable_FrameEvalFunction`` (renamed from ``_PyFrameEvalFunction``)
- ``PyUnstable_InterpreterState_GetEvalFrameFunc()`` (renamed from ``_PyInterpreterState_GetEvalFrameFunc``)
- ``PyUnstable_InterpreterState_SetEvalFrameFunc()`` (renamed from ``_PyInterpreterState_SetEvalFrameFunc``)
- ``PyUnstable_Eval_RequestCodeExtraIndex()`` (renamed from ``_PyEval_RequestCodeExtraIndex``)
- ``PyUnstable_Code_GetExtra()`` (renamed from ``_PyCode_GetExtra``)
- ``PyUnstable_Code_SetExtra()`` (renamed from ``_PyCode_SetExtra``)
- ``PyUnstable_InterpreterFrame`` (typedef for ``_PyInterpreterFrame``, as an opaque struct)
- ``PyUnstable_Frame_GetFrameObject`` (renamed from ``_PyFrame_GetFrameObject``)
- ``PyUnstable_EvalFrameDefault``
  (new function that calls ``_PyEval_EvalFrameDefault``, but takes
  ``PyFrameObject`` rather than ``_PyInterpreterFrame``)
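
For projects that choose to switch to the new names, the pure renames in this list could be applied mechanically. The sketch below is illustrative only: ``RENAMES`` is copied from the "renamed from" entries above, while ``rename_unstable`` is a hypothetical helper. It leaves out ``PyUnstable_InterpreterFrame`` and ``PyUnstable_EvalFrameDefault``, which are not simple renames. Since the old names remain available, such a rewrite is optional.

```python
import re

# Old name -> new name, taken from the "renamed from" entries in the list.
RENAMES = {
    "PyCode_New": "PyUnstable_Code_New",
    "PyCode_NewWithPosOnlyArgs": "PyUnstable_Code_NewWithPosOnlyArgs",
    "_PyFrameEvalFunction": "PyUnstable_FrameEvalFunction",
    "_PyInterpreterState_GetEvalFrameFunc": "PyUnstable_InterpreterState_GetEvalFrameFunc",
    "_PyInterpreterState_SetEvalFrameFunc": "PyUnstable_InterpreterState_SetEvalFrameFunc",
    "_PyEval_RequestCodeExtraIndex": "PyUnstable_Eval_RequestCodeExtraIndex",
    "_PyCode_GetExtra": "PyUnstable_Code_GetExtra",
    "_PyCode_SetExtra": "PyUnstable_Code_SetExtra",
    "_PyFrame_GetFrameObject": "PyUnstable_Frame_GetFrameObject",
}

# Match whole identifiers only, so PyCode_NewEmpty is left untouched.
_names = sorted(RENAMES, key=len, reverse=True)
_OLD_NAME = re.compile(
    r"(?<![A-Za-z0-9_])("
    + "|".join(re.escape(name) for name in _names)
    + r")(?![A-Za-z0-9_])"
)


def rename_unstable(source: str) -> str:
    """Replace each old name in *source* with its PyUnstable_ equivalent."""
    return _OLD_NAME.sub(lambda m: RENAMES[m.group(1)], source)


print(rename_unstable("_PyCode_GetExtra(co, idx, &extra);"))
```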


Backwards Compatibility
=======================

The C API backwards compatibility expectations will be made clearer.

All renamed API will be available under old names for as long as feasible.


How to Teach This
=================

The changes affect advanced C programmers, who should consult the
updated reference documentation, devguide and/or What's New document.


Reference Implementation
========================

https://github.com/python/cpython/issues/91744


Open Issues
===========

With the ``PyUnstable_`` prefix, is the opt-in macro necessary?


Rejected Ideas
==============

No special prefix
-----------------

In the initial version of this PEP, unstable API didn't have the ``PyUnstable``
prefix.
Instead, defining ``Py_USING_UNSTABLE_API`` made the API available in a given
source file, signifying acknowledgement that the file as a whole will
potentially need to be revisited for each Python release.

However, it was decided that unstable-ness needs to be exposed
in the individual names.

Underscore prefix
-----------------

It would be possible to mark both private and unstable API with
leading underscores.
However, that would dilute the meaning of the ``_Py`` prefix.
Reserving the prefix for internal API only makes it trivial to search for.


Python API
----------

It might be good to add a similar tier in the Python (not C) API,
e.g. for ``types.CodeType``.
However, the mechanism for that would need to be different.
This is outside the scope of the PEP.


Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.

Based on this summary, I like the PyUnstable_ prefix. It’s clear and its length discourages casual use, which is just what we want. I have no opinion on Py_USING_UNSTABLE_API – feature macros in general tend to confound me, so if I don’t have to use this I’ll be happy.

I agree it does not seem to make sense to mandate the opt-in macro.

Regardless, the PEP looks generally good to me.

The backward compatibility section says:

All renamed API will be available under old names for as long as feasible.

What about removing the old names (e.g. _PyCode_GetExtra()) and providing a compatibility layer in the pythoncapi-compat project (e.g. providing PyUnstable_Code_GetExtra() on Python 3.11 and older)? It would clarify that private functions (those with the _Py prefix) must not be used.

If people are not forced to update their code to the new name, they will likely keep the old name until the alias is removed, which may never happen. In that case, the benefits of PEP 689 are less obvious to me. People will still have to look into the doc of each function to check if it’s really a “private” function or an “unstable” function.

I expect that only a small number of projects use the functions covered by PEP 689. Among the top 5000 PyPI projects, 14 mention these functions:

$ ./search_pypi_top.py PYPI-2020-09-01/ '\b(PyCode_New|PyCode_NewWithPosOnlyArgs|_PyFrameEvalFunction|_PyInterpreterState_GetEvalFrameFunc|_PyInterpreterState_SetEvalFrameFunc|_PyEval_RequestCodeExtraIndex|_PyCode_GetExtra|_PyCode_SetExtra|_PyInterpreterFrame|_PyFrame_GetFrameObject|_PyEval_EvalFrameDefault)\b' -o pep689 -q
(...)
Found 108 matching lines in 14 projects

In fact, only 7 projects are directly affected:

IMO it’s manageable to update 7 projects in 1 year (until Python 3.12 is released).

For Cython, only some specific functions are affected by PEP 689 (e.g. profiling), so not all projects using Cython are affected. Affected projects will likely have to regenerate their Cython output for Python 3.12 anyway, for other reasons.

5 projects mention the functions in Valgrind suppression files:

  • onnx-1.12.0
  • onnxoptimizer-0.3.1
  • onnx-simplifier-0.4.8
  • osmium-3.4.1
  • tweedledum-1.1.1

I ignored pydevd-2.8.0 and uvloop-0.16.0 (2 projects): function names only mentioned in comments.

The unstable tier doesn’t guarantee that functions will be changed with every minor release. It merely says that they may change.
If/until they change, there is no need to update the calling code. That would just be churn for no benefit at all.

No, they don’t. The old names will emit deprecation warnings.

The _PyInterpreterFrame structure is a bit special: it’s only used as struct _PyInterpreterFrame * in header files. In this case, the structure (members) doesn’t have to be defined (C compilers are fine with it, if you don’t have structure members). Do you have to rename it to PyUnstable_InterpreterFrame? I would prefer to keep _PyInterpreterFrame name in internals. Technically, we can keep the internal struct _PyInterpreterFrame* type in the declaration of the PyUnstable_FrameEvalFunction() type, no?

Yes, of course the internal name can stay as it is.

So the removal of old names is not part of the PEP 689 plan and will be handled separately (later)? PEP 689 currently says:

Per Python’s backwards compatibility policy (PEP 387), this deprecation needs to last at least two releases (without an SC exception). But it can also last indefinitely (…)

Right before that, it says:

The old name should be deprecated (e.g. with Py_DEPRECATED), but continue to be available until an incompatible change is made to the API.

So, if the API never changes, the old name will stay forever.
And that’s OK. An alias for a function that never changes is not a significant maintenance burden.

A new version of the PEP is up.

Changes (PR 2901):

  • The prefix is very visible, so:
    • No opt-in macro
    • No new Include/ subdirectory
  • Mention that unstable API should be documented and tested.
  • Removed frame evaluation API from the initial implementation. (I haven’t found a way to use it, so I can’t quite test it. IMO it needs more discussion/work to move to Unstable.)
  • Added a link to draft implementation (CPython part, I plan to write devguide PR only if the PEP is accepted).

If anything looks off, let me know! I plan to submit this version to the SC.

I’ve submitted it to the SC.

Oh, nice, I like the updated PEP 689. No opt-in macro and an obvious PyUnstable_ prefix sounds like a good trade-off: +1 for PEP 689 :slight_smile:

It will be good to have a single definition of “unstable API” in the documentation that clearly explains which backward compatibility guarantees are provided and which are not. Currently, it’s hard to distinguish the unstable API from the private “please don’t use it” API.

I also agree that moving away from _Py is a good thing, so that _Py is reserved for the internal C API. Right now, there are still a few _Py names in the Public and CPython API: sometimes they are implementation details (like _PyObject_CAST() used by macros), sometimes they are legacy names kept around for backward compatibility (like _PyObject_Vectorcall(), an alias to PyObject_Vectorcall()).

For me, it’s good that _Py remains a warning: “don’t use it or it will bite you soon!” :slight_smile:

The SC (for the record, the new one without Petr :slight_smile: ) accepts the current version of PEP 689. Thanks, Petr and everyone else for your input and your hard work.

I’m not actually sure where the best place to comment is (across the various GitHub issues and this thread…), but I’d just like to add that the AST C-API was very useful for us. We use it to execute simple Python expressions without going through the Python interpreter. This gives us a large speedup (50% on Python 3.9, not sure about 3.11 yet) for some simple cases. We then use the Python interpreter for anything that we don’t handle.

For example an end user references a cell in a table in our UI via

op('table')[0, 0]

We can bypass the interpreter and stay entirely within our own code for that simple case.
Thanks
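
As an illustration of the idea only (not the poster's actual code, which uses the C-level AST API), the same fast-path check can be sketched with Python's own `ast` module. The function name `match_table_cell` is hypothetical, and the sketch assumes Python 3.9 or newer, where a subscript's `slice` is the index expression itself.

```python
import ast


def match_table_cell(expr: str):
    """Return (table_name, row, col) if *expr* is op('name')[row, col], else None."""
    try:
        tree = ast.parse(expr, mode="eval")
    except SyntaxError:
        return None
    node = tree.body
    if not isinstance(node, ast.Subscript):
        return None
    call, index = node.value, node.slice
    # The subscripted value must be a call to op() with one string literal.
    if not (isinstance(call, ast.Call)
            and isinstance(call.func, ast.Name) and call.func.id == "op"
            and len(call.args) == 1 and not call.keywords
            and isinstance(call.args[0], ast.Constant)
            and isinstance(call.args[0].value, str)):
        return None
    # The index must be a pair of integer literals.
    if not (isinstance(index, ast.Tuple) and len(index.elts) == 2):
        return None
    cell = []
    for elt in index.elts:
        if not (isinstance(elt, ast.Constant) and type(elt.value) is int):
            return None
        cell.append(elt.value)
    return (call.args[0].value, cell[0], cell[1])


print(match_table_cell("op('table')[0, 0]"))
```

Anything that returns `None` here (variables, arithmetic, negative indices, syntax errors) would fall back to the regular interpreter.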

Issue 101101 has a big list, and it mentions something about AST.
I’m not familiar with AST API, so I won’t be adding it myself. Do you want to help expose it? I guess a list of the needed functions would be a good first step, assuming they already exist internally in CPython.

I’m only familiar with a small portion of it, but for my use cases I need _PyParser_ASTFromString() to be exposed, as well as the contents of pycore_pyarena.h and pycore_ast.h. _PyParser_ASTFromFile() should be exposed as well.

Most of it seems simple, just exposing the functions to the public API. The only thing that I’d need guidance on is how to handle the struct/enum defines in pycore_ast.h. Should those be renamed before they are made public?

Most of it seems simple, just exposing the functions to the public API.

I’m afraid they’d need docs and tests as well, to hash out what they should do and what’s actually supported, and help avoid silent breakage.
(edit: Without that, there’s not much advantage over private API that can break at any time.)

Should those be renamed before they are made public?

Unstable API requires new names for everything, so, yes :)