PEP 684: A Per-Interpreter GIL

Hi all! I present here an updated PEP 684 for a second round of discussion.

In addition to any other feedback you might have to offer, here are (all but one of) the open issues from the PEP:

  1. What to do about the allocators?
  2. How would a per-interpreter tracemalloc module relate to global allocators?
  3. Would the faulthandler module be limited to the main interpreter
    (like the signal module) or would we leak that global state between
    interpreters (protected by a granular lock)?
  4. Does supporting multiple interpreters automatically mean an extension
    supports a per-interpreter GIL?
  5. What would be a better (scarier-sounding) name
    for allow_all_extensions?

Questions (1) and (4) are probably the most significant. However, I welcome all feedback and appreciate the time you might take to discuss this proposal with me!

Regarding (1), please refer to the "Memory Allocators" section of the PEP. I pick up (4) in a reply further down in this thread.

PEP 684 text
PEP: 684
Title: A Per-Interpreter GIL
Author: Eric Snow <>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Requires: 683
Created: 08-Mar-2022
Python-Version: 3.12
Post-History: `08-Mar-2022 <>`__


Since Python 1.5 (1997), CPython users can run multiple interpreters
in the same process.  However, interpreters in the same process
have always shared a significant
amount of global state.  This is a source of bugs, with a growing
impact as more and more people use the feature.  Furthermore,
sufficient isolation would facilitate true multi-core parallelism,
where interpreters no longer share the GIL.  The changes outlined in
this proposal will result in that level of interpreter isolation.

High-Level Summary

At a high level, this proposal changes CPython in the following ways:

* stops sharing the GIL between interpreters, given sufficient isolation
* adds several new interpreter config options for isolation settings
* adds some public C-API for fine-grained control when creating interpreters
* keeps incompatible extensions from causing problems


The GIL protects concurrent access to most of CPython's runtime state.
So all that GIL-protected global state must move to each interpreter
before the GIL can.

(In a handful of cases, other mechanisms can be used to ensure
thread-safe sharing instead, such as locks or "immortal" objects.)

CPython Runtime State

Properly isolating interpreters requires that most of CPython's
runtime state be stored in the ``PyInterpreterState`` struct.  Currently,
only a portion of it is; the rest is found either in global variables
or in ``_PyRuntimeState``.  Most of that will have to be moved.

This directly coincides with an ongoing effort (of many years) to greatly
reduce internal use of C global variables and consolidate the runtime
state into ``_PyRuntimeState`` and ``PyInterpreterState``.
(See `Consolidating Runtime Global State`_ below.)  That project has
`significant merit on its own <Benefits to Consolidation_>`_
and has faced little controversy.  So, while a per-interpreter GIL
relies on the completion of that effort, that project should not be
considered a part of this proposal--only a dependency.

Other Isolation Considerations

CPython's interpreters must be strictly isolated from each other, with
few exceptions.  To a large extent they already are.  Each interpreter
has its own copy of all modules, classes, functions, and variables.
The CPython C-API docs `explain further <caveats_>`_.

.. _caveats:

However, aside from what has already been mentioned (e.g. the GIL),
there are a couple of ways in which interpreters still share some state.

First of all, some process-global resources (e.g. memory,
file descriptors, environment variables) are shared.  There are no
plans to change this.

Second, some isolation is faulty due to bugs or implementations that
did not take multiple interpreters into account.  This includes
CPython's runtime and the stdlib, as well as extension modules that
rely on global variables.  Bugs should be opened in these cases,
as some already have been.

Depending on Immortal Objects

:pep:`683` introduces immortal objects as a CPython-internal feature.
With immortal objects, we can share any otherwise immutable global
objects between all interpreters.  Consequently, this PEP does not
need to address how to deal with the various objects
`exposed in the public C-API <capi objects_>`_.
It also simplifies the question of what to do about the builtin
static types.  (See `Global Objects`_ below.)

Both issues have alternate solutions, but everything is simpler with
immortal objects.  If PEP 683 is not accepted then this one will be
updated with the alternatives.  This lets us reduce noise in this
proposal.


The fundamental problem we're solving here is a lack of true multi-core
parallelism (for Python code) in the CPython runtime.  The GIL is the
cause.  While it usually isn't a problem in practice, at the very least
it makes Python's multi-core story murky, which makes the GIL
a consistent distraction.

Isolated interpreters are also an effective mechanism to support
certain concurrency models.  :pep:`554` discusses this in more detail.

Indirect Benefits

Most of the effort needed for a per-interpreter GIL has benefits that
make those tasks worth doing anyway:

* makes multiple-interpreter behavior more reliable
* has led to fixes for long-standing runtime bugs that otherwise
  hadn't been prioritized
* has been exposing (and inspiring fixes for) previously unknown runtime bugs
* has driven cleaner runtime initialization (:pep:`432`, :pep:`587`)
* has driven cleaner and more complete runtime finalization
* has led to structural layering of the C-API (e.g. ``Include/internal``)
* also see `Benefits to Consolidation`_ below

.. XXX Add links to example GitHub issues?

Furthermore, much of that work benefits other CPython-related projects:

* performance improvements ("`faster-cpython`_")
* pre-fork application deployment (e.g. `Instagram server`_)
* extension module isolation (see :pep:`630`, etc.)
* embedding CPython

.. _faster-cpython:

.. _Instagram server:

Existing Use of Multiple Interpreters

The C-API for multiple interpreters has been used for many years.
However, until relatively recently the feature wasn't widely known,
nor extensively used (with the exception of mod_wsgi).

In the last few years use of multiple interpreters has been increasing.
Here are some of the public projects using the feature currently:

* `mod_wsgi <>`_
* `OpenStack Ceph <>`_
* `JEP <>`_
* `Kodi <>`_

Note that, with :pep:`554`, multiple interpreter usage would likely
grow significantly (via Python code rather than the C-API).

PEP 554 (Multiple Interpreters in the Stdlib)

:pep:`554` is strictly about providing a minimal stdlib module
to give users access to multiple interpreters from Python code.
In fact, it specifically avoids proposing any changes related to
the GIL.  Consider, however, that users of that module would benefit
from a per-interpreter GIL, which makes PEP 554 more appealing.


During initial investigations in 2014, a variety of possible solutions
for multi-core Python were explored, but each had its drawbacks
without simple solutions:

* the existing practice of releasing the GIL in extension modules

  * doesn't help with Python code

* other Python implementations (e.g. Jython, IronPython)

  * CPython dominates the community

* remove the GIL (e.g. gilectomy, "no-gil")

  * too much technical risk (at the time)

* Trent Nelson's "PyParallel" project

  * incomplete; Windows-only at the time

* ``multiprocessing``

  * too much work to make it effective enough;
    high penalties in some situations (at large scale, Windows)

* other parallelism tools (e.g. dask, ray, MPI)

  * not a fit for the stdlib

* give up on multi-core (e.g. async, do nothing)

  * this can only end in tears

Even in 2014, it was fairly clear that a solution using isolated
interpreters did not have a high level of technical risk and that
most of the work was worth doing anyway.
(The downside was the volume of work to be done.)


As `summarized above <High-Level Summary_>`__, this proposal involves the
following changes, in the order they must happen:

1. `consolidate global runtime state <Consolidating Runtime Global State_>`_
   (including objects) into ``_PyRuntimeState``
2. move nearly all of the state down into ``PyInterpreterState``
3. finally, move the GIL down into ``PyInterpreterState``
4. everything else

   * add to the public C-API
   * implement restrictions in ``ExtensionFileLoader``
   * work with popular extension maintainers to help
     with multi-interpreter support

Per-Interpreter State

The following runtime state will be moved to ``PyInterpreterState``:

* all global objects that are not safely shareable (i.e. not fully immutable)
* the GIL
* most mutable data that's currently protected by the GIL
* mutable data that's currently protected by some other per-interpreter lock
* mutable data that may be used independently in different interpreters
  (also applies to extension modules, including those with multi-phase init)
* all other mutable data not otherwise excluded below

Furthermore, a portion of the full global state has already been
moved to the interpreter, including GC, warnings, and atexit hooks.

The following runtime state will not be moved:

* global objects that are safely shareable, if any
* immutable data, often ``const``
* effectively immutable data (treated as immutable), for example:

  * some state is initialized early and never modified again
  * hashes for strings (``PyUnicodeObject``) are idempotently calculated
    when first needed and then cached

* all data that is guaranteed to be modified exclusively in the main thread,
  including:

  * state used only in CPython's ``main()``
  * the REPL's state
  * data modified only during runtime init (effectively immutable afterward)

* mutable data that's protected by some global lock (other than the GIL)
* global state in atomic variables
* mutable global state that can be changed (sensibly) to atomic variables
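To illustrate the "effectively immutable" case of an idempotently cached hash, here is a minimal standalone sketch.  It is not CPython code: ``sketch_str``, ``sketch_compute_hash()`` and ``sketch_str_hash()`` are hypothetical names, and FNV-1a stands in for the real string hash.  The point is only that racing threads would all store the same value, so the cache can stay shared.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a string object with a lazily cached hash. */
typedef struct {
    const char *data;
    size_t len;
    _Atomic int64_t hash;   /* -1 means "not computed yet" */
} sketch_str;

/* FNV-1a, chosen only because it is short; not CPython's real hash. */
static int64_t
sketch_compute_hash(const char *data, size_t len)
{
    uint64_t h = 14695981039346656037ULL;
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)data[i];
        h *= 1099511628211ULL;
    }
    /* Drop the top bit so the result is never negative (never -1). */
    return (int64_t)(h >> 1);
}

int64_t
sketch_str_hash(sketch_str *s)
{
    int64_t cached = atomic_load_explicit(&s->hash, memory_order_relaxed);
    if (cached != -1) {
        return cached;
    }
    /* Two threads may both reach this point, but they compute the same
       value, so whichever store lands last changes nothing. */
    int64_t h = sketch_compute_hash(s->data, s->len);
    atomic_store_explicit(&s->hash, h, memory_order_relaxed);
    return h;
}
```

The same reasoning would not hold for a cache whose value depends on which caller fills it; that kind of state belongs on the interpreter.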

Memory Allocators

This is the highest risk part of the work to isolate interpreters
and may require more than just moving fields down
from ``_PyRuntimeState``.

CPython provides a memory management C-API, with `three allocator domains`_:
"raw", "mem", and "object".  Each provides the equivalent of ``malloc()``,
``calloc()``, ``realloc()``, and ``free()``.  A custom allocator for each
domain can be set during runtime initialization and the current allocator
can be wrapped with a hook using the same API (for example, the stdlib
tracemalloc module).  The allocators are currently runtime-global,
shared by all interpreters.

.. _three allocator domains:

The "raw" allocator is expected to be thread-safe and defaults to the C
library's allocator (``malloc()``, etc.).  However, the "mem" and "object" allocators
are not expected to be thread-safe and currently may rely on the GIL for
thread-safety.  This is partly because the default allocator for both,
AKA "pyobject", `is not thread-safe`_.  This is due to how all state for
that allocator is stored in C global variables.
(See ``Objects/obmalloc.c``.)

.. _is not thread-safe:

Thus we come back to the question of isolating runtime state.  In order
for interpreters to stop sharing the GIL, allocator thread-safety
must be addressed.  If interpreters continue sharing the allocators
then we need some other way to get thread-safety.  Otherwise interpreters
must stop sharing the allocators.  In both cases there are a number of
possible solutions, each with potential downsides.

To keep sharing the allocators, the simplest solution is to use
a granular runtime-global lock around the calls to the "mem" and "object"
allocators in ``PyMem_Malloc()``, ``PyObject_Malloc()``, etc.  This would
impact performance, but there are some ways to mitigate that (e.g. only
start locking once the first subinterpreter is created).
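As a rough illustration of that mitigation, the sketch below wraps a stand-in allocator with a lock that is only taken once locking has been switched on.  Everything here is hypothetical (the ``sketch_*`` names are not CPython API), and a simple ``atomic_flag`` spinlock stands in for whatever granular lock the runtime would actually use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical names throughout; none of this is CPython API. */
static atomic_flag sketch_alloc_lock = ATOMIC_FLAG_INIT;
static atomic_bool sketch_locking_enabled = false;
static size_t sketch_locked_calls = 0;   /* protected by sketch_alloc_lock */

/* Stand-in for a thread-unsafe allocator such as "pyobject". */
static void *
sketch_unsafe_malloc(size_t size)
{
    return malloc(size);
}

/* Assumed to be called, while there is still only one GIL, just before
   the first interpreter with its own GIL is created. */
void
sketch_enable_alloc_locking(void)
{
    atomic_store(&sketch_locking_enabled, true);
}

void *
sketch_mem_malloc(size_t size)
{
    if (!atomic_load(&sketch_locking_enabled)) {
        /* Only one GIL exists, so it already serializes callers. */
        return sketch_unsafe_malloc(size);
    }
    while (atomic_flag_test_and_set_explicit(&sketch_alloc_lock,
                                             memory_order_acquire)) {
        /* spin until the lock is free */
    }
    sketch_locked_calls++;
    void *ptr = sketch_unsafe_malloc(size);
    atomic_flag_clear_explicit(&sketch_alloc_lock, memory_order_release);
    return ptr;
}
```

With this shape, a process that never creates a second interpreter pays only for one atomic load per allocation.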

Another way to keep sharing the allocators is to require that the "mem"
and "object" allocators be thread-safe.  This would mean we'd have to
make the pyobject allocator implementation thread-safe.  That could
even involve re-implementing it using an extensible allocator like
mimalloc.  The potential downside is in the cost to re-implement
the allocator and the risk of defects inherent to such an endeavor.

Regardless, a switch to requiring thread-safe allocators would impact
anyone that embeds CPython and currently sets a thread-unsafe allocator.
We'd need to consider who might be affected and how we reduce any
negative impact (e.g. add a basic C-API to help make an allocator
thread-safe).

If we did stop sharing the allocators between interpreters, we'd have
to do so only for the "mem" and "object" allocators.  We might also need
to keep a full set of global allocators for certain runtime-level usage.
There would be some performance penalty due to looking up the current
interpreter and then pointer indirection to get the allocators.
Embedders would also likely have to provide a new allocator context
for each interpreter.  On the plus side, allocator hooks (e.g. tracemalloc)
would not be affected.
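The lookup-then-indirection cost mentioned above can be pictured with a small standalone model.  It assumes a thread-local "current interpreter" pointer and a per-interpreter allocator table; ``sketch_interp``, ``sketch_allocator`` and the other ``sketch_*`` names are hypothetical and only loosely modeled on the ``ctx``-plus-function-pointers shape of the existing allocator API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-interpreter allocator table. */
typedef struct {
    void *ctx;
    void *(*alloc_fn)(void *ctx, size_t size);
    void (*free_fn)(void *ctx, void *ptr);
} sketch_allocator;

typedef struct {
    sketch_allocator mem;
    size_t live_blocks;     /* touched only by this interpreter */
} sketch_interp;

/* Each OS thread remembers which interpreter it is running. */
static _Thread_local sketch_interp *sketch_current_interp = NULL;

static void *
sketch_counting_alloc(void *ctx, size_t size)
{
    sketch_interp *interp = ctx;
    void *ptr = malloc(size);
    if (ptr != NULL) {
        interp->live_blocks++;
    }
    return ptr;
}

static void
sketch_counting_free(void *ctx, void *ptr)
{
    sketch_interp *interp = ctx;
    if (ptr != NULL) {
        interp->live_blocks--;
        free(ptr);
    }
}

void
sketch_interp_init(sketch_interp *interp)
{
    interp->mem.ctx = interp;   /* the "new allocator context" per interpreter */
    interp->mem.alloc_fn = sketch_counting_alloc;
    interp->mem.free_fn = sketch_counting_free;
    interp->live_blocks = 0;
}

void
sketch_set_current(sketch_interp *interp)
{
    sketch_current_interp = interp;
}

void *
sketch_mem_malloc(size_t size)
{
    sketch_interp *interp = sketch_current_interp;   /* lookup */
    assert(interp != NULL);
    return interp->mem.alloc_fn(interp->mem.ctx, size);   /* indirection */
}

void
sketch_mem_free(void *ptr)
{
    sketch_interp *interp = sketch_current_interp;
    assert(interp != NULL);
    interp->mem.free_fn(interp->mem.ctx, ptr);
}
```

Because each interpreter's bookkeeping lives in its own context, no lock is needed here, at the price of the extra lookup on every call.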

This is an open issue for which this proposal has not settled
on a solution.

.. _proposed capi:


Internally, the interpreter state will now track how the import system
should handle extension modules which do not support use with multiple
interpreters.  See `Restricting Extension Modules`_ below.  We'll refer
to that setting here as "PyInterpreterState.strict_extensions".

The following public API will be added:

* ``PyInterpreterConfig`` (struct)
* ``PyInterpreterConfig_LEGACY_INIT`` (macro)
* ``PyInterpreterConfig_INIT`` (macro)
* ``PyThreadState * Py_NewInterpreterEx(PyInterpreterConfig *)``
* ``bool PyInterpreterState_GetStrictExtensions(PyInterpreterState *)``
* ``void PyInterpreterState_SetStrictExtensions(PyInterpreterState *, bool)``

A note about the "main" interpreter:

Below, we mention the "main" interpreter several times.  This refers
to the interpreter created during runtime initialization, for which
the initial ``PyThreadState`` corresponds to the process's main thread.
It has a number of unique responsibilities (e.g. handling signals),
as well as a special role during runtime initialization/finalization.
It is also usually (for now) the only interpreter.
(Also see


``PyInterpreterConfig`` is a struct with 4 boolean-valued fields, effectively::

    typedef struct {
        /* Allow forking the process. */
        unsigned int allow_fork_without_exec;
        /* Allow daemon threads. */
        unsigned int allow_daemon_threads;
        /* Use a unique "global" interpreter lock.
           Otherwise, use the main interpreter's GIL. */
        unsigned int own_gil;
        /* Only allow extension modules that support
           use in multiple interpreters. */
        unsigned int strict_extensions;
    } PyInterpreterConfig;

The first two fields are essentially derived from the existing
``PyConfig._isolated_interpreter`` field.

``PyInterpreterConfig.strict_extensions`` is basically the initial
value used for "PyInterpreterState.strict_extensions".

We may add other fields, as needed, over time
(e.g. possibly "allow_subprocess", "allow_threading", "own_initial_thread").

Note that a similar ``_PyInterpreterConfig`` may already exist internally,
with similar fields.
(See `issue #91120 <>`__
and `PR #31771 <>`__.)
If it does exist then ``PyInterpreterConfig`` would replace it.


If ``own_gil`` is ``true`` then the new interpreter will have its own "global"
interpreter lock.  This means the new interpreter can run without
getting interrupted by other interpreters.  This effectively unblocks
full use of multiple cores.  That is the fundamental goal of this PEP.

If ``own_gil`` is ``false`` then the new interpreter will use the main interpreter's
lock.  This is the legacy (pre-3.12) behavior in CPython, where all
interpreters share a single GIL.  Sharing the GIL like this may be
desirable when using extension modules that still depend on
the GIL for thread safety.
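One way to picture the two cases is a per-interpreter pointer that refers either to the interpreter's own lock or to the main interpreter's.  This is only a sketch under that assumption; ``sketch_gil``, ``sketch_interp`` and ``sketch_interp_init_gil()`` are hypothetical and say nothing about how CPython actually lays out its state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a real lock; only its address matters in this sketch. */
typedef struct {
    int locked;
} sketch_gil;

typedef struct sketch_interp {
    sketch_gil own_gil_storage;   /* used only when own_gil is true */
    sketch_gil *gil;              /* lock this interpreter's threads take */
} sketch_interp;

/* Pass main_interp == NULL when initializing the main interpreter itself. */
void
sketch_interp_init_gil(sketch_interp *interp,
                       const sketch_interp *main_interp,
                       bool own_gil)
{
    interp->own_gil_storage.locked = 0;
    if (own_gil || main_interp == NULL) {
        /* Isolated: threads here never contend with other interpreters. */
        interp->gil = &interp->own_gil_storage;
    }
    else {
        /* Legacy: share the main interpreter's lock. */
        interp->gil = main_interp->gil;
    }
}
```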

PyInterpreterConfig Initializer Macros

``#define PyInterpreterConfig_LEGACY_INIT {1, 1, 0, 0}``

This initializer matches the behavior of ``Py_NewInterpreter()``.
The main interpreter uses this.

``#define PyInterpreterConfig_INIT {0, 0, 1, 1}``

This initializer would be used to get an isolated interpreter that
also avoids subinterpreter-unfriendly features.  It would be the default
for interpreters created through :pep:`554`.  Fork (without exec) would
be disabled by default due to the general problems of mixing threads
with fork, coupled with the role of the main interpreter in the runtime
lifecycle.  Daemon threads would be disabled due to their poor interaction
with interpreter finalization.
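The positional initializers are easier to read next to the field names.  The sketch below redeclares the proposed struct and macros locally, exactly as given in this proposal, so that it compiles without ``Python.h``; ``sketch_is_isolated()`` is a hypothetical helper added only for illustration.

```c
#include <assert.h>

/* Local copy of the proposed struct, so this compiles on its own. */
typedef struct {
    unsigned int allow_fork_without_exec;
    unsigned int allow_daemon_threads;
    unsigned int own_gil;
    unsigned int strict_extensions;
} PyInterpreterConfig;

/* fork, daemon threads, own_gil, strict_extensions */
#define PyInterpreterConfig_LEGACY_INIT {1, 1, 0, 0}
#define PyInterpreterConfig_INIT {0, 0, 1, 1}

/* Hypothetical helper: nonzero if the config asks for full isolation. */
int
sketch_is_isolated(const PyInterpreterConfig *config)
{
    return config->own_gil
        && config->strict_extensions
        && !config->allow_fork_without_exec
        && !config->allow_daemon_threads;
}
```

An embedder wanting something in between would presumably start from one of the macros and flip individual fields before creating the interpreter.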

New API Functions

``PyThreadState * Py_NewInterpreterEx(PyInterpreterConfig *)``

This is like ``Py_NewInterpreter()`` but initializes the interpreter using the granular
config.  It will replace the "private" func ``_Py_NewInterpreter()``.

``bool PyInterpreterState_GetStrictExtensions(PyInterpreterState *)``
``void PyInterpreterState_SetStrictExtensions(PyInterpreterState *, bool)``

Respectively, these get/set "PyInterpreterState.strict_extensions".

Restricting Extension Modules

Extension modules have many of the same problems as the runtime when
state is stored in global variables.  :pep:`630` covers all the details
of what extensions must do to support isolation, and thus safely run in
multiple interpreters at once.  This includes dealing with their globals.

If an extension implements multi-phase init (see :pep:`489`) it is
considered compatible with multiple interpreters.  All other extensions
are considered incompatible.  This position is based on the premise that
if a module supports use with multiple interpreters then it necessarily
will work even if interpreters do not share the GIL.  This position is
still the subject of debate.

If an incompatible extension is imported and the current
"PyInterpreterState.strict_extensions" value is ``true`` then the import
system will raise ``ImportError``.  (For ``false`` it simply doesn't check.)
This will be done through

Such imports will never fail in the main interpreter (or in interpreters
created through ``Py_NewInterpreter()``) since
"PyInterpreterState.strict_extensions" initializes to ``false`` in both
cases.  Thus the legacy (pre-3.12) behavior is preserved.
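The decision described in this section reduces to a small truth table.  The sketch below models it as a pure function; the names are hypothetical and the real check would live in the import system rather than in a helper like this.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    SKETCH_IMPORT_OK = 0,
    SKETCH_IMPORT_ERROR = -1   /* stands in for raising ImportError */
} sketch_import_result;

sketch_import_result
sketch_check_extension(bool strict_extensions, bool uses_multi_phase_init)
{
    if (!strict_extensions) {
        /* Main interpreter and Py_NewInterpreter(): no check at all. */
        return SKETCH_IMPORT_OK;
    }
    /* Under this proposal, multi-phase init is the compatibility marker. */
    return uses_multi_phase_init ? SKETCH_IMPORT_OK : SKETCH_IMPORT_ERROR;
}
```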

We will work with popular extensions to help them support use in
multiple interpreters.  This may involve adding to CPython's public C-API,
which we will address on a case-by-case basis.

Extension Module Compatibility

As noted in `Extension Modules`_, many extensions work fine in multiple
interpreters without needing any changes.  The import system will still
fail if such a module doesn't explicitly indicate support.  At first,
not many extension modules will, so this is a potential source
of frustration.

We will address this by adding a context manager to temporarily disable
the check on multiple interpreter support:
``importlib.util.allow_all_extensions()``.  More or less, it will modify
the current "PyInterpreterState.strict_extensions" value (e.g. through
a private ``sys`` function).


The "Sub-interpreter support" section of ``Doc/c-api/init.rst`` will be
updated with the added API.


Backwards Compatibility

No behavior or APIs are intended to change due to this proposal,
with one exception noted in `the next section <Extension Modules_>`_.
The existing C-API for managing interpreters will preserve its current
behavior, with new behavior exposed through new API.  No other API
or runtime behavior is meant to change, including compatibility with
the stable ABI.

See `Objects Exposed in the C-API`_ below for related discussion.

Extension Modules

Currently the most common usage of Python, by far, is with the main
interpreter running by itself.  This proposal has zero impact on
extension modules in that scenario.  Likewise, for better or worse,
there is no change in behavior under multiple interpreters created
using the existing ``Py_NewInterpreter()``.

Keep in mind that some extensions already break when used in multiple
interpreters, due to keeping module state in global variables.  They
may crash or, worse, experience inconsistent behavior.  That was part
of the motivation for :pep:`630` and friends, so this is not a new
situation nor a consequence of this proposal.

In contrast, when the `proposed API <proposed capi_>`_ is used to
create multiple interpreters, the default behavior will change for
some extensions.  In that case, importing an extension will fail
(outside the main interpreter) if it doesn't indicate support for
multiple interpreters.  For extensions that already break in
multiple interpreters, this will be an improvement.

Now we get to the break in compatibility mentioned above.  Some
extensions are safe under multiple interpreters, even though they
haven't indicated that.  Unfortunately, there is no reliable way for
the import system to infer that such an extension is safe, so
importing them will still fail.  This case is addressed in
`Extension Module Compatibility`_ above.

Extension Module Maintainers

One related consideration is that a per-interpreter GIL will likely
drive increased use of multiple interpreters, particularly if :pep:`554`
is accepted.  Some maintainers of large extension modules have expressed
concern about the increased burden they anticipate due to increased
use of multiple interpreters.

Specifically, enabling support for multiple interpreters will require
substantial work for some extension modules.  To add that support,
the maintainer(s) of such a module (often volunteers) would have to
set aside their normal priorities and interests to focus on
compatibility (see :pep:`630`).

Of course, extension maintainers are free to not add support for use
in multiple interpreters.  However, users will increasingly demand
such support, especially if the feature grows
in popularity.

Either way, the situation can be stressful for maintainers of such
extensions, particularly when they are doing the work in their spare
time.  The concerns they have expressed are understandable, and a partial
solution is described in `Restricting Extension Modules`_ above.

Alternate Python Implementations

Other Python implementations are not required to provide support for
multiple interpreters in the same process (though some do already).

Security Implications

There is no known impact to security with this proposal.


On the one hand, this proposal has already motivated a number of
improvements that make CPython *more* maintainable.  That is expected
to continue.  On the other hand, the underlying work has already
exposed various pre-existing defects in the runtime that have had
to be fixed.  That is also expected to continue as multiple interpreters
receive more use.  Otherwise, there shouldn't be a significant impact
on maintainability, so the net effect should be positive.


The work to consolidate globals has already provided a number of
improvements to CPython's performance, both speeding it up and using
less memory, and this should continue. Performance benefits to a
per-interpreter GIL have not been explored.  At the very least, it is
not expected to make CPython slower (as long as interpreters are
sufficiently isolated).

How to Teach This

This is an advanced feature for users of the C-API.  There is no
expectation that this will be taught.

That said, if it were taught then it would boil down to the following:

    In addition to Py_NewInterpreter(), you can use Py_NewInterpreterEx()
    to create an interpreter.  The config you pass it indicates how you
    want that interpreter to behave.

.. XXX We should add docs (a la PEP 630) that spell out how to make
   an extension compatible with per-interpreter GIL.

Reference Implementation


Open Issues

* What to do about the allocators?
* How would a per-interpreter tracemalloc module relate to global allocators?
* Would the faulthandler module be limited to the main interpreter
  (like the signal module) or would we leak that global state between
  interpreters (protected by a granular lock)?
* Split out an informational PEP with all the relevant info,
  based on the "Consolidating Runtime Global State" section?
* Does supporting multiple interpreters automatically mean an extension
  supports a per-interpreter GIL?
* What would be a better (scarier-sounding) name
  for ``allow_all_extensions``?

Deferred Functionality

* ``PyInterpreterConfig`` option to always run the interpreter in a new thread
* ``PyInterpreterConfig`` option to assign a "main" thread to the interpreter
  and only run in that thread

Rejected Ideas


Extra Context

Sharing Global Objects

We are sharing some global objects between interpreters.
This is an implementation detail and relates more to
`globals consolidation <Consolidating Runtime Global State_>`_
than to this proposal, but it is a significant enough detail
to explain here.

The alternative is to share no objects between interpreters, ever.
To accomplish that, we'd have to sort out the fate of all our static
types, as well as deal with compatibility issues for the many objects
`exposed in the public C-API <capi objects_>`_.

That approach introduces a meaningful amount of extra complexity
and higher risk, though prototyping has demonstrated valid solutions.
Also, it would likely result in a performance penalty.

`Immortal objects <Depending on Immortal Objects_>`_ allow us to
share the otherwise immutable global objects.  That way we avoid
the extra costs.

.. _capi objects:

Objects Exposed in the C-API

The C-API (including the limited API) exposes all the builtin types,
including the builtin exceptions, as well as the builtin singletons.
The exceptions are exposed as ``PyObject *`` but the rest are exposed
as the static values rather than pointers.  This was one of the few
non-trivial problems we had to solve for per-interpreter GIL.

With immortal objects this is a non-issue.

Consolidating Runtime Global State

As noted in `CPython Runtime State`_ above, there is an active effort
(separate from this PEP) to consolidate CPython's global state into the
``_PyRuntimeState`` struct.  Nearly all the work involves moving that
state from global variables.  The project is particularly relevant to
this proposal, so below is some extra detail.

Benefits to Consolidation

Consolidating the globals has a variety of benefits:

* greatly reduces the number of C globals (best practice for C code)
* the move draws attention to runtime state that is unstable or broken
* encourages more consistency in how runtime state is used
* makes it easier to discover/identify CPython's runtime state
* makes it easier to statically allocate runtime state in a consistent way
* better memory locality for runtime state

Furthermore all the benefits listed in `Indirect Benefits`_ above also
apply here, and the same projects listed there benefit.

Scale of Work

The number of global variables to be moved is large enough to matter,
but most are Python objects that can be dealt with in large groups
(like ``Py_IDENTIFIER``).  In nearly all cases, moving these globals
to the interpreter is highly mechanical.  That doesn't require
cleverness but instead requires someone to put in the time.

State To Be Moved

The remaining global variables can be categorized as follows:

* global objects

  * static types (incl. exception types)
  * non-static types (incl. heap types, structseq types)
  * singletons (static)
  * singletons (initialized once)
  * cached objects

* non-objects

  * will not (or unlikely to) change after init
  * only used in the main thread
  * initialized lazily
  * pre-allocated buffers
  * state

Those globals are spread between the core runtime, the builtin modules,
and the stdlib extension modules.

For a breakdown of the remaining globals, run:

.. code-block:: bash

    ./python Tools/c-analyzer/ Tools/c-analyzer/cpython/globals-to-fix.tsv

Already Completed Work

As mentioned, this work has been going on for many years.  Here are some
of the things that have already been done:

* cleanup of runtime initialization (see :pep:`432` / :pep:`587`)
* extension module isolation machinery (see :pep:`384` / :pep:`3121` / :pep:`489`)
* isolation for many builtin modules
* isolation for many stdlib extension modules
* addition of ``_PyRuntimeState``
* no more ``_Py_IDENTIFIER()``
* statically allocated:

  * empty string
  * string literals
  * identifiers
  * latin-1 strings
  * length-1 bytes
  * empty tuple


As already indicated, there are several tools to help identify the
globals and reason about them.

* ``Tools/c-analyzer/cpython/globals-to-fix.tsv`` - the list of remaining globals
* ``Tools/c-analyzer/``

  * ``analyze`` - identify all the globals
  * ``check`` - fail if there are any unsupported globals that aren't ignored

* ``Tools/c-analyzer/`` - summarize the known globals

Also, the check for unsupported globals is incorporated into CI so that
no new globals are accidentally added.

Global Objects

Global objects that are safe to be shared (without a GIL) between
interpreters can stay on ``_PyRuntimeState``.  Not only must the object
be effectively immutable (e.g. singletons, strings), but not even the
refcount can change for it to be safe.  Immortality (:pep:`683`)
provides that.  (The alternative is that no objects are shared, which
adds significant complexity to the solution, particularly for the
objects `exposed in the public C-API <capi objects_>`_.)

Builtin static types are a special case of global objects that will be
shared.  They are effectively immutable except for one part:
``__subclasses__`` (AKA ``tp_subclasses``).  We expect that nothing
else on a builtin type will change, even the content
of ``__dict__`` (AKA ``tp_dict``).

``__subclasses__`` for the builtin types will be dealt with by making
it a getter that does a lookup on the current ``PyInterpreterState``
for that type.



* :pep:`384` "Defining a Stable ABI"
* :pep:`432` "Restructuring the CPython startup sequence"
* :pep:`489` "Multi-phase extension module initialization"
* :pep:`554` "Multiple Interpreters in the Stdlib"
* :pep:`573` "Module State Access from C Extension Methods"
* :pep:`587` "Python Initialization Configuration"
* :pep:`630` "Isolating Extension Modules"
* :pep:`683` "Immortal Objects, Using a Fixed Refcount"
* :pep:`3121` "Extension Module Initialization and Finalization"


This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.

I just thought of something. I believe when we saw Sam Gross’s nogil presentations (a year ago, and again at the Language Summit) we were pretty skeptical of how much will break in 3rd party extension modules like numpy, because they might have some bit of mutable global state (perhaps hidden deep inside some esoteric corner of the code) that is protected by the GIL. Wouldn’t such cases also be broken when multiple interpreters have their own GIL?

The hypothetical case would be something like

    static bool
    is_py_file(PyObject *obj)
    {
        assert(obj != NULL);
        /* Process-wide statics: shared by every interpreter and thread. */
        static PyObject *cache = NULL;
        static bool cache_is_py_file = false;
        if (obj != cache) {
            cache = obj;
            cache_is_py_file = <expensive computation>;
        }
        return cache_is_py_file;
    }

Such code could be hidden in many places, and is currently protected by the GIL, even when using multiple interpreters.
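For comparison, here is a rough sketch of what the PEP 630 style of fix might look like for a cache of this kind: the two statics move into a state struct that each interpreter's copy of the module would own.  This is a standalone model, not extension code. Plain C strings replace ``PyObject *``, a ``.py`` suffix test stands in for the expensive computation, and all ``sketch_*`` names are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-interpreter module state replacing the two statics. */
typedef struct {
    const char *cache;
    bool cache_is_py_file;
    unsigned computations;   /* only here so the caching is observable */
} sketch_module_state;

/* Stand-in for the expensive computation. */
static bool
sketch_expensive_check(const char *name)
{
    size_t len = strlen(name);
    return len >= 3 && strcmp(name + len - 3, ".py") == 0;
}

bool
sketch_is_py_file(sketch_module_state *state, const char *name)
{
    assert(state != NULL && name != NULL);
    if (name != state->cache) {
        state->cache = name;
        state->cache_is_py_file = sketch_expensive_check(name);
        state->computations++;
    }
    return state->cache_is_py_file;
}
```

Each interpreter then touches only its own state under its own GIL, so the cache needs no extra lock.  Finding every such static in a large extension is of course the hard part.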


There was one main point of discussion from the first round that doesn’t seem to have been resolved yet: supports-multiple-interpreters vs. supports-per-interpreter-gil. I figured it would be simplest to start that back up here on DPO.

Below I’ve included the last bit of conversation on the topic from the python-dev thread (it is already my reply to Petr, so I won’t follow up on it). I’ve also included below a comment Petr left on the most recent PEP 684 PR. I’ll post a reply to that separately.


> > Now we get to the break in compatibility mentioned above.  Some
> > extensions are safe under multiple interpreters, even though they
> > haven't indicated that.  Unfortunately, there is no reliable way for
> > the import system to infer that such an extension is safe, so
> > importing them will still fail.  This case is addressed in
> > `Extension Module Compatibility`_ below.
> Extensions that use multi-phase init should already be compatible with
> multiple interpreters. Multi-phase init itself is the flag that
> indicates this.

Correct, and ExtensionFileLoader will use that if
PyInterpreterConfig.strict_extensions is set, regardless of whether or
not we need a second extension module indicator for
I-said-I-was-isolated-but-now-I-really-mean-it (per-interpreter GIL).

> But they might not be compatible with *per-interpreter GIL*. I don't
> like how that's conflated with multiple interpreters here.

Hmm, I suppose in my mind they *have* been the same thing. :)

> For example, extension modules can currently support multiple
> interpreters, but rely on the GIL to protect calls to a non-threadsafe
> library, access shared memory, etc. As an example, the PEP 630 “opt-out”
> is not thread-safe.

So, you are saying that some multi-phase init extensions may still be
relying on the GIL as a lock for some shared state.  In the case of
your "opt-out" example, there is a possible (albeit super unlikely)
race on "loaded".  So such an extension needs to be able to separately
opt in to being used without a GIL between interpreters.  Is all that

Out of curiosity, do you have any examples of extensions that
implement multi-phase init but need to opt out (like in your example)?
 Is it only the case where the maintainer is in the process of
isolating the module, so the opt-out is temporary?

Aside from the unsafe-flag-to-indicate-not-isolated case, do you know
of any other examples where a module is safe for use between
interpreters but still relies on the shared GIL?  I'm struggling to
imagine such a scenario, but where they don't also opt out of
multi-interpreter support already.

FWIW, my assumption is that, if an extension has been made isolated
enough for use between multiple interpreters, then it is extremely
likely that it is also isolated enough to use without a GIL shared
between the interpreters.

> It seems to me that there should be a separate flag (slot) to indicate
> support for per-interpreter GIL, and the `strict_extensions` bit should
> work with that.

I think I see what you are saying.  My concern is that anything beyond
the default settings is an obstacle for extension maintainers, so
opt-in is especially painful if it ends up being the common case.  Of
course, it may be unavoidable in the end.

While we may end up needing a flag to indicate
yes-I'm-isolated-but-not-that-isolated, the following alternative came
to mind:

* multi-phase init extensions should be expected to be isolated, thus
compatible with use in multiple interpreters (already true)
* multi-phase init extensions should be expected to be fully isolated,
thus compatible with per-interpreter GIL
* there would be a new module def slot that a multi-phase init
extension can use to explicitly opt out
* the ExtensionFileLoader would enforce loading the module only once
in the process (and use a dedicated granular lock to prevent races)

Instead of using its own static "loaded" variable, the module in your
opt-out example would use this new slot.

To me, really-truly-fully-isolated is the sensible long-term default
for multi-phase init.  Most maintainers will implement multi-phase
init, using PEP 630 to get isolated enough.  We can avoid the extra
step for the common case.  So our future selves would be much happier
if we go with an explicit opt-out now. :)  This follows my earlier
assumption that few extensions will be safe in multi-interpreter but
not per-interpreter GIL.

FWIW, I was going to say perhaps we could get away with treating the
vast majority of extensions as already safe in multiple interpreters,
to avoid requiring extensions to implement multi-phase init.  However,
I can already think of a number of relatively common cases where that
isn't true (e.g. static types). :)

Plus, multi-phase init is such a good thing and doesn't require that
much effort (especially if we provide an opt-out slot for
multi-interpreter support).  Per-interpreter GIL would be a pretty
good carrot. :)


I sense a misunderstanding here :frowning:
Could you please post in the thread again, and point out what’s still unresolved? I don’t want to repeat too much of what’s in the thread, but in short, I think that:

  • Currently, there is no “I-said-I-was-isolated” flag, so it doesn’t make much sense to me to talk about adding an “I-said-I-was-isolated-but-now-I-really-mean-it” one. Multi-phase init modules are assumed to support multiple interpreters. Today, it is possible to support multiple interpreters well without making modules fully isolated.
  • Using the GIL as a general lock for calling non-thread-safe API is currently OK even with really-truly-fully-isolated modules. (My example of non-threadsafe API was perhaps too simple, but while holding the GIL for I/O is a bad idea, it does work as far as locking is concerned. Jim said that on that thread. Substitute any non-threadsafe API – say, inet_ntoa)

tl;dr There does seem to be a difference between supports-multiple-interpreters and supports-per-interpreter-gil, which directly correlates to the thread-safety of the internal global state of linked libraries. Impact: TBD. Solution: TBD

On 22-Sep-2022, @encukou said:

  • Currently, there is no “I-said-I-was-isolated” flag, so it doesn’t make much sense to me to talk about adding an “I-said-I-was-isolated-but-now-I-really-mean-it” one. Multi-phase init modules are assumed to support multiple interpreters. Today, it is possible to support multiple interpreters well without making modules fully isolated.

I’m not clear on how a module can support multiple interpreters without it being fully isolated. By “isolated” I’m talking about the extension module’s state. However, I do also consider the state of linked libraries to be, effectively, part of the module’s state. (Perhaps you don’t?) In what other ways might a module be isolated? Basically, what do you mean by “fully”?

Note that cryptography ran into the issue of process-global state in a linked library seven years ago. They had to deal with openssl’s internal state in a per-interpreter way. That’s independent of a per-interpreter GIL. To me that means you can’t say a module is compatible with multiple interpreters (“isolated”) without factoring in the process-global state of any libraries to which it links.

The challenge here is that a library with internal global state may not provide a way by which anyone else can manage that state. While we were able to solve this for extension modules themselves with “module state” (via PEP 3121), we don’t have that option when it comes to a linked library (e.g. openssl). If a library doesn’t have an API to manage that state then the extension is out of luck. It can maybe try to get a solution from upstream, but in the meantime it can’t say it supports use in multiple interpreters. That’s likely an uncommon case, but, unfortunately, PEP 3121 (and subsequent PEPs) did not address this. Honestly, I expect no one really thought about it.

Regardless, this situation isn’t exclusive to per-interpreter GIL, as demonstrated by the cryptography-openssl case.

  • Using the GIL as a general lock for calling non-thread-safe API is currently OK even with really-truly-fully-isolated modules. (My example of non-threadsafe API was perhaps too simple, but while holding the GIL for I/O is a bad idea, it does work as far as locking is concerned. Jim said that on that thread. Substitute any non-threadsafe API – say, inet_ntoa)

First of all, thanks for clarifying. To be sure I understand, you’re saying that a module may rely on the GIL for thread-safety when accessing otherwise thread-unsafe resources (whether data races or reentrancy or otherwise). Using the inet_ntoa() example, its implementation may use a static buffer internally so a data race could lead to an invalid result. Currently, if two interpreters are running in parallel then the GIL keeps them from calling inet_ntoa() at the same time. If interpreters no longer share the GIL then the race condition may occur.
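The aliasing behind that race can be seen even in a single thread. The following is only a minimal sketch, not the real `inet_ntoa()`: `toy_ntoa()` and `demo_aliasing()` are hypothetical names, and `toy_ntoa()` simply formats into one static buffer the way many C libraries do. Every caller gets the same pointer back, so a second call silently changes what the first caller is looking at.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a library call that returns a pointer to
 * one static buffer shared by every caller in the process. */
const char *
toy_ntoa(uint32_t addr)
{
    static char buf[16];  /* "255.255.255.255" plus the terminating NUL */
    snprintf(buf, sizeof(buf), "%u.%u.%u.%u",
             (unsigned)((addr >> 24) & 0xff), (unsigned)((addr >> 16) & 0xff),
             (unsigned)((addr >> 8) & 0xff), (unsigned)(addr & 0xff));
    return buf;
}

/* Shows that two results share storage: the second call overwrites
 * the text the first pointer refers to. */
void
demo_aliasing(void)
{
    const char *first = toy_ntoa(0x7f000001u);   /* "127.0.0.1" for now */
    const char *second = toy_ntoa(0x0a000002u);  /* overwrites the buffer */
    assert(first == second);
    assert(strcmp(first, "10.0.0.2") == 0);
}
```

With a shared GIL the two calls can never interleave between interpreters. Without one, the overwrite can happen while another interpreter is still reading the buffer.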

I think I understand now and agree this is something the PEP must address directly. While I understand, though, it still seems like this might be the same matter of isolating the global state of a linked library. (I’m not sure that it is the same thing either, but it kind of seems like it at the moment.)

The challenge here is the same as above, except slightly more problematic. A linked library either doesn’t have any internal global state or does a combination of the following:

  1. ensures its internal state is thread-safe
  2. provides an API to manage internal state (and expects the caller to deal with thread-safety)
  3. hides its internal state

(On top of that, as a subtle special case of that last one, there’s also a chance that a linked library follows a relatively common C convention for temporary data, using static variables for things like buffers. The above inet_ntoa() example falls in this category.)

Let’s say an extension were used in multiple interpreters that do not share a GIL. Here’s what they would have to do differently in each of those cases, respectively:

  1. nothing
  2. wrap those calls (and direct mutation of state) with a lock
  3. wrap all calls that use that state with a lock

Ideally, all linked libraries with internal global state would match (1). However, that’s not the status quo for a non-zero number of extension modules. The affected maintainers could encourage the upstream libraries to transition to (1), but in the meantime their extensions would need to add locks to support use in multiple interpreters (that don’t share a GIL).
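As a rough illustration of "add locks" for cases (2) and (3), here is a sketch of a wrapper that serializes calls to `inet_ntoa()` and copies the result out before releasing the lock. The names `ntoa_lock` and `locked_ntoa()` are made up for this example, and the approach assumes that every caller in the process goes through the wrapper. A lock like this does nothing for code that calls the library directly.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <pthread.h>
#include <string.h>

/* One process-wide lock that stands in for the protection the shared
 * GIL used to provide for this particular library call. */
static pthread_mutex_t ntoa_lock = PTHREAD_MUTEX_INITIALIZER;

/* Copy the dotted-quad form of addr into out.
 * Returns 0 on success and -1 if out is too small. */
int
locked_ntoa(struct in_addr addr, char *out, size_t outlen)
{
    assert(out != NULL);
    int rc = -1;
    pthread_mutex_lock(&ntoa_lock);
    const char *text = inet_ntoa(addr);  /* may point at static storage */
    size_t len = strlen(text);
    if (len < outlen) {
        memcpy(out, text, len + 1);      /* copy before releasing the lock */
        rc = 0;
    }
    pthread_mutex_unlock(&ntoa_lock);
    return rc;
}
```

Where a reentrant variant exists (for this particular call, `inet_ntop()` writes into a caller-supplied buffer), switching to it avoids the lock entirely.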

So it looks like there is a slight distinction between supports-multiple-interpreters and supports-per-interpreter-gil. I’ll work on a clarification in the PEP and a solution. Suggestions are welcome, as well as pointing out anything I missed.

FWIW, I’ll point out that embedded uses of CPython may already have to deal with this, since the application may use the same linked library without holding the GIL. However, that’s a particularly narrow failure case.

These are all effectively process level concepts so being per-interpreter does not make sense to me. I’d go with whatever approach for each of them is the least amount of work and easiest to maintain.

I don’t see the utility in doing extra work to guarantee per-interpreter within a process granularity for allocators, tracemalloc, or faulthandler.


Extension modules that do not implement multi-phase init (PEP 489) will only be allowed in the main interpreter (by default). Implementing multi-phase init is a promise that the extension supports use in multiple interpreters. Per PEP 489:

Extensions using the new initialization scheme are expected to support subinterpreters and multiple Py_Initialize/Py_Finalize cycles correctly, avoiding the issues mentioned in Python documentation [6]. The mechanism is designed to make this easy, but care is still required on the part of the extension author. No user-defined functions, methods, or instances may leak to different interpreters. To achieve this, all module-level state should be kept in either the module dict, or in the module object’s storage reachable by PyModule_GetState. A simple rule of thumb is: Do not define any static data, except built-in types with no mutable or user-settable class attributes.

The question of if supports-multiple-interpreters is equivalent to supports-per-interpreter-gil is one we must settle as part of this PEP discussion. At this point I think they are the same when you consider an extension module’s state but actually slightly different when you factor in libraries the extension might link in.

Regardless, I was considering yesterday that it may be helpful to allow an extension to implement multi-phase init (e.g. partially/gradually) by indicating that it still can’t be used in multiple interpreters (i.e. its state isn’t fully isolated between interpreters). We’d provide a new moduledef slot (“Py_mod_not_isolated”?) for this.


That would be useful, I think. Currently Cython implements PEP 489 but not full multi-interpreter safe storage. It would be possible to disable PEP 489 imports in Cython, but that would also break a few other Python-compatibility features (e.g. providing __file__ at module init time).

Quick follow-up to this: Cython has its own check that its modules aren’t running in multiple interpreters (which’ll raise an ImportError rather than cause mysterious crashes). So while I still think it’d be useful to have a way of indicating that modules can’t run in multiple interpreters, Cython doesn’t need it specifically.

Where a thread enters a C extension, has no prior association with any PyThreadState, and yet is directed to an object in a particular interpreter, how does it become clothed in a PyThreadState from the correct interpreter? This, I think, is already a problem with the implementation of multiple interpreters, but we may be largely saved by the lack of concurrency and the way (I think) such threads always end up in the main interpreter in PyGILState_Ensure().

As a concrete example, I would cite the callback in user-defined functions in sqlite3. The implementation is guarded as recommended, but what if create_function had been called, and the callback defined, in some interpreter other than the one now handling it?

In this case, the design allows application-specific data (callback_context) to accompany the definition, which then returns in the sqlite3_context object. We could stash the (id of the) interpreter there. It isn’t examined until after PyGILState_Ensure() currently but I don’t think that’s fundamental.
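To make the idea concrete, here is a toy model in plain C. It is a sketch under loud assumptions: `callback_context`, `toy_interp`, `find_interp()` and `dispatch_callback()` are all hypothetical, none of them is a CPython or sqlite3 structure, and the "interpreter table" merely stands in for whatever lookup the runtime would provide. The point is only that the context records which interpreter registered the callback, so the callback need not default to the main interpreter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-registration context, handed back to the callback
 * as its user data. */
typedef struct {
    int64_t interp_id;  /* interpreter that registered the callback */
    void *callable;     /* opaque: owned by that interpreter */
} callback_context;

/* Toy model of an interpreter (not a CPython structure). */
typedef struct {
    int64_t id;
    int calls_handled;
} toy_interp;

/* Look up an interpreter by id; NULL if it no longer exists. */
toy_interp *
find_interp(toy_interp *table, size_t n, int64_t id)
{
    for (size_t i = 0; i < n; i++) {
        if (table[i].id == id) {
            return &table[i];
        }
    }
    return NULL;
}

/* Route the callback to the interpreter recorded at registration time.
 * Returns 0 on success and -1 if that interpreter is gone. */
int
dispatch_callback(toy_interp *table, size_t n, const callback_context *ctx)
{
    assert(ctx != NULL);
    toy_interp *interp = find_interp(table, n, ctx->interp_id);
    if (interp == NULL) {
        return -1;
    }
    /* A real callback would acquire a thread state for this
     * interpreter here, before touching any of its objects. */
    interp->calls_handled++;
    return 0;
}
```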

But does every such case have to find its own solution?

I’m turning over the idea, as a general approach, that each function (or object?) that is sensitive to its interpreter context should know explicitly which interpreter it belongs to. Or maybe it is in the f_globals or the containing module __dict__?

This seems to imply that Cython doesn’t support multiple interpreters at all? Is that so? If not, under what conditions does or doesn’t it support them? This would be important since so much of the scientific Python ecosystem is built on Cython.

Would things be different if there was a single GIL-free interpreter (like Sam Gross’s nogil project)?

If this is an OS thread that doesn’t “belong” to any interpreter (yet), in most cases that should probably be an error. The problem doesn’t seem to be dependent on whether each thread has a GIL or not – if there is more than one interpreter active, a thread created by C code outside of Python would have to figure out which interpreter it wants to belong to before it could do anything. I think we currently lean heavily in the direction of always defaulting to the main interpreter, even though that could already be incorrect.

In the new design basically all objects are sensitive to interpreter context, since you cannot touch any object’s refcount without holding the correct interpreter’s GIL. (The exceptions are trivial constants like None or False which will likely be immortal per PEP 683 and can be shared between all interpreters.)

It is an interesting idea to consider a function’s globals as providing a pointer to the interpreter, but I think that’s too late: without the right interpreter’s GIL you cannot even safely follow a pointer from a function object to its globals, let alone do a lookup in that globals dict.


That’s right. The current status is:

  • In the main compilation mode most globals (including extension type objects) are defined as static C variables at the global scope rather than on a module object. I made a PR today that should improve that dramatically but it isn’t the end point. The next step after that involves accessing the module state from PyType_GetModule and similar. That’ll probably be an optional compile-time option since I imagine it’d have a speed cost.
  • In the limited API mode (which is much less tested/supported/complete) the globals are put on a module-state struct. That struct is looked up from PyState_FindModule which by nature implies a single interpreter. As far as I can tell that’s basically the only option with the limited API.

It is an aim to improve this, although I wouldn’t like to put much of a time-line on it.

I don’t think so, but I’m not fully sure of the implications of that. Sam did patch Cython so that it works with his nogil project (although that doesn’t use any special features of the project, it just ensures that Cython modules compile and run with it).

Chiming in with some thoughts we had in HPy regarding “the question of if supports-multiple-interpreters is equivalent to supports-per-interpreter-gil”.

From our NumPy porting effort: not only is the global state of libraries an issue, but extensions may also have, for example, some global cache that does not hold PyObject*, but rather the results of some costly computation that is otherwise pure C. If I understand it correctly, with the GIL, this just works fine even with sub-interpreters. Without the GIL, you’d have to put your own lock around accesses to this cache or put the cache into module state.
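The second option (moving the cache into module state) might look roughly like the sketch below. It is plain C with no CPython API: `module_state`, `expensive()` and `cached_compute()` are hypothetical names, and `module_state` only plays the role of the per-module storage an extension would normally reach through `PyModule_GetState`. Because each interpreter gets its own module instance, each gets its own cache and no lock is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-module (and therefore per-interpreter) state that
 * replaces a process-global static cache. */
typedef struct {
    bool valid;            /* false until the first computation */
    long key;              /* input of the cached computation */
    long value;            /* cached result for key */
    unsigned long misses;  /* how often the slow path ran */
} module_state;

/* Stand-in for a costly pure-C computation; returns x * 1000. */
static long
expensive(long x)
{
    long acc = 0;
    for (long i = 0; i < 1000; i++) {
        acc += x;
    }
    return acc;
}

/* One-entry cache that lives in the caller's state, not in a static. */
long
cached_compute(module_state *state, long key)
{
    assert(state != NULL);
    if (!state->valid || state->key != key) {
        state->value = expensive(key);
        state->key = key;
        state->valid = true;
        state->misses++;
    }
    return state->value;
}
```

The trade-off is that interpreters no longer share cached results, so each one pays for the computation once.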

All in all, I think that this discussion shows that the best would be to decouple these things: make them more explicit and more future proof. There can be other “execution modes” in the future, for example, the mentioned no-GIL (putting aside how realistic it is that it lands in main soon), will there be some “multi-phase module init with I-can-even-deal-with-no-gil” thing?

In HPy we want to split the initialization into something like “extension initialization”, where the extension would tell us what expectations it has (e.g., I need the GIL, but per-interpreter is OK), but without making any API calls yet. (For HPy specifically it would also tell us which HPyContext version the extension is compiled against.) Once that is settled, we can actually call the module init (locking the GIL if it told us so, for example, and for HPy also using the right HPy ABI version).

I also think that the “extension initialization” API should be designed in a way that by default the Python engine cannot make any assumptions, i.e., no subinterpreters, GIL required, and it will make some assumptions only if some very explicit flag is set. If I, as an extension dev, have to do something like extension_info->supports_per_subinterpreter_gil = true and I am not sure what this subinterpreter GIL thing means, it should be clear that I had better find out (and supports_per_subinterpreter_gil will be a good place to document that). On the other hand, when I start my new extension by copy-pasting some preexisting extension example that happens to use multi-phase module init, it does not tell me so clearly that there are some assumptions that I am communicating to Python by using multi-phase module init.

Well, I did think about it a bit; the HOWTO says, relatively vaguely:

In these cases, the Python module should provide access to the global state, rather than own it. If possible, write the module so that multiple copies of it can access the state independently (along with other libraries, whether for Python or other languages). If that is not possible, consider explicit locking.

Whether anyone actually does this correctly is another question. In the stdlib, we have readline which IMO should do this, but I was never involved with that module. (Reviewing it is on my TODO list, but, sadly, so low that I’m not likely to get to it.)

Also note that cryptography is a widely used module that gets tested in a lot of different use cases – like with mod_wsgi, which tends to expose lots of issues related to multiple interpreters.
Other modules don’t get that kind of testing, so I wouldn’t be surprised if many were subtly broken.

Yes, PEP 489 says that multi-phase init modules “are expected to support subinterpreters and multiple Py_Initialize/Py_Finalize cycles correctly”, but honestly, when it was written, no one knew what that means. For example, I thought that it’s OK to share objects like ints. We know now that even those must be per-interpreter (or possibly immortal), but for modules that were written in the mean time, interpreting PEP 489 as requiring “perfect” isolation feels like moving the goalposts – even if it is technically correct.

Despite what the spec says, isolation is so hard to test that multi-phase init might not actually be a good indicator for “supporting subinterpreters” in general, not just for the GIL :(

I guess there is a big-ish decision to make. Sub-interpreters & per-interpreter GIL won’t be perfect in 3.12, and we can choose between:

  • supporting all existing modules, but crashing/misbehaving in some race conditions, making the feature seem unstable, or
  • only supporting “updated” modules in subinterpreters with their own GIL, making the feature less useful.

Case in point:

What would this flag do?
Revert to the mechanism used for single-phase init (probably crashing/misbehaving in some cases), or make Python refuse to load the module in non-main interpreter (making multiple interpreters less useful)?
Or refuse to load the module more than once per process, which the HOWTO currently recommends doing manually? (BTW, note how that recipe relies on a global GIL…)
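For reference, a once-per-process guard that does not lean on a shared GIL could be sketched as follows. This is an illustration rather than the HOWTO's recipe: `module_loaded` and `claim_single_load()` are hypothetical names, and the sketch uses a C11 `atomic_flag` so that the test and the set happen as one indivisible step.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Process-wide marker; clear until the module has been loaded once. */
static atomic_flag module_loaded = ATOMIC_FLAG_INIT;

/* Returns 0 for the first caller and -1 for every later one.
 * An extension's exec function would raise ImportError on -1. */
int
claim_single_load(atomic_flag *flag)
{
    assert(flag != NULL);
    if (atomic_flag_test_and_set(flag)) {
        return -1;  /* somebody already claimed the module */
    }
    return 0;
}
```

A plain `static int loaded` checked and then set in two steps leaves a window in which two interpreters without a common GIL could both pass the check.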


Thanks for pointing this out (and for noting the benefit of a multiple-interpreters opt-out).

Would Cython defer to CPython doing the multiple-interpreters check? Is there something inherent to Cython that makes it incompatible with multiple interpreters?

UPDATE: you already answered this. :slight_smile:

Would Cython defer to CPython doing the multiple-interpreters check?

If there was an official mechanism to indicate (in)compatibility then we’d likely use it.

Is there something inherent to Cython that makes it incompatible with multiple interpreters?

I think this was what I already answered. However: I’m currently not sure whether we can support all of the current Cython feature-set with multiple interpreters. I’m mainly thinking about C functions that can access Python globals. (I’m also not sure this is something that Python can help with). So even in future when we support it properly it may be that some Cython modules will never be able to support multiple interpreters. But that’s very much a future problem…

Perhaps it is an error (in sqlite) to define a function in one interpreter and name it in a query submitted from another. I think there are environments where a new platform thread might invoke a call-back but I’m influenced by Java not C in that conviction.

Good point. The interpreter has to be identifiable in a thread-safe way before PyGILState_Ensure() completes. I wonder if only case-specific solutions can exist, in this case in the callback_context.

At this point I’m strongly leaning toward adding a moduledef slot for “supports use in multiple interpreters” (i.e. an opt-in flag). However, I don’t see the point of a distinct “supports per-interpreter GIL” slot since there doesn’t seem to be much interest for one without the other currently.

Contrary to what PEP 489 says, the default would be “does not support use in multiple interpreters”. Ideally the opposite would be the default, but it seems like there are enough extensions out there that would be a problem, even among those that implement multi-phase init.

That said, I expect we could switch the default at some point in the future. With that in mind, it would make sense to add an explicit “does not support use in multiple interpreters” moduledef slot now (matching the current default).

I’ve updated PEP 684 after the last set of feedback. You can see the changes in

The PEP text is still at

Significant changes:

  • settled on keeping the allocators global but requiring that they all be thread-safe
  • the state of the existing “small block” will be moved to PyInterpreterState
  • dropped references to mimalloc
  • simplified the C-API changes
  • clarified the situation with incompatible extension modules
  • proposed that extensions always opt in to per-interpreter GIL support with a new PyModuleDef slot (at least until we have enough evidence that multi-phase init is sufficient)
  • expanded “How to Teach This”

For me the most critical things to settle are:

  • Are we okay to require that the “mem” and “object” allocators be thread-safe, whereas currently we say they can rely on the GIL?
  • Can we avoid making extensions opt in to supporting per-interpreter GIL (if they already implement multi-phase init)?

Open questions (from the PEP):

  • Are we okay to require “mem” and “object” allocators to be thread-safe?
  • How would a per-interpreter tracemalloc module relate to global allocators?
  • Would the faulthandler module be limited to the main interpreter (like the signal module) or would we leak that global state between interpreters (protected by a granular lock)?
  • How likely is it that a module works under multiple interpreters (isolation) but doesn’t work under a per-interpreter GIL?
  • If it is likely enough, what can we do to help extension maintainers mitigate the problem and enjoy use under a per-interpreter GIL?
  • What would be a better (scarier-sounding) name for importlib.util.allow_all_extensions?
