Pre-PEP: Safe Parallel Python

Python already has a memory model, not in the language reference, where you’d expect to find it, but in the Library and Extension FAQ:

What kinds of global value mutation are thread-safe?

[…] In practice, it means that operations on shared variables of built-in data types (ints, lists, dicts, etc) that “look atomic” really are.

For example, the following operations are all atomic (L, L1, L2 are lists, D, D1, D2 are dicts, x, y are objects, i, j are ints):

L.append(x)
L1.extend(L2)
x = L[i]
x = L.pop()
L1[i:j] = L2
L.sort()
x = y
x.field = y
D[x] = y
D1.update(D2)
D.keys()

…followed by a warning about __del__ re-entrancy (but not about more common sources of re-entrancy, such as comparisons) and the exhortation “When in doubt, use a mutex!” (with no mention of deadlocks).
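
As a concrete (and hedged) illustration of that advice, not taken from the FAQ itself: each individual L.append(x) above is atomic, but a read-modify-write such as counter["n"] += 1 is not, because it decomposes into separate load and store steps that can interleave between threads.

```python
import threading

items = []                 # appends are individually atomic per the FAQ list above
counter = {"n": 0}         # but += on a dict entry is NOT one atomic operation
lock = threading.Lock()    # "When in doubt, use a mutex!"

def worker():
    for i in range(100_000):
        items.append(i)    # safe without a lock: a single "looks atomic" operation
        with lock:         # the increment is load + add + store, so guard it
            counter["n"] += 1

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert len(items) == 4 * 100_000
assert counter["n"] == 4 * 100_000
```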

Another guarantee that isn’t documented: CPython issue bpo-13521 / gh-57730 made dict.setdefault atomic because it “was intended to be atomic” and deployed code depended on it.[1]
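
For example (my own illustration, not code from that issue), setdefault’s atomicity lets it act as a lock-free “insert if absent” primitive: threads racing to register the same key all end up sharing the one value that won.

```python
import threading

registry = {}

def get_or_create(key):
    # dict.setdefault does the lookup-or-insert as one atomic step, so
    # concurrent callers for the same key all get back the same list object.
    return registry.setdefault(key, [])

def worker(i):
    get_or_create("jobs").append(i)   # append is atomic too, so no item is lost

threads = [threading.Thread(target=worker, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert list(registry) == ["jobs"]
assert sorted(registry["jobs"]) == list(range(8))
```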

PEP 703 (free-threading) contains statements like:

Still, the GIL ensures that some operations are effectively atomic. For example, the constructor list(set) atomically copies the items of the set to a new list, and some code relies on that copy being atomic (i.e., having a snapshot of the items in the set). This PEP preserves that property.

which is also not documented, but is a similar “built-in types, looks atomic, is atomic” operation. Unfortunately, PEP 703 makes hand-wavy statements like “per-object locking aims for similar protections as the GIL” instead of listing, even incompletely, the protections it aims to provide, though its thorough implementation details do implicitly list some of them.
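
Here is a sketch of the kind of code that statement is protecting (my example, not one from the PEP): one thread repeatedly snapshots a set while another mutates it, relying on list(set) neither raising “set changed size during iteration” nor returning a half-updated view.

```python
import threading

active = {1, 2, 3}
done = threading.Event()

def mutator():
    n = 4
    while not done.is_set():
        active.add(n)
        active.discard(n - 3)
        n += 1

def reporter():
    for _ in range(1000):
        # Relies on list(set) being an atomic copy: it should neither raise
        # "set changed size during iteration" nor return a half-updated view,
        # even though mutator() is changing the set concurrently.
        snapshot = list(active)
        assert 3 <= len(snapshot) <= 4

t = threading.Thread(target=mutator)
t.start()
try:
    reporter()
finally:
    done.set()
    t.join()
```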

I have three points to make from all this:

  • Python already makes some guarantees that deployed code relies upon, but there isn’t a single complete list of them.
  • Python either must uphold these guarantees, or ought to explicitly repudiate them (so code relying on them can be fixed).
  • Given that Python already provides these guarantees, if most user code can live with shared-implies-immutable, Python may not need special concurrent data structures similar to Java’s (a rough sketch of that discipline follows this list).
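
To make that last point concrete, here is a rough sketch of the shared-implies-immutable discipline as I understand it (my illustration, not code from the pre-PEP): data visible to several threads is an immutable value, and a thread that wants a different version builds a new object rather than mutating the shared one.

```python
import threading

# Shared state is immutable: a tuple of (key, value) pairs that no thread mutates.
CONFIG = (("retries", 3), ("timeout", 30.0))

results = []   # list.append is one of the "looks atomic" operations quoted above

def worker(overrides):
    # A thread that wants a different configuration builds a new, thread-local
    # object from the shared immutable value instead of mutating it in place.
    local = dict(CONFIG)
    local.update(overrides)
    results.append(tuple(sorted(local.items())))

threads = [threading.Thread(target=worker, args=({"timeout": t},)) for t in (5.0, 10.0)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))
```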

For completeness, I feel I should mention PEP 583, “A Concurrency Memory Model for Python”, from 2008. It surveyed the options as understood at the time rather than making a concrete proposal. One notable option is unconditional[2] sequential consistency, which most languages would reject out of hand for performance reasons.

PEP 583 contains one occurrence of the word “mutable” and zero instances of the word “immutable”, so it doesn’t have much bearing on the current proposal.


Personally, I think a memory model is essential, because without one it’s impossible to determine whether code is correct.[3] Python has a high abstraction level, so its memory model can be much simpler than Java’s or C++’s. Python is very dynamic, so the model can[4] be enforced at runtime rather than relying on documentation, linters, and sanitizers. And the fact that we already have one, even with the GIL, shows that we can’t really get away without one.
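
As a hedged illustration of why, consider the classic message-passing litmus test: whether this code is correct simply cannot be answered without a memory model, because the answer depends on whether the write to data is guaranteed to become visible before the write to ready.

```python
import threading

data = 0
ready = False

def producer():
    global data, ready
    data = 42
    ready = True   # may another thread observe this before it observes data = 42?

def consumer():
    while not ready:   # spin until the flag appears set
        pass
    print(data)        # prints 42 only if the model orders the two writes

threads = [threading.Thread(target=consumer), threading.Thread(target=producer)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Under today’s GIL this reliably prints 42; whether it must keep doing so under free-threading is exactly the kind of question only a documented model can answer.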


While I hope Python doesn’t have to get down into these details, if you’re curious about them, I enjoyed Russ Cox’s articles on hardware memory models (sequential consistency, x86, ARM, DRF-SC) and programming language memory models (Java, C++, and JavaScript).


  1. And I’ve since written code that depends on it, so please don’t break it. :slight_smile: ↩︎

  2. not the conditional “… for data-race-free programs” often found in hardware memory models ↩︎

  3. and I expect PEP 703 to fail in practice because it does not attempt to define one ↩︎

  4. I hope it can, and this pre-PEP seems to propose to do so ↩︎
