TLDR
Proposal for a parallel-read-access dictionary and potentially other data structures to complement free-threaded Python.
Background
Together with @Rostan, we have recently been experimenting with free-threaded Python. One of the exercises we conducted involved engaging parallel threads in data processing using NLTK (tokenization and lemmatization) and Pillow (image decoding). Interestingly, we did not achieve the performance gains we expected.
Our investigation showed that both of these extensions keep their global state in a dictionary (and we believe this is common practice). As a result, when functions run in parallel, these global-state dictionaries are accessed concurrently using atomics, which slows down the entire process. Atomics were meant to keep such accesses cheap, but they are still slow under contention.
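The access pattern we observed can be approximated with a small benchmark in which several threads do nothing but lookups in one shared module-level dictionary. This is only a rough sketch, not the NLTK or Pillow code: `GLOBAL_STATE`, `reader`, and `run` are made-up names, and the timings depend heavily on the interpreter build and the machine.

```python
import sys
import threading
import time

# Hypothetical stand-in for an extension's global-state dictionary.
GLOBAL_STATE = {f"key{i}": i for i in range(100)}
KEYS = list(GLOBAL_STATE)


def reader(iterations, results, index):
    """Repeatedly read from the shared dictionary and record a checksum."""
    total = 0
    for i in range(iterations):
        total += GLOBAL_STATE[KEYS[i % len(KEYS)]]
    results[index] = total


def run(num_threads, iterations=200_000):
    """Return (elapsed seconds, per-thread checksums) for num_threads readers."""
    results = [None] * num_threads
    threads = [
        threading.Thread(target=reader, args=(iterations, results, i))
        for i in range(num_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start, results


if __name__ == "__main__":
    # sys._is_gil_enabled() exists on Python 3.13+; assume a GIL otherwise.
    gil = getattr(sys, "_is_gil_enabled", lambda: True)()
    print(f"GIL enabled: {gil}")
    for n in (1, 2, 4, 8):
        elapsed, _ = run(n)
        print(f"{n} thread(s): {elapsed:.3f}s")
```

Each thread performs the same amount of work, so with perfect scaling on a free-threaded build the elapsed time would stay roughly flat as threads are added. Growth in elapsed time hints at contention on the shared object.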
Idea
The conclusion above led us to an idea: alongside removing the GIL from Python, a mechanism that makes parallel reads of data structures faster might be useful.
Importantly, our idea goes beyond simple immutability - making an object immutable is possible from the Python frontend. However, to achieve parallel reading, we need to modify the backend API.
So far, we have identified a handful of previously proposed solutions that address this problem to some extent:
- PEP 603 (`frozenmap` type) - Draft,
- PEP 416 (`frozendict` builtin type) - Rejected,
- PEP 351 (The freeze protocol) - Rejected,
- python-frozendict,
- immutabledict,
- MappingProxyType.
As mentioned earlier, existing solutions on PyPI focus on introducing immutable data types (e.g. `immutabledict`) on the Python level. This would likely be insufficient, as they use the underlying implementation of dict, whose C-level API does not allow for parallel reading. Furthermore, PEPs 416 and 351 were rejected in the past for good reasons at the time. However, since work on free-threading is ongoing, the situation has changed, and perhaps now would be a good moment to revisit these discussions. Lastly, PEP 603 proposes a `frozenmap` datatype, which actually aligns with our idea. The downside to the proposed approach is that any modification of a `frozenmap` requires copying the entire object.
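To illustrate why a Python-level read-only wrapper does not change how reads are carried out, here is a small example using `types.MappingProxyType` from the standard library. The proxy is only a view, so lookups still go through the underlying `dict`. The `settings` name and its contents are made up. The last lines imitate the copy-on-modify behaviour described above using a plain dict copy; they are not the PEP 603 API.

```python
from types import MappingProxyType

settings = {"lang": "en", "workers": 4}   # hypothetical global state
frozen_view = MappingProxyType(settings)  # read-only view, not a copy

try:
    frozen_view["lang"] = "de"            # mutation through the view fails
except TypeError as exc:
    print("rejected:", exc)

# The view is backed by the same dict, so changes to the original show through.
settings["workers"] = 8
print(frozen_view["workers"])

# Imitation of copy-on-modify: a "modified" mapping is a whole new object.
updated = MappingProxyType({**frozen_view, "lang": "de"})
print(updated["lang"], frozen_view["lang"])
```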
Beyond the idea of `frozenmap` presented in PEP 603, we are also considering other (bolder) solutions, which are not restricted to dictionaries. The inspiration for those is PEP 351. Unlike `frozenmap`, the two proposed approaches do not copy the whole object when it is modified. The ideas are:
- Introducing `obj.freeze()` / `obj.unfreeze()` methods [1] for built-in data types. Upon calling `freeze()`, the interpreter would lock the object against modification and allow parallel read access.
- Introducing dunder methods `__freeze__` / `__unfreeze__`, which could correspond to a keyword like `freeze obj` (similarly to `del obj`).
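The user-visible behaviour of the first idea can be mimicked in pure Python, as sketched below. `FreezableDict` and `_guarded` are hypothetical names for illustration only. This sketch merely rejects mutation while frozen; it provides none of the parallel-read speedup, which would require changes in the interpreter itself.

```python
def _guarded(name):
    """Wrap a dict mutator so that it raises while the instance is frozen."""
    def method(self, *args, **kwargs):
        if self._frozen:
            raise TypeError(f"cannot call {name}() on a frozen dictionary")
        return getattr(dict, name)(self, *args, **kwargs)
    method.__name__ = name
    return method


class FreezableDict(dict):
    """Hypothetical dict with freeze()/unfreeze(); illustration only."""

    _frozen = False

    def freeze(self):
        self._frozen = True

    def unfreeze(self):
        self._frozen = False


# Guard every in-place mutator of dict (Python 3.9+ for __ior__).
for _name in ("__setitem__", "__delitem__", "__ior__", "update",
              "pop", "popitem", "clear", "setdefault"):
    setattr(FreezableDict, _name, _guarded(_name))

registry = FreezableDict(png="PngDecoder")
registry["jpeg"] = "JpegDecoder"      # initialization phase: writes allowed
registry.freeze()
try:
    registry["gif"] = "GifDecoder"    # rejected while frozen
except TypeError as exc:
    print("rejected:", exc)
registry.unfreeze()
registry["gif"] = "GifDecoder"        # allowed again, no copy was made
```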
We believe that these approaches fit well in scenarios like the one described at the top (i.e. keeping global state in a dictionary or any other data type). In such cases it’s natural to have an initialization phase first, then freeze the data and read from it, as opposed to overwriting and reassigning a global variable.
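The initialize, freeze, then read workflow could look roughly like the following. Since no `freeze()` exists today, `types.MappingProxyType` stands in for the frozen state here; it only blocks writes through the view and gives none of the parallel-read benefits we are after. `build_lemma_table`, `LEMMAS`, and `lemmatize` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor
from types import MappingProxyType


def build_lemma_table():
    """Initialization phase: a single thread fills a plain dict."""
    table = {}
    for word, lemma in [("running", "run"), ("mice", "mouse"), ("better", "good")]:
        table[word] = lemma
    return table


# "Freeze" once after initialization; from here on the state is only read.
LEMMAS = MappingProxyType(build_lemma_table())


def lemmatize(word):
    """Read phase: worker threads only perform lookups."""
    return LEMMAS.get(word, word)


words = ["running", "mice", "better", "cat"]
with ThreadPoolExecutor(max_workers=4) as pool:
    lemmas = list(pool.map(lemmatize, words))
print(lemmas)
```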
How do you feel about the idea of having a parallel-read-access dictionary? Would it be a good idea to expand this to other (perhaps all) built-in data types? Or do you think we should explore an entirely different approach to solving parallel access issues?
[1] We acknowledge that there is some controversy around calling this operation `freeze` because, depending on the implementation, it may actually involve copying. Since our proposal involves no copy, we are keeping this name. Naturally, we are not tightly attached to it and the final name would be decided at a later stage.