Hello community! While Python isn’t a fast language, that’s no reason to skip useful optimizations that benefit everyone’s workload. I recently decided to pursue an idea I’ve had for some time: dict constant folding. The basic idea is to enable the same tuple/list/set constant folding optimization for dictionaries. Instead of constructing them each time, use a LOAD_CONST
in safe contexts. To achieve this, I propose introducing a simple frozendict type.
Demo fork: GitHub - Zaczero/cpython at dictfold
TL;DR
>>> def foo(bar): return {'a': 1, 'b': 2}.get(bar)
...
>>> dis.dis(foo)
1 RESUME 0
LOAD_CONST 0 ('a')
LOAD_SMALL_INT 1
LOAD_CONST 1 ('b')
LOAD_SMALL_INT 2
BUILD_MAP 2
LOAD_ATTR 1 (get + NULL|self)
LOAD_FAST_BORROW 0 (bar)
CALL 1
RETURN_VALUE
becomes
>>> def foo(bar): return {'a': 1, 'b': 2}.get(bar)
...
>>> dis.dis(foo)
1 RESUME 0
LOAD_CONST 0 (frozendict({'a': 1, 'b': 2}))
LOAD_ATTR 1 (get + NULL|self)
LOAD_FAST_BORROW 0 (bar)
CALL 1
RETURN_VALUE
dict constant folding:
During bytecode generation/optimization, apply a technique similar to existing tuple/list/set constant folding. In safe, side-effect-free contexts with provably constant dictionaries, replace their build instructions with a single LOAD_CONST
. The following example is not constant and therefore not optimized:
>>> def foo(bar): return {'a': 1, 'b': bar}.get(bar)
...
>>> dis.dis(foo)
1 RESUME 0
LOAD_CONST 0 ('a')
LOAD_SMALL_INT 1
LOAD_CONST 1 ('b')
LOAD_FAST_BORROW 0 (bar)
BUILD_MAP 2
LOAD_ATTR 1 (get + NULL|self)
LOAD_FAST_BORROW 0 (bar)
CALL 1
RETURN_VALUE
frozendict:
Initially, I wanted this to work without introducing a new type. It quickly became apparent that one would be needed. The amount of patching and workarounds required to support unhashable types in LOAD_CONST
is simply too much, not to mention how fragile it would be against accidental mutations.
Beyond the optimization benefits, frozendict addresses a real gap in Python’s type system.
My draft looks like this:
class frozendict(object)
| frozendict() -> empty frozendict
| frozendict(mapping, /, **kwargs) -> immutable copy of mapping
| frozendict(iterable, /, **kwargs) -> from an iterable of (key, value) pairs
| frozendict(**kwargs) -> mapping built from keyword arguments
|
| Return an immutable mapping containing the supplied key/value pairs. Keys
| and values must be hashable; TypeError is raised otherwise. The result
| preserves insertion order, provides the standard read-only mapping API, and
| is itself hashable.
|
| Methods defined here:
|
| ... truncated magic methods ...
|
| get(...)
|
| items(self, /)
|
| keys(self, /)
|
| values(self, /)
Like frozenset, frozendict can only be initialized and cannot be modified. It’s always hashable, which means all its keys and values must be hashable too. It’s conditionally hashable, based on whether all of its values are hashable (just like tuple). It supports the same constructors as dict and exposes the read-only methods of dict.
Later, it should include update utilities to make code like frozendict | dict
possible. Maybe an .update(...)
method returning an updated copy would be nice. For now, I didn’t bother.
I propose that frozendict should be a new builtin type. While I understand adding builtins is discouraged, in this case it seems like the right choice.
- Having a frozendict will logically complete this builtin pattern:
Type | Mutable | Immutable |
---|---|---|
Sequence | list |
tuple |
Set | set |
frozenset |
Mapping | dict |
frozendict |
- This type will also appear in .pyc files. It feels wrong for these to contain types that require importing additional modules. It will be marshalable under the
}
key, bumping the marshal version 5→6:
Type | Mutable | Immutable |
---|---|---|
Sequence | [ |
( ) |
Set | < |
> |
Mapping | { |
} |
Ref: cpython/Python/marshal.c at dictfold · Zaczero/cpython · GitHub