PEP 814: Add frozendict built-in type

hello -

I’d like to chime in with support for this proposal, and I invite everyone here to look at SQLAlchemy’s current frozendict implementation which can be seen in its cythonized version at
https://github.com/sqlalchemy/sqlalchemy/blob/83ded020c92609c3dd228abdc4d8db96d7a7e915/lib/sqlalchemy/util/_immutabledict_cy.py#L109
where we call it “immutabledict”. SQLAlchemy uses “immutabledict” all over the place to represent default sets of options for many functions and classes. It of course provides a great solution to the “dictionary as default kw argument” problem, but we also have lots of cool constructor patterns we use to build up immutabledicts from plain mappings, like a merge_with() method that receives any number of mappings and merges them in. Here’s an example (I would link to them, but there seems to be a limit on links):

    if execution_options:
        execution_options = util.EMPTY_DICT.merge_with(
            execution_options,
            {
                "_sa_orm_load_options": load_options,
            },
        )
    else:
        execution_options = {
            "_sa_orm_load_options": load_options,
        }

In particular, the pattern of having a global EMPTY_DICT that we can use as an “empty mapping” default argument anywhere we want, and then building it up with union / | / merge_with(), is one of my favorite patterns in the whole codebase. The example below paraphrases a common case: a class with a class-level default empty dictionary, whose instances are used as elements in a larger composition with other objects that use the same dictionary attribute but support an instance-level version of it. By building on “immutabledict” and using “union” or “merge_with” as needed, we greatly reduce how many new dictionaries we need to create in practice, which gives significant performance and memory savings at scale. Issues where these shared dictionaries leak details across different objects are nonexistent. CompilerColumnElement is some atomic unit within a larger composition:

class CompilerColumnElement:

    __slots__ = ()

    _propagate_attrs = util.EMPTY_DICT

    ...

Note the slots above: they mean we can’t set _propagate_attrs at the instance level. How is this useful? Well, CompilerColumnElement is only a building block inside of other kinds of objects, lots of which don’t use slots, so as we build up objects they do things that are effectively equivalent to this:

class SomeSQLThing:
    def __init__(self, *elements):
        self._propagate_attrs = util.EMPTY_DICT.merge_with(
            *[e._propagate_attrs for e in elements]
        )

So every CompilerColumnElement has that mapping already set up on it for free, with no __init__ and no new dict needed each time. Other classes that need to do more with the attributes can then use union / merge to build up new dictionaries. The merge_with() method is optimized to not waste new dictionaries if not needed:

>>> util.EMPTY_DICT.merge_with({}, {}) is util.EMPTY_DICT
True

So another class that needs to be able to populate its _propagate_attrs would do something like this:

class SomeOtherSQLThing:
    def __init__(self):
        # no dictionary construction time/memory overhead!
        self._propagate_attrs = util.EMPTY_DICT

    def add_option(self, new_token):
        # only if we need it!
        self._propagate_attrs = self._propagate_attrs | {"new_token": new_token}

Here’s a demo of the above three classes:

>>> cce = CompilerColumnElement()
>>> other_sql_thing = SomeOtherSQLThing()
>>> other_sql_thing._propagate_attrs
immutabledict({})
>>> other_sql_thing.add_option("some option")
>>> other_sql_thing._propagate_attrs
immutabledict({'new_token': 'some option'})
>>> some_sql_thing = SomeSQLThing(cce, other_sql_thing)
>>> some_sql_thing._propagate_attrs
immutabledict({'new_token': 'some option'})

So above we have three different objects, all of which refer to a local “propagate_attrs” dictionary. Suppose the above code happens 10000 times. In that case, the above code uses exactly 20001 “real” dictionaries: the global util.EMPTY_DICT dictionary, the 10000 dictionaries created by the “add_option” calls, and the 10000 created by merge_with(), which builds a new dictionary each time because it received one that was populated. (merge_with() could be further improved so that if only one of its dictionaries is non-empty, it returns just that non-empty immutable dictionary, which would save another 10K dictionaries.) That is compared to the 30000 dictionaries that would be needed at the least to do the equivalent pattern with plain mutable dictionaries (noting that it would be a non-starter to have CompilerColumnElement’s class-level dict be mutable, and assuming we don’t add complex logic to have _propagate_attrs be None by default with a facade around it).
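
To sanity-check that arithmetic, here is a rough sketch that counts mapping objects. CountedMap and its reuse_single flag are hypothetical stand-ins made up for this post, not SQLAlchemy’s immutabledict and not the proposed frozendict; the loop only mimics the three-object demo above.

```python
from collections.abc import Mapping


class CountedMap(Mapping):
    """Hypothetical read-only mapping that counts how many instances are made."""

    created = 0

    def __init__(self, data=()):
        self._data = dict(data)
        CountedMap.created += 1

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __or__(self, other):
        # Union always builds a new mapping; self is left untouched.
        return CountedMap({**self._data, **other})

    def merge_with(self, *mappings, reuse_single=False):
        non_empty = [m for m in mappings if m]
        if not non_empty:
            # Nothing to add: hand back the same object.
            return self
        if (
            reuse_single
            and not self
            and len(non_empty) == 1
            and isinstance(non_empty[0], CountedMap)
        ):
            # The possible improvement described above: reuse the only
            # non-empty immutable input instead of copying it.
            return non_empty[0]
        merged = dict(self._data)
        for m in non_empty:
            merged.update(m)
        return CountedMap(merged)


def run(n, reuse_single):
    CountedMap.created = 0
    empty = CountedMap()  # plays the role of the global EMPTY_DICT
    for _ in range(n):
        element_attrs = empty  # class-level default, nothing allocated
        other_attrs = empty | {"new_token": "some option"}  # add_option()
        empty.merge_with(element_attrs, other_attrs, reuse_single=reuse_single)
    return CountedMap.created


current = run(10_000, reuse_single=False)
improved = run(10_000, reuse_single=True)
print(current, improved)
```

With the flag off, this reproduces the 20001 figure; with it on, the count drops by the extra 10K mentioned above. It counts CountedMap objects only, so the private dict inside each one is not tallied separately.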

But that example assumes we called add_option(), and for us this mutation-of-options case is actually relatively infrequent. So if add_option() was not called at all, then 10000 calls creating those three objects would still only require exactly one real dictionary, rather than the 30000 that would be needed to accomplish the above patterns with plain mutable dictionaries. Even if we say “well, you could just have _propagate_attrs be None if you didn’t need it” and wanted to deal with the extra complexity, the immutabledict pattern also works if some of the classes have fixed options at the class level like this:

class IHaveOptions:
    _propagate_attrs = util.immutabledict({"fixed_option": True})

Being able to create these immutabledicts everywhere and never having to worry that incorrect code is going to cause leakages of data between classes / instances is a gigantic win. There’s no need for code to be careful about it; it just works.
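
For anyone who hasn’t been bitten by this, here is a small sketch of the leakage in question. It uses the standard library’s types.MappingProxyType purely as a stand-in for an immutable dict (it is only a read-only view over a dict, not hashable, and not what SQLAlchemy uses); LeakyOptions and SafeOptions are made-up names.

```python
from types import MappingProxyType


class LeakyOptions:
    # Shared mutable class-level dict: a write through any instance
    # is visible to every other instance and to the class itself.
    _propagate_attrs = {"fixed_option": True}


class SafeOptions:
    # Read-only view standing in for an immutable dict.
    _propagate_attrs = MappingProxyType({"fixed_option": True})


a, b = LeakyOptions(), LeakyOptions()
a._propagate_attrs["oops"] = 1  # careless in-place mutation
leaked = "oops" in b._propagate_attrs  # the change shows up on b too

try:
    SafeOptions()._propagate_attrs["oops"] = 1
    write_blocked = False
except TypeError:
    # The read-only mapping rejects item assignment outright.
    write_blocked = True

print(leaked, write_blocked)
```

The mutable version fails silently and at a distance, while the read-only version fails immediately at the offending line.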

So anyway, if “SQLAlchemy likes the idea and already uses it heavily” is an effective endorsement, then great! Otherwise, if “SQLAlchemy likes the idea and already uses it heavily” is a glaring red flag (“oh no, those SQLAlchemy goofs use this, forget it”), I apologize! :)
