PEP 661: Sentinel Values

I’ve had time to think about this, since I need to pickle sentinel objects in my own projects, and I’ve been experimenting with the typing-extensions implementation.

I ended up with solutions for missing and anonymous sentinels, but I now feel that most of them were overengineered and can be ignored. They handle anonymous sentinels (and sentinels that become anonymous because their original definitions went missing) by preserving their data instead of crashing:

Importing sentinels manually

First, a function which fetches the object at module_name.name. It uses importlib.import_module to get the module, then operator.attrgetter to support a qualified name. This was as concise as I could get it.

import importlib
import operator
from typing import Any


def _import_sentinel(name: str, module_name: str, /) -> Any:
    """Return object `name` imported from `module_name`, otherwise return None."""
    try:
        module = importlib.import_module(module_name)
        return operator.attrgetter(name)(module)
    except ImportError:
        return None  # When module_name does not exist
    except AttributeError:
        return None  # When module_name.name does not exist

Ironically, None is not the best return value here; a plain object() sentinel would work, or these exceptions could be handled inline. This also returns whatever it finds, even if the object at module_name.name isn’t a Sentinel. I think this behavior could be beneficial because it allows for forward compatibility between third-party sentinel types.

Typical usage involves defining the sentinel once at import time and then reusing that single object everywhere, but unpickling happens at run time, so there needs to be a way to fetch an existing sentinel’s identity using a function. A registry works for this, and so does the _import_sentinel function above, but they fulfill slightly different needs.

_import_sentinel fails if the object doesn’t exist yet, including while the sentinel instance is first being created. This means two Sentinels with the same name and module_name can have different identities, so _import_sentinel has problems on its own.

A registry always ensures that Sentinels with the same name and module_name have the same identity, but it does not ensure that the returned object is the same one found at module_name.name. If module_name.name was pickled and later replaced by a third-party sentinel, then the identity will be split in two: the run-time identity (created by pickle) and the import-time identity.

A registry also handles cases where the sentinel definition has gone missing or is anonymous.

So my suggestion is to add _import_sentinel to Sentinel.__new__ alongside the registry, so that a sentinel of a different type at module_name.name can be registered as the identity.
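Roughly what I mean by combining the two, as a sketch only. The _registry dict, the _NOT_FOUND marker, the explicit module_name argument and the demo_third_party module are all assumptions made for illustration:

```python
import importlib
import operator
import sys
import types
from typing import Any

_NOT_FOUND = object()  # Hypothetical marker for "nothing at module_name.name"
_registry: dict[tuple[str, str], Any] = {}  # Hypothetical: (module_name, name) -> identity


def _import_sentinel(name: str, module_name: str, /) -> Any:
    """Return the object at `module_name.name`, or `_NOT_FOUND`."""
    try:
        return operator.attrgetter(name)(importlib.import_module(module_name))
    except (ImportError, AttributeError):
        return _NOT_FOUND


class Sentinel:
    def __new__(cls, name: str, module_name: str) -> Any:
        key = (module_name, name)
        if key in _registry:
            return _registry[key]  # Same name and module always share one identity
        obj = _import_sentinel(name, module_name)
        if obj is _NOT_FOUND:
            # Missing or anonymous definition: create a fresh sentinel instead
            obj = super().__new__(cls)
            obj._name = name
            obj._module_name = module_name
        return _registry.setdefault(key, obj)

    def __repr__(self) -> str:
        return self._name


# Stand-in for a module that already holds a third-party sentinel
demo = types.ModuleType("demo_third_party")
demo.UNSET = object()
sys.modules[demo.__name__] = demo

adopted = Sentinel("UNSET", "demo_third_party")  # Registers the existing object
created = Sentinel("MISSING", "demo_third_party")  # Nothing to import, so a new Sentinel
```

Here adopted is the third-party object itself, which is exactly the case where __new__ returns something that is not a Sentinel.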

Strict keyword to verify anonymous sentinels

Sentinel.__new__ can now return an object with a type other than Sentinel. This can cause multiple obvious problems. My solution is a strict keyword. It defaults to True and adds a run-time check that the object with the returned identity matches what would normally be created from the parameters given to Sentinel (including a theoretical bool= or repr=).

class UNSET(): ...

UNSET = Sentinel("UNSET", strict=True)  # TypeError: object <class> at module_name.UNSET is not a Sentinel type
assert Sentinel("UNSET", strict=False) is UNSET  # Okay, registers and returns UNSET class as the sentinel identity
Sentinel("UNSET", strict=True)  # TypeError because the registered object was not a Sentinel type and this is always checked when strict=True

MISSING = Sentinel("MISSING", bool=NotImplemented)
Sentinel("MISSING", bool=False, strict=True)  # TypeError: can not redefine 'bool' in existing Sentinel

Note that strict=True enforces the return of a Sentinel type but strict=False returns Any.

Pickling anonymous sentinels with parameters

With all of that established, here is my proposed reduce function so far. It uses a custom unpickle function which takes the module and name of the sentinel as normal, but also an options dictionary which holds any parameters given for that sentinel.

class Sentinel:
    ...

    def __reduce__(self):
        """Record where this sentinel is defined and its current parameters."""
        options = {}
        return (
            _unpickle_sentinel,
            (
                self._name,
                self._module_name,
                options,
            ),
        )


def _unpickle_sentinel(
    name: str,
    module_name: str,
    options: dict[str, Any],
    /,
) -> Any:
    """Unpickle Sentinel at 'module_name.name'."""
    return Sentinel(name, module_name, strict=False)

Alternatively, throw most of that out and use pickle’s singleton functionality for __reduce__. This was mentioned a lot, but no examples were given, and the library documentation for pickle doesn’t explain how modules are handled, so I had to look at the source code to be sure (and then I tested it). Pickle reads the __module__ attribute of the instance to determine which module the object was defined in. It is extremely simple in practice:

class Sentinel:
    ...

    @property  # Or assign to self.__module__ directly
    def __module__(self) -> str:
        """Return the module this instance was defined for."""
        return self._module_name

    def __reduce__(self) -> str:
        """Reduce this instance to a singleton."""
        return self._name  # self.__module__ is used here

While this version doesn’t handle as many edge cases, it is much simpler and will work for the typical use cases while still being forward compatible with any future methods of pickling or defining the object. Comparing the two, I prefer this reduce method because it behaves much more predictably than my overengineered alternative. A sentinel going missing has the same problems and solutions as any other pickled singleton going missing, so it’s less of an issue: the workarounds for missing singletons are well known.

As long as the registry is kept, anonymous sentinels still work, but any attempt to pickle them will raise pickle.PicklingError.
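Here is a quick round trip showing both behaviours. It is a sketch with an explicit module_name argument, and the demo_sentinels module is a hypothetical stand-in created on the fly so the example is self-contained:

```python
import pickle
import sys
import types


class Sentinel:
    def __init__(self, name: str, module_name: str) -> None:
        self._name = name
        self.__module__ = module_name  # pickle reads this to find the defining module

    def __repr__(self) -> str:
        return self._name

    def __reduce__(self) -> str:
        return self._name  # A string tells pickle to store a reference to a global


# Stand-in for a real module that defines a sentinel at import time
demo = types.ModuleType("demo_sentinels")
sys.modules[demo.__name__] = demo
demo.MISSING = Sentinel("MISSING", "demo_sentinels")

# The round trip returns the very same object, not a copy
restored = pickle.loads(pickle.dumps(demo.MISSING))
assert restored is demo.MISSING

# An anonymous sentinel was never assigned to demo_sentinels.ANONYMOUS
anonymous = Sentinel("ANONYMOUS", "demo_sentinels")
try:
    pickle.dumps(anonymous)
except pickle.PicklingError:
    anonymous_pickled = False
else:
    anonymous_pickled = True
```

Pickling verifies that demo_sentinels.MISSING resolves back to the same object, which is why the anonymous case fails at dump time instead of at load time.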


Another option is to drop support for anonymous sentinels, which would simplify the current implementation dramatically:

import sys


class Sentinel:
    def __init__(self, name: str, module_name: str | None = None) -> None:
        self._name = name

        if module_name is None:
            # Private CPython helper (Python 3.12+): name of the caller's module
            module_name = sys._getframemodulename(1)
            if module_name is None:
                module_name = __name__

        self.__module__ = module_name

    def __repr__(self) -> str:
        return self._name

    def __reduce__(self) -> str:
        return self._name

But I don’t think the implementation is the issue with anonymous sentinels; it’s the syntax. It’s easy to forget to qualify the name of an anonymous sentinel in a local scope, and that could lead to name clashes, even with existing top-level names within the module. Run-time checks have a cost and would catch only some of these mistakes.
