The SC has decided to officially defer PEP 661. We’re happy to reconsider it for Python 3.15.
In Pydantic, we’d like to introduce an UNSET sentinel (one of the most requested features), and we would really like to make use of stdlib sentinels. In particular, not having to introduce an UnsetType and being able to use UNSET both as a type expression and a value is a convenient feature.
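For context, a rough sketch of what that dual use could look like on a plain function, assuming the typing-extensions spelling of Sentinel (the function and parameter names are purely illustrative):
from __future__ import annotations

from typing_extensions import Sentinel

UNSET = Sentinel('UNSET')

def update_user(name: str | UNSET = UNSET) -> None:
    # UNSET is both the default value and part of the annotation,
    # so no separate UnsetType is needed.
    if name is not UNSET:
        ...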
As this was deferred to 3.15, I wanted to know whether the implementation would also be backported to typing_extensions, since sentinels have a special meaning in typing. There’s already a precedent with @warnings.deprecated. If so, it potentially means we won’t have to wait until 3.15 to make use of it [1].
Otherwise, we’ll have to provide a Pydantic UNSET sentinel as an experimental feature, as the UnsetType won’t be necessary with this PEP. ↩︎
Yes, I think it’d make sense to add a backport in typing-extensions. Would be good to first make sure the API isn’t going to change too much though, so we don’t run into compatibility problems later if we need to change the implementation in typing-extensions.
Thanks Barry (and other SC members). When I can find the time to follow this up I’ll schedule some meeting time to discuss further.
I have nothing useful to offer on the specifics, but I would very much like to see this or something very similar adopted.
What I will add to this discussion is that typing.Optional is discouraged, and pylance can mark it as deprecated if deprecateTypingAliases is set. Unfortunately the repository and issue that I filed no longer exist, but the response was roughly that Optional would have been deprecated by the typing community if it hadn’t been so widely used. If Optional really is to be discouraged, it would be nice to have a better way to cover its use in a sentinel-ish way.
I mention this because I, and I suspect others, have been using Optional in a specific way because we weren’t aware of sentinels.
I have been using typing.Optional in a sentinel-like way just to indicate intent to the human reader of the code. That is, even though Optional[T] is equivalent to T | None, the distinction tells me what sorts of meanings to ascribe to None.
Another place I use None where a more specific sentinel would be useful is for values that have not yet been computed. In many of those cases I could (and perhaps should) be using @cached_property, but even so, I found myself wanting sentinels before I was aware of the concept explicitly.
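For illustration only (the names here are invented), this is the kind of ad-hoc stand-in I mean for “not yet computed”, where None would be ambiguous if it were ever a valid result:
_NOT_COMPUTED = object()  # ad-hoc sentinel; a stdlib Sentinel would fill this role

class Stats:
    def __init__(self, samples: list[float]) -> None:
        self._samples = samples
        self._mean = _NOT_COMPUTED

    @property
    def mean(self) -> float:
        # Identity check against the sentinel, not a truth test or a None check.
        if self._mean is _NOT_COMPUTED:
            self._mean = sum(self._samples) / len(self._samples)
        return self._mean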
After the inclusion of sentinels in typing-extensions (with sentinel instances being unpicklable until the PEP decides on the correct implementation), I started playing around with an unset sentinel in Pydantic (PR), and tried figuring out what would be the best pickling behavior:
# In `unset.py`:
UNSET = Sentinel('UNSET')
# In `main.py`:
from unset import UNSET
class Model(BaseModel):
    f: int | UNSET = UNSET
m = Model()
Ideally, when round-tripping m through pickle, m.f should be the same unset.UNSET instance, so that special-casing of the sentinel (e.g. exclusion of the f field on serialization) can be performed using identity comparison, e.g. with the following pseudo-code:
m_dict = {}
for k, v in m.__dict__.items():
    if v is unset.UNSET:
        continue
    m_dict[k] = v
The registry approach would fulfill this use case: when unpickling UNSET, the Sentinel constructor would return the already existing unset.UNSET instance.
Alternatively, the UNSET sentinel could be defined as an instance of an UnsetType class (as msgspec does currently), but then we wouldn’t be able to use the sentinel instance itself as a type annotation.
For the pickling behaviour, there is another method already implemented which could work. If __reduce__ returns a string name, it’ll be combined with the __module__ attribute to get a fully-qualified name. Then unpickling will import and fetch that. No registry required, you’ll get an appropriate error if pickling non-global sentinels, but it will need the module name provided/inferred.
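A minimal, self-contained sketch of that mechanism (the _Flag/FLAG names are made up, not part of any proposal): a __reduce__ that returns a bare string makes pickle treat the object as a module-level global, so unpickling re-imports it and identity is preserved.
import pickle

class _Flag:
    def __reduce__(self) -> str:
        # The string is combined with self.__module__ (inherited from the
        # class here) to form the qualified name pickle looks up later.
        return "FLAG"

FLAG = _Flag()

# Unpickling imports the recorded module and fetches the "FLAG" attribute,
# so the round trip hands back the very same object.
assert pickle.loads(pickle.dumps(FLAG)) is FLAG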
I added experimental type checking support for PEP 661 to pyright 1.1.402.
Code sample in pyright playground
from typing_extensions import Sentinel
MISSING = Sentinel("MISSING")
def func1(value: int | MISSING) -> None:
    if value is MISSING:
        reveal_type(value)  # MISSING
    else:
        reveal_type(value)  # int
If you want to try this feature, you’ll need to set enableExperimentalFeatures to true in your pyright configuration.
I’ve had time to think on this since I need to pickle sentinel objects within my own projects. I’ve been experimenting on the typing-extensions implementation.
I ended up with solutions to handle missing and anonymous sentinels, but now I feel like most of these solutions were overengineered and can be ignored. They handle anonymous sentinels (and sentinels becoming anonymous due to their original definitions going missing) by preserving their data instead of crashing:
Importing sentinels manually
First, a function which fetches the object at module_name.name. It uses importlib.import_module to get the module, then operator.attrgetter to support a qualified name. This was as concise as I could get it.
import importlib
import operator
from typing import Any

def _import_sentinel(name: str, module_name: str, /) -> Any:
    """Return object `name` imported from `module_name`, otherwise return None."""
    try:
        module = importlib.import_module(module_name)
        return operator.attrgetter(name)(module)
    except ImportError:
        return None  # When module_name does not exist
    except AttributeError:
        return None  # When module_name.name does not exist
Ironically, None is not the best return value here, but a plain object() sentinel could work, or these exceptions could be handled inline. This also returns anything even if the object at module_name.name isn’t a Sentinel. I think this behavior could be beneficial because it allows for forward compatibility between third-party sentinel types.
Typical usage of sentinels involves the sentinel being defined once at import time and then reusing that single object for all cases, but any unpickling happens at run time, so there needs to be a way to fetch an existing sentinel’s identity using a function. Using a registry works for this, and the above _import_sentinel function also works, but these fulfill slightly different needs.
_import_sentinel fails if the object doesn’t yet exist, including when the sentinel instance is being initially created. This means two Sentinels with the same name and module_name can have different identities. So _import_sentinel has problems on its own.
A registry always ensures that Sentinels with the same name and module_name have the same identity, but it does not ensure that the identity of the returned object is the same as the one at module_name.name. If module_name.name was pickled and then later replaced by a third-party sentinel, the identity of objects will be split in two: the run-time identity (created by pickle) and the import-time identity.
A registry also handles cases where the sentinel definition has gone missing or is anonymous.
So my suggestion is to add _import_sentinel to Sentinel.__new__ alongside the registry, so that a sentinel of a different type at module_name.name can be registered as the identity.
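A rough sketch of what that combination could look like, building on the _import_sentinel helper above (the registry name, its keying, and the inlined module handling are my assumptions, not the actual implementation):
from typing import Any

_registry: dict[tuple[str, str], Any] = {}

class Sentinel:
    def __new__(cls, name: str, module_name: str | None = None):
        if module_name is None:
            module_name = __name__  # simplified; the real code would infer the caller's module
        key = (module_name, name)
        if key not in _registry:
            existing = _import_sentinel(name, module_name)
            if existing is not None:
                # Adopt whatever already lives at module_name.name as the
                # identity, even if it is a different sentinel type.
                _registry[key] = existing
            else:
                self = super().__new__(cls)
                self._name = name
                self._module_name = module_name
                _registry[key] = self
        return _registry[key]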
Strict keyword to verify anonymous sentinels
Sentinel.__new__ can now return an object with a type other than Sentinel. This can cause multiple obvious problems. My solution to that is a strict keyword. This defaults to True and adds a runtime check to test that the object with the returned identity matches what would normally be returned from the parameters (including a theoretical bool= or repr=) given to Sentinel.
class UNSET: ...
UNSET = Sentinel("UNSET", strict=True)  # TypeError: object <class> at module_name.UNSET is not a Sentinel type
assert Sentinel("UNSET", strict=False) is UNSET  # Okay, registers and returns UNSET class as the sentinel identity
Sentinel("UNSET", strict=True)  # TypeError because the registered object was not a Sentinel type and this is always checked when strict=True

MISSING = Sentinel("MISSING", bool=NotImplemented)
Sentinel("MISSING", bool=False, strict=True)  # TypeError: can not redefine 'bool' in existing Sentinel
Note that strict=True enforces the return of a Sentinel type but strict=False returns Any.
Pickling anonymous sentinels with parameters
With all of that established, here is my proposed reduce function so far: a custom unpickle function which takes the module/name of the sentinel as normal, but also has an options dictionary which holds any given parameters for that sentinel.
class Sentinel:
    ...

    def __reduce__(self):
        """Record where this sentinel is defined and its current parameters."""
        options = {}
        return (
            _unpickle_sentinel,
            (
                self._name,
                self._module_name,
                options,
            ),
        )

def _unpickle_sentinel(
    name: str,
    module_name: str,
    options: dict[str, Any],
    /,
) -> Any:
    """Unpickle Sentinel at 'module_name.name'."""
    return Sentinel(name, module_name, strict=False)
Alternatively, throw most of that out and use pickle’s singleton functionality for __reduce__. This was mentioned a lot, but no examples were given, and the library documentation for pickle doesn’t explain how modules are handled, so I had to look at the source code to be sure (then I also tested this). Pickle looks for a __module__ attribute on the instance to determine which module the object was defined in. It is extremely simple in practice:
class Sentinel:
    ...

    @property  # Or assign to self.__module__ directly
    def __module__(self) -> str:
        """Return the module this instance was defined for."""
        return self._module_name

    def __reduce__(self) -> str:
        """Reduce this instance to a singleton."""
        return self._name  # self.__module__ is used here
While this version doesn’t handle as many edge cases, it is much simpler and will work for the typical use cases while still being forward compatible with any future methods of pickling or defining the object. Looking at them, I prefer this reduce method because it behaves much more predictably than my overengineered alternative. A sentinel going missing has the same problems and solutions as any other pickled singleton going missing, so it’s less of an issue because the workarounds for missing singletons are well known.
As long as the registry is kept then anonymous sentinels still work, but will raise pickle.PicklingError on any attempt to pickle them.
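A self-contained toy demonstrating that combination (MiniSentinel is just a stand-in for the behaviour described above, not the real class, and the registry is keyed on the name only for brevity):
import pickle

_registry: dict[str, "MiniSentinel"] = {}

class MiniSentinel:
    def __new__(cls, name: str):
        # The registry keeps anonymous sentinels working: repeated calls
        # with the same name return the same object.
        if name not in _registry:
            inst = super().__new__(cls)
            inst._name = name
            _registry[name] = inst
        return _registry[name]

    def __reduce__(self) -> str:
        return self._name

def make_marker():
    return MiniSentinel("MARKER")  # anonymous: never assigned at module level

marker = make_marker()
assert marker is make_marker()     # identity is preserved via the registry
pickle.dumps(marker)               # raises pickle.PicklingError: MARKER is not found as a module attribute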
Another option is to drop support for anonymous sentinels which will simplify the current implementation dramatically:
import sys

class Sentinel:
    def __init__(self, name: str, module_name: str | None = None) -> None:
        self._name = name
        if module_name is None:
            module_name = sys._getframemodulename(1)
        if module_name is None:
            module_name = __name__
        self.__module__ = module_name

    def __repr__(self) -> str:
        return self._name

    def __reduce__(self) -> str:
        return self._name
But I don’t think the implementation is the issue with anonymous sentinels, it’s the syntax. It’s easy to not qualify the name of an anonymous sentinel in a scope, and that could lead to name clashes or even clashes with existing top-level names within the module. Run-time checks can only catch some of these mistakes, and at a cost.
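As a hypothetical illustration of the name-clash risk (the function and sentinel names here are invented):
from typing_extensions import Sentinel

def parse_header(chunk: bytes):
    # Unqualified, scope-local marker name.
    END = Sentinel("END")
    ...

def parse_body(chunk: bytes):
    # Same unqualified name in the same module. Under a module-keyed
    # registry both helpers (and any module-level END) would silently
    # share one identity; without a registry they are distinct objects,
    # and only run-time checks could flag the mistake.
    END = Sentinel("END")
    ...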
Here’s my implementation of customizable truthiness. Like before, this really just reveals to me that customizable truthiness as a feature might be overengineered and too complex:
Sentinel generic boolean return types
The Sentinel class can be made generic over a TypeVar holding the return type of __bool__, with the type being one of Literal[True], Literal[False], or Never. Overloads determine which return type is assigned.
One thing I’m worried about is if the use of Generic here would interfere with type-hinting internally. Otherwise this supports type-hinted conversions to boolean literals without special casing.
from __future__ import annotations

from types import NotImplementedType
from typing import Any, Generic, Literal, Never, TypeVar, overload

_BoolReturnType = TypeVar("_BoolReturnType")

class Sentinel(Generic[_BoolReturnType]):
    @overload
    def __new__(cls, *, bool: NotImplementedType) -> Sentinel[Never]: ...
    @overload
    def __new__(cls, *, bool: Literal[True]) -> Sentinel[Literal[True]]: ...
    @overload
    def __new__(cls, *, bool: Literal[False]) -> Sentinel[Literal[False]]: ...
    def __new__(cls, *, bool: bool | NotImplementedType) -> Sentinel[Any]:
        self = super().__new__(cls)
        self._bool = bool
        return self

    def __bool__(self: Sentinel[_BoolReturnType]) -> _BoolReturnType:
        if self._bool is NotImplemented:
            raise TypeError
        return self._bool
I’ve used a bool keyword-only parameter over bool_value, truthiness, is_true, and others. There’s little ambiguity when the options are bool=True, bool=False, and bool=NotImplemented passed to a Sentinel constructor. It’s also less to type out.
I use bool=NotImplemented as the value which raises an error since it’s more explicit than bool=None which might imply that a default value is being used or that None’s truthiness is used. bool=NotImplemented sets the __bool__ return type to Never and raises an error when truthiness is tested.
The main issue with customizable truthiness is that there are no good examples of using it. Any example I’ve seen or tried to make myself was always too forced.
Sentinel types are not in a vacuum, they are mixed with other types and that will determine what a truth test is actually checking for:
- Sentinel mixed with native sentinels: None, False, True, Ellipsis, NotImplemented
- Sentinel mixed with number types: int, float
- Sentinel mixed with various containers: str, list, tuple
- Sentinel mixed with custom classes with default truth behavior (this case is sometimes abused by truth-testing None)
- Sentinel mixed with objects with unknown behavior
Many examples of sentinels have falsy connotations but I can’t imagine a truth test being valid for the examples above unless it is being passed to third-party code that is already abusing truthiness (and thus would break with simple cases of passing number or container types as some have already stated). Most of the time one would use a predicate function which will handle expected sentinels in situations where truthiness is theoretically needed.
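For example, this is the kind of small, explicit predicate I mean (the names are illustrative), rather than a bare truth test:
from __future__ import annotations

from typing_extensions import Sentinel

UNSET = Sentinel("UNSET")

def is_provided(value: object) -> bool:
    """Explicit check against the expected sentinel, not bool(value)."""
    return value is not UNSET

def configure(timeout: float | UNSET = UNSET) -> None:
    if is_provided(timeout):
        ...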
At this point I’m convinced that the optimal solution is to treat conversions to bool as an accident to be avoided, enforced with the following dunder method:
def __bool__(self) -> Never:
    raise TypeError
I feel a bit conflicted - I understand the argument to have sentinels be not-convertible-to-bool, but at the same time, I don’t really like the solution of forcing this to be the case.
Every value in Python is an instance of object, and one would usually expect to be able to convert any object to a bool without suddenly getting a TypeError.
If sentinels are not bool-convertible, should the following give a linter / typing error?
def foo(x: object):
    if x:
        bar()
Intuitively, I’d really expect that any value can appear after if or while.
I had a read of your “Sentinel generic boolean return types” dropdown, and I actually like doing it this way. Let whoever defines the Sentinel define its bool-conversion behavior, and include the ability to make it raise TypeError - where explicitly requested - since after all, putting in a user-defined __bool__ that raises TypeError is entirely valid too, if rare.
I have had use cases for sentinels that should be not-convertible-to-bool, so would definitely use this. I think I’ve had use cases for the other two possibilities as well.
Still, I’d support anything that gets the concept of sentinels into Python - preferably with user-definable bool-conversion behavior, but if it has to be fixed as always True, always False, or always TypeError, it’d still be absolutely worth it.
This is false in numerical libraries. NumPy/TensorFlow/etc. commonly do give an error if you try this, as the meaning of bool(array) for a non-scalar array is ambiguous.
Ignoring that issue, I also often find that kind of code unclear. Sometimes an empty string is what we want to be false; sometimes it’s not. If I have x: str | None, then when I’m checking it, do I care about null vs. emptiness? The same applies for an empty list vs. no list. If the type is specifically x: str, then yes, if x is clear in intent.
I don’t have a strong view on the Sentinel choice. Mostly just that TypeError can be a fine option for some types and does appear for some widely used objects.
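A tiny example of that ambiguity (an invented function, not from the thread):
def describe(name: str | None) -> str:
    # `if name:` conflates None ("no value") with "" ("empty value");
    # whether that is the intent depends entirely on the caller's contract.
    if name:
        return f"named {name}"
    return "unnamed"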
Maybe yes? Really look at this code, what is it doing? Why is it doing this? This example is suspect even without theoretical sentinels involved.
This function can already give errors at runtime:
foo(NotImplemented) # DeprecationWarning
foo(numpy.zeros(10)) # ValueError
It’s also a forced example which only exists to demonstrate truth tests. It’s unrealistic to check the truth value of a plain object. A more realistic example might include a union with None, and it’s well known that using truthiness to separate objects from None is a bad practice, so the end result inevitably becomes this:
def foo(x: object | None):
    if x is not None:
        bar()

foo(NotImplemented)   # safe
foo(numpy.zeros(10))  # safe
This ends up being repeated for every example I’ve seen or tried to come up with. Identity comparison is always the correct choice unless the sentinel is being forced into code which already abuses truthiness. This is my motivation for suggesting raising TypeError by default.
I still think the __bool__ method for sentinels should raise an error. This is the only option that would allow changing __bool__ in a backward-compatible manner in the future if real-world usage demands it after it has seen widespread adoption.
@taleinat it seems this PEP is stalled again. I’d like to offer to take over pushing the PEP to completion to make sure we get it in 3.15. My plan would be to implement the recommendations from the SC in PEP 661: Sentinel Values - #234 by barry (e.g., removing the per-module registry, making it a builtin) and then resubmitting to the SC. Does that sound good?
Do we want to do anything specific about pickling support? Here is a vanilla Python example:
UNSET = Sentinel('UNSET')
@dataclass
class A:
f: int | UNSET = UNSET
a = A()
assert pickle.loads(pickle.dumps(a)).f is UNSET
My opinion is that we can probably live without it; sentinels have other uses that don’t involve using them as instance attributes, but I just wanted to confirm this point.
The least surprising behavior would be for sentinels to use pickle’s singleton support. This would support your example with minimal implementation.
Sentinels can also end up as dictionary keys or in collections, not just as names or attributes:
SPECIAL = Sentinel("SPECIAL")
example_dict = {SPECIAL: ...} # Unique key which won't collide with any existing key
SEPARATOR = Sentinel("SEPARATOR")
example_list = [1, 2, SEPARATOR, 3, 4] # Unique value which won't collide with any existing value
Pickle’s singleton support handles all cases in a predictable manner.
I haven’t been tracking the PEP closely, but checking back in and seeing recent comments, I feel that the case for allowing __bool__ to be set is getting forgotten.
I have a number of libraries which already define custom sentinels with a defined __bool__. Whether or not this is advisable is sort of beside the point. It’s part of the public interface for these values.
If sentinels don’t support setting bool to false, I don’t think it’s possible to swap out the implementations without doing major releases. The common case I have is wanting a falsy nullary value other than None because None, already being part of the language, is or may be assigned some other meaning.
I find code like the following pretty commonplace:
def foo(x: MyClass | None | NULL):
    if x:
        return x.y
    return x
Although the string representations can be treated as implementation details, bool is part of the expected interface for users to be using.
It’s not unreasonable, IMO, to say that the cost I’m citing here is small, and will only be felt by a small population of library authors. But when I see folks say that they can’t come up with use cases for this, I figure that just means we work in different domains.
Assuming you meant returning a string from __reduce__(), it has to be a bit more involved than that. This only works if the sentinel instance is defined in the same module where the class is. See this example for instance.
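Since the linked example isn’t reproduced here, this is a rough two-file sketch of the kind of failure being described (file names are my own invention):
# sentinels.py
class Sentinel:
    def __init__(self, name):
        self._name = name

    def __reduce__(self):
        return self._name

# unset.py
from sentinels import Sentinel

UNSET = Sentinel('UNSET')

# pickle.dumps(UNSET) fails: the instance's __module__ comes from the class,
# i.e. "sentinels", so pickle looks for sentinels.UNSET and raises
# PicklingError instead of finding unset.UNSET.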
Imo it is reasonable to add support for pickling sentinels if the following conditions are respected:
- The sentinel name is the same as the assigned variable (e.g. UNSET = Sentinel('UNSET')).
- The sentinel is defined at the module level.
- The Python implementation provides a way to get the previous frame of the call stack (e.g. through sys._getframe()).
Here is a sketch implementation without proper exception handling (can also be expanded, e.g. have the ability to provide a custom module name):
import importlib

def _unpickle_sentinel(module_name, name):
    mod = importlib.import_module(module_name)
    return getattr(mod, name)

class Sentinel:
    def __init__(self, name):
        self._name = name
        self._module_name = caller(default=None)
        ...

    def __reduce__(self):
        return (_unpickle_sentinel, (self._module_name, self._name))
This is because you didn’t assign to __module__. The official documentation is poor when it comes to this feature, but I eventually figured it out. This is the singleton version of your example:
class Sentinel:
    def __init__(self, name):
        self._name = name
        self.__module__ = caller(default=None)

    def __reduce__(self):
        return self._name
Requiring the sentinel to be defined at the module level is a reasonable trade-off. Sentinels can still be defined anywhere, but they must be a singleton to be pickled. This also supports ClassVar sentinels, which would raise on your example’s usage of getattr.
It’d follow the same rules for class singletons. If you can’t pickle a class from there, then you can’t pickle a sentinel from there either.
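The parallel with class singletons can be seen directly (a toy example, not from the thread):
import pickle

def make_class():
    class Local:  # not importable as <module>.Local
        pass
    return Local

try:
    pickle.dumps(make_class())
except pickle.PicklingError as exc:
    # A class defined in a local scope can't be pickled; a sentinel defined
    # in the same place would fail the same way under singleton pickling.
    print(exc)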