PEP 661: Sentinel Values

srittau · June 11, 2021, 10:49am

I like Paul’s suggestion (opt: int = Sentinel) for the same reason I liked opt: int = None: conciseness without sacrificing readability. That said, as Petr points out, opt: int = None was removed from PEP 484 and I don’t think we should be inconsistent between these two cases.

BlckKnght · June 11, 2021, 8:21pm

Perhaps Optional should be extended (in the type checker’s logic, not sure any runtime changes are necessary) to allow sentinel types other than None?

It seems like the use case for sentinels is mostly the same as for Optional; you just need a unique sentinel if None might be a valid part of the user data. So you don’t need a special sentinel for opt: int = Sentinel (since you can safely use opt: Optional[int] = None there), but you do for opt: T = Sentinel, because T might be some collection of types that includes None (e.g. Optional[int]).

Maybe the specific sentinel type could even be inferred from the default argument? I’m not sure about that.
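
A minimal sketch of the case described above, where None is a legitimate value and so cannot double as the "nothing was passed" marker. The helper name get_setting and the module-private _MISSING object are assumptions made up for illustration; they are not part of the PEP.

```python
from typing import Any, Dict

# Hypothetical module-private sentinel; a bare object() is the classic idiom.
_MISSING = object()


def get_setting(settings: Dict[str, Any], key: str, default: Any = _MISSING) -> Any:
    """Return settings[key]; fall back to default if one was passed, else raise."""
    if key in settings:
        return settings[key]
    if default is _MISSING:  # identity test: the caller passed no default at all
        raise KeyError(key)
    return default  # may legitimately be None


settings = {"timeout": None}
print(get_setting(settings, "timeout"))        # the stored value, which is None
print(get_setting(settings, "retries", None))  # an explicit default of None
```

With Optional[...] = None as the default, the second call could not be told apart from a call that passed no default.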

ors · August 31, 2021, 9:04pm

I think __bool__ is also important. Sentinels often represent “empty” values that are convenient to handle as falsy, e.g.

def f(value: List | None | Unknown): return value or []

Something like Unknown = sentinel("Unknown", bool=False) would be very useful and convenient.

Use case example

NoAnswer = sentinel("NoAnswer", False)

@dataclass
class ClientSurveyResponse:
    interesting_items: List[Item] | None | NoAnswer = NoAnswer
    # None = no particular preference, [] = interested in nothing
    ...
    
    def items_proposable_to_client(self):
        return [item for item in self.interesting_items or [] if item in inventory]

The alternative is type(Unknown).__bool__ = lambda self: False, but it appears to me that “empty” sentinels are extremely common – the overwhelming majority even, based on this list. Perhaps it should be the default.
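
For comparison, a falsy sentinel can already be written by hand today. This is only a sketch: _NoAnswerType, proposable and the inventory argument are illustrative names, and nothing here is the API proposed in the PEP.

```python
class _NoAnswerType:
    """Hypothetical hand-rolled sentinel type that is falsy by design."""

    __slots__ = ()

    def __repr__(self) -> str:
        return "NoAnswer"

    def __bool__(self) -> bool:
        return False


NoAnswer = _NoAnswerType()


def proposable(interesting_items, inventory):
    # `or []` collapses None, NoAnswer and [] into "nothing to propose".
    return [item for item in interesting_items or [] if item in inventory]


print(proposable(NoAnswer, {"apple"}))           # []
print(proposable(["apple", "pear"], {"apple"}))  # ['apple']
```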

P.S. How about the auto-naming syntax from PyPI’s sentinel package? Something like

MySentinel = sentinel.create()
print(MySentinel)  # prints `MySentinel`

taleinat · September 29, 2021, 8:18pm

After a great delay, I’ve decided to go with using Literal[NotGiven] for type annotations of sentinel values.

Thanks to everyone who took part in the discussion of type annotations for sentinels, here and on the typing-sig mailing list!

taleinat · September 29, 2021, 8:20pm

That’s a great point, thanks for bringing it up!

I’m not sure complicating the interface is necessary here. How about sentinel values always being “falsey”?

(Though, note that bool(Ellipsis) returns True!)

taleinat · September 29, 2021, 8:27pm

A very late followup to this: Note that early on I did consider using class objects for sentinel values, since those too have “very ‘singleton’ semantics in python”. However, I consider sentinel values being simple objects to be much more valuable, adhering to the design principle of least surprise, and reducing the chance for unexpected behavior.

AlexWaygood · September 29, 2021, 9:43pm

Sentinel values always being falsey makes a great deal of sense to me: +1. bool(Ellipsis) returning True feels to me like yet another reason why none of the existing solutions is really adequate, rather than a case against this.

pbryan · September 29, 2021, 10:07pm

It looks like a new sentinels module is being proposed. Any reason not to include it in typing instead?

sirosen · September 30, 2021, 1:03am

I think that the default __bool__ of False is important and worth adding to the PEP. I came to this thread just to ask about it myself!

I think one of the more common cases for sentinels is trying to distinguish missing data from explicit null data. That’s the case with, for example, marshmallow.missing.

steven.daprano · September 30, 2021, 2:55am

Not to me. I’ve written dozens of sentinel values over the years, and
I’ve hardly ever specified that they be falsey.

For the most recent sentinel that I just wrote, I didn’t even think
about whether it should be truthy or falsey. But having thought about it
now, I definitely want it to be truthy. Fortunately the default
implementation of __bool__ behaves sanely.

In the stdlib, obviously None is falsey, but NotImplemented used to be
truthy, and now raises in a boolean context (or at least, soon will
raise in a boolean context – I forget when the warning will become an
error); Ellipsis is truthy; object() is truthy.

str.find() returns -1 as a sentinel, which is truthy; the re module
functions return None as a sentinel.
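
The truthiness of the sentinels listed above can be checked directly. NotImplemented is left out of this sketch because its behaviour in a boolean context depends on the Python version.

```python
import re

# Truthiness of the built-in sentinels mentioned above.
print(bool(None))      # False
print(bool(Ellipsis))  # True
print(bool(object()))  # True

# str.find() signals "not found" with -1, which is truthy,
# so a plain bool test cannot detect a miss.
idx = "spam".find("z")
print(idx, bool(idx))  # -1 True

# re.match() signals "no match" with None, which is falsy.
print(re.match(r"\d+", "abc"))  # None
```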

taleinat · September 30, 2021, 5:26am

Could you explain why you want it to be truthy? Perhaps an example would help.

steven.daprano · September 30, 2021, 5:57am

Instances have everything that classes have (by inheritance from the
class), plus their own per-instance state. At worst, any state or
behaviour held by the class that isn’t accessible from the instance is
just a call to type(instance) away. So instances can not be more
simple than their class.

Ultimately, the amount of complexity of a class alone cannot be higher
than that of the same class plus an instance.

As far as memory footprint goes, classes are bigger than instances, but
only if you forget that instances delegate most of their “stuff” to the
class. It’s still there, it still exists, it’s just bundled in the class
rather than the instance. If you have many instances, that’s a win, but
if you have a singleton, it costs more space to have a class+singleton
than just a class.

The only advantage I can see of having a class + instance rather than
just a class is that it avoids the interminable arguments "But why isn’t
there an instance?" from people who aren’t familiar or comfortable with
classes as first-class citizens (pun not intended).

To be concrete, if we look at the dir() of a class object with no
special metaclass:

class C:
    pass

print(dir(C))

which specific things do you think shouldn’t be available on a sentinel
and would need to be removed if we were to use the class rather than an
instance?

taleinat · September 30, 2021, 6:04am

For one thing, I think sentinel values shouldn’t be callable, which class objects inherently are (despite "__call__" not being returned by dir()):

>>> class C:
...     pass
>>> C()
<__main__.C object at 0x7f7fcbe2ae80>
>>> callable(C)
True

Even if one were to clobber its __init__ or __new__ to make instantiation fail, the machinery would still be there, and callable(C) would still return True.
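
A small sketch of that point, assuming a hypothetical class named Missing whose __new__ is overridden to refuse instantiation. The call fails, yet callable() still reports True, whereas a plain instance is not callable at all.

```python
class Missing:
    """Hypothetical class-object sentinel whose instantiation is blocked."""

    def __new__(cls, *args, **kwargs):
        raise TypeError("Missing is a sentinel and cannot be instantiated")


print(callable(Missing))  # True: the class object is still callable

try:
    Missing()
except TypeError as exc:
    print("call failed:", exc)

# A plain instance used as a sentinel has no call machinery.
instance_sentinel = object()
print(callable(instance_sentinel))  # False
```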

EpicWink · October 1, 2021, 12:30am

I think the implementation should allow truthiness to be overridden, to keep the truthiness of the existing sentinels in the standard library. However, I think the default should be falsy.

A sentinel to me means a value was not specified, even via a default. This to me reads like an empty collection, or None (perhaps similar to JavaScript’s undefined).

When a user is unknowingly using the sentinel in an if-statement (a good example is in a template, e.g. Jinja), I would expect the if-statement body to be skipped, as no value was provided.

steven.daprano · October 1, 2021, 5:29am

The way I test for the sentinel is with an identity test:

if obj is MySentinel: ...

so I don’t often care about its trueness. In the most recent example
where I do care, the sentinel is part of the public API, not just an
internal detail. Making the sentinel falsey would seemingly permit this
anti-pattern:

obj = func(*args)  # Call my API.
if obj:
    process(obj)
else:
    ...  # Handle the sentinel.

but that would be wrong, because the sentinel is not the only falsey
object my API might return. The correct test is the identity test, not a
bool test.

Of course if the sentinel was falsey, I could document this trap and say
“Then just don’t do that!”. But why would I want to make the sentinel
falsey unless I wanted to use the bool test instead of the identity
test?

In the case of regular expressions, it’s okay to write:

mo = re.match(...)
if mo:
    ...

because the match function is guaranteed to return either a match object
which is always truthy or None which is falsey. So if your API is like
re.match, maybe having your sentinel be falsey is fine. But in my API,
it’s not like that, and allowing the sentinel to be falsey would
encourage unsafe short cuts and give no benefit at all.
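
To make the trap concrete, here is a sketch with a deliberately falsy sentinel. The names _NotFoundType, NotFound, lookup and the sample table are all invented for illustration.

```python
class _NotFoundType:
    """Hypothetical falsy sentinel, used only to illustrate the trap."""

    def __bool__(self) -> bool:
        return False

    def __repr__(self) -> str:
        return "NotFound"


NotFound = _NotFoundType()
table = {"count": 0, "name": ""}


def lookup(key):
    # Hypothetical API: returns the stored value, or NotFound.
    return table.get(key, NotFound)


def describe_with_bool_test(key):
    obj = lookup(key)
    return "value" if obj else "sentinel"


def describe_with_identity_test(key):
    obj = lookup(key)
    return "sentinel" if obj is NotFound else "value"


print(describe_with_bool_test("count"))        # sentinel (wrong: 0 is a real value)
print(describe_with_identity_test("count"))    # value
print(describe_with_identity_test("missing"))  # sentinel
```

The bool test misclassifies every falsy stored value; only the identity test separates the sentinel from real data.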

steven.daprano · October 1, 2021, 5:36am

Personally, I think that we should default to the safe alternative,
which is to use the same default as object(): everything is truthy
unless made falsey.

We can look at None as an example of this. If you want to use None as a
sentinel and treat it specially, it is always correct to use:

if obj is None: ...

The lazy shortcut:

if not obj: ...

can silently do the wrong thing unless you know you are working with a
restricted API that can only ever return truthy objects apart from None.

So by default, we should use the safe choice that sentinels are not
falsey, and let people opt-in rather than the unsafe choice and expect
them to opt-out.
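
One way the opt-in could look, as a rough sketch only. make_sentinel and its truthy parameter are hypothetical and are not the interface proposed in the PEP.

```python
def make_sentinel(name: str, truthy: bool = True):
    """Hypothetical factory: sentinels are truthy unless the caller opts out."""

    class _Sentinel:
        __slots__ = ()

        def __repr__(self) -> str:
            return name

        def __bool__(self) -> bool:
            return truthy

    _Sentinel.__name__ = f"_{name}Type"
    return _Sentinel()


MISSING = make_sentinel("MISSING")            # default: truthy, like object()
EMPTY = make_sentinel("EMPTY", truthy=False)  # explicit opt-in to falsiness

print(bool(MISSING), bool(EMPTY))  # True False
print(repr(MISSING), repr(EMPTY))  # MISSING EMPTY
```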

steven.daprano · October 1, 2021, 6:43am

I agree, sentinels shouldn’t generally be callable. Which would require
overriding __new__ on the class, or __call__ on the metaclass, if
you wanted to use a class object as the sentinel.

pf_moore · October 1, 2021, 7:58am

If all your valid non-sentinel values are true, you can use None as a sentinel. If all your valid non-sentinel values are false, you can use 42 as your sentinel (or True, if you’re boring). In any case where you genuinely do need an explicit distinguished sentinel, truth testing is guaranteed to be the wrong thing to do. So why not make conversion to bool an error?

Paul
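
A sketch of what such a sentinel might look like if written by hand. _UnsetType and UNSET are illustrative names, and the error message is made up.

```python
class _UnsetType:
    """Hypothetical sentinel type that refuses truth testing."""

    def __repr__(self) -> str:
        return "UNSET"

    def __bool__(self) -> bool:
        raise TypeError("UNSET has no truth value; compare with `is UNSET`")


UNSET = _UnsetType()

value = UNSET
print(value is UNSET)  # True: identity tests keep working

try:
    if value:  # any truth test goes through __bool__ and raises
        pass
except TypeError as exc:
    print("bool test rejected:", exc)
```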

taleinat · October 1, 2021, 8:28am

I must say that seems like a reasonable approach.

taleinat · October 1, 2021, 8:31am

I think that could be very surprising, since currently conversion to bool works for practically all objects, including many where it doesn’t necessarily make sense (object(), classes, …). ISTM that for rather “normal” objects, which is how sentinels would be perceived, having conversion to bool raise an exception would be surprising.