PEP 661: Sentinel Values

steven.daprano · October 1, 2021, 9:25am

Yes, you can use None, but perhaps you want to use a more appropriate
sentinel that has a better repr.

There are a handful of concepts and operations that we should be able to
rely on for every object, and if they fail, that ought to be considered
a buggy object.

Everything in the language has a value, an identity, and a type.
Stringification: str(obj), repr(obj), print(obj)
Equality and inequality tests: obj == other
Containment: obj in container

(That is, everything can be in a container. Not everything is a
container.)

And I believe that interpreting objects in a boolean context should be
considered a fundamental operation too. (Yes, that implies that I think
numpy arrays are broken. So sue me.)
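
As a quick illustration of the operations listed above, a bare object() instance already supports all of them with no extra code. This is only a sketch, and the variable names are made up for the example.

```python
# A plain object supports every operation listed above out of the box.
obj = object()

text = str(obj)              # stringification works
same = (obj == obj)          # equality falls back to identity by default
found = obj in [1, obj, 3]   # any object can be placed in a container
truth = bool(obj)            # the default inherited from object is truthy

print(text, same, found, truth)
```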

There are, in my opinion, only two consistent models here:

The strict “real booleans” model, where only True and False can be
used in boolean contexts. Like Pascal. Or at least a weak form of that,
where ints can be used, with 0 representing false and all other ints
representing true (usually with 1 or -1 as the canonical true).

Or the Python/Ruby/Javascript model, where we test for objects that
quack like a bool rather than objects that are bools. Everything is
either truthy or falsey, and the default inherited from object is truthy.

Python usually imposes an especially consistent and useful model on
that, often abbreviated as “something versus nothing”. But other
languages have their own models which I trust make sense to them too.

A hybrid model with “(nearly) everything is truthy or falsey, except for
this list of exceptions” combines the worst of both worlds. It
violates Least Surprise, because just when you get used to thinking that
everything duck-types as a bool, bang, something blows up in your face
with an exception.
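
A minimal sketch of that surprise, assuming a hypothetical Strict class whose __bool__ raises (similar in spirit to the numpy behaviour mentioned above, but not taken from any real library). Generic code that relies on truthiness works for ordinary objects and then fails on the one exception to the rule.

```python
class Strict:
    """Hypothetical type that refuses to be used in a boolean context."""

    def __bool__(self):
        raise TypeError("Strict objects have no truth value")


def count_truthy(items):
    # Generic code that assumes every object duck-types as a bool.
    return sum(1 for item in items if item)


# Ordinary objects are fine: "spam" and object() are truthy, 0 and [] are not.
print(count_truthy([0, "spam", [], object()]))

try:
    count_truthy([0, "spam", Strict()])
except TypeError as exc:
    print("unexpected failure:", exc)
```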

We can’t do anything about third-party libraries like numpy, but in my
opinion we shouldn’t follow them.

“Truthy, falsey or whoops I just got an unexpected exception” is not
even a proper three-valued logic like SQL uses.

pf_moore · October 1, 2021, 9:58am

Fair point. But that does imply that we will be allowing if obj where obj could be a sentinel, but (strongly¹) discouraging actually using that construction. I guess that’s an acceptable compromise to avoid more exceptions like numpy’s approach to bool.

¹ If we’re advising against equality comparisons with sentinels, assuming “if it’s false (or true) then it’s a sentinel” seems even less reasonable…

AlexWaygood · October 1, 2021, 10:04am

I strongly agree with this. I would still prefer for sentinel values to be falsey by default — it feels like it “makes sense”, and is consistent with the most common sentinel value, None. Being consistent with None is important, I think, as it will allow people to easily replace None in existing code if they’d prefer a sentinel value with a better repr.

But I don’t have a strong objection to sentinel values being truthy by default if that’s the consensus. I do have a strong objection to bool(Sentinel value) raising an exception, for the reasons Steven mentions.

AlexWaygood · October 1, 2021, 10:18am

One thought: since the preferred way to test for sentinel values will be to test for identity, it might be nice for sentinel values to come with two convenience methods, is_ and is_not:

def is_(self, other):
    return self is other

def is_not(self, other):
    return self is not other

This would allow for easy filtering of sentinel values using functional idioms:

NotGiven = sentinel('NotGiven')
data = ('spam', 'eggs', 'bacon', NotGiven)

for item in filter(NotGiven.is_not, data):
    ...  # do something with the data here

This would be much more concise than the alternatives, either this:

for item in filter(lambda x: x is not NotGiven, data):
    ...  # do something with the data here

Or this:

from functools import partial
from operator import is_not

for item in filter(partial(is_not, NotGiven), data):
    ...  # do something with the data here

Or this:

for item in (x for x in data if x is not NotGiven):
    ...  # do something with the data here

Or this:

for item in data:
    if item is not NotGiven:
        ...  # do something with the data here
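
For anyone who wants to try the filter idiom, here is a self-contained sketch. The Sentinel class is a hypothetical stand-in written for this example, not the implementation proposed in the PEP; only is_ and is_not come from the suggestion above.

```python
class Sentinel:
    """Hypothetical stand-in for the proposed sentinel type."""

    def __init__(self, name):
        self._name = name

    def __repr__(self):
        return self._name

    def is_(self, other):
        return self is other

    def is_not(self, other):
        return self is not other


NotGiven = Sentinel("NotGiven")
data = ("spam", "eggs", "bacon", NotGiven)

# The bound method acts as the filter predicate.
kept = list(filter(NotGiven.is_not, data))
print(kept)  # ['spam', 'eggs', 'bacon']
```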

steven.daprano · October 1, 2021, 11:49am

I wouldn’t say that we are strongly discouraging bool testing of
sentinels. It works fine for regexes (match objects versus None) and I
don’t think we need to discourage that pattern even if technically it
would be more correct to write:

mo = re.match(pattern, text)
if mo is not None:
    ...

Earlier I pointed out that I don’t usually care much about the
truthiness of my sentinels, but in the most recent case where I wrote
one, I did care and I wanted it to be truthy. But there could be cases
where I, or others, have good reason to want it to be falsey.

Just thinking out loud here… suppose you had some sort of linked list
or tree of Node objects, and there was a NullNode sentinel that offered
the same API as regular Node objects, but was falsey instead of truthy.
Then maybe you would traverse the tree with:

while node:  # not node implies node == NullNode sentinel
    traverse(node.left)
    print(node.payload)
    traverse(node.right)

or some such thing… (it’s been a while since I’ve written my own tree
traversal code). I don’t object to that.

Or maybe you want a node’s truthiness to be linked to its payload. It
surely depends on the API you are designing, and I don’t think that we
should explicitly discourage or encourage either design.
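
The NullNode idea can be sketched as runnable code. Everything here (Node, _NullNodeType, the visit callback) is hypothetical and invented for the illustration. Note that it uses a recursive function with an if test rather than the while loop in the post, since the loop as written never advances node.

```python
class _NullNodeType:
    """Hypothetical falsey sentinel with the same attributes as Node."""

    payload = None

    def __bool__(self):
        return False

    def __repr__(self):
        return "NullNode"

    @property
    def left(self):
        return self

    @property
    def right(self):
        return self


NullNode = _NullNodeType()


class Node:
    # No __bool__ or __len__, so instances are truthy by default.
    def __init__(self, payload, left=NullNode, right=NullNode):
        self.payload = payload
        self.left = left
        self.right = right


def traverse(node, visit):
    # In-order traversal; recursion stops at the falsey NullNode.
    if node:
        traverse(node.left, visit)
        visit(node.payload)
        traverse(node.right, visit)


tree = Node(2, Node(1), Node(3))
seen = []
traverse(tree, seen.append)
print(seen)  # [1, 2, 3]
```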

EpicWink · October 3, 2021, 2:20am

A pattern I use often for dynamic function defaults is

def f(x=sentinel):
    x = x or get_default_x()
    ...

Where sentinel is necessarily falsy. I would like to at least have a supported way of defining the sentinel to be falsy, and use it in a way that’s not discouraged.

steven.daprano · October 3, 2021, 6:56am

Let’s hope that nobody ever passes zero, an empty list, an empty string,
or False to your function f, expecting to actually use zero, etc.
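
To make the failure mode concrete, here is a small comparison of the two ways of applying a default. The names f_or, f_is and NotGiven are made up for this sketch, and None stands in for a falsey sentinel.

```python
NotGiven = object()  # stand-in sentinel; a plain object() is truthy


def get_default_x():
    return 10


def f_or(x=None):
    # Truthiness-based default: replaces *any* falsey argument.
    return x or get_default_x()


def f_is(x=NotGiven):
    # Identity-based default: replaces only the sentinel itself.
    return get_default_x() if x is NotGiven else x


print(f_or(0), f_is(0))  # 10 0  (the caller's zero is lost by f_or)
print(f_or(), f_is())    # 10 10 (both agree when nothing is passed)
```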

taleinat · October 3, 2021, 8:43am

Very well put! I tried to say something similar, but more succinctly, and it came out much less clear.

Could you provide a link to what you’re referring to in NumPy?

taleinat · October 3, 2021, 8:51am

From the discussion here it seems that there are valid use cases where a sentinel value is intentionally meant to be used this way. I’ll consider adding a suggestion to use if value is not NotGiven vs. if value, but I won’t make it too strong.

taleinat · October 3, 2021, 8:54am

Sorry, but I don’t intend to include that in the proposal. I appreciate the suggestion and that you made the effort to show why other methods have drawbacks that this would overcome, but I still feel that the benefit would be too small to justify adding complexity.

taleinat · October 3, 2021, 8:56am

Thanks everyone for the great discussion on truthiness of sentinel values! I’ll be adding is_truthy=True to the signature of sentinel().

pf_moore · October 3, 2021, 9:53am

It’s a bikeshedding point, but I find the term “truthy” a bit jarring. Would bool_value=True work instead?

taleinat · October 3, 2021, 10:17am

Now that you mention it, I agree.

That seems clear. I’m also thinking about considered_true=True since that’s the wording used in the docs.
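
For illustration only, a configurable truth value could look something like the sketch below. The parameter name was still being discussed at this point, so bool_value is just one of the candidates, and this factory is a hypothetical stand-in rather than the PEP's reference implementation.

```python
def sentinel(name, bool_value=True):
    """Hypothetical factory; not the PEP's reference implementation."""
    flag = bool(bool_value)  # __bool__ must return a real bool

    class _Sentinel:
        def __repr__(self):
            return name

        def __bool__(self):
            return flag

    _Sentinel.__name__ = name
    return _Sentinel()


NotGiven = sentinel("NotGiven")
Missing = sentinel("Missing", bool_value=False)

print(repr(NotGiven), bool(NotGiven))  # NotGiven True
print(repr(Missing), bool(Missing))    # Missing False
```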

steven.daprano · October 3, 2021, 2:32pm

Regarding numpy, I am referring to this:

import numpy
a = numpy.array([1, 2])
bool(a)

which raises:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

I wasn’t party to numpy’s deliberations on this behaviour, or why they
decided to do it. I’m sure that they have their reasons. I just don’t
think that we should emulate that in built-in or stdlib types.

I acknowledge that there is a good case made to make the NotImplemented
singleton an exceptional case. I’m not satisfied by that argument, but
others have been.

https://bugs.python.org/issue35712

taleinat · October 3, 2021, 3:01pm

Thanks for the clarification, @steven.daprano.

Worry not, my aim is to make sentinel objects as simple and normal as possible, and I don’t intend to make them raise exceptions in a boolean context.

uranusjr · October 3, 2021, 6:46pm

Some peripheral thoughts on this. Many people literally despise the x if x is not sentinel else default syntax. They avoid it not because they don’t understand the potential bugs the alternative introduces, but because they are willingly accepting those bugs since it just looks better. Call them irrational or whatever, they will just ignore the recommendation no matter what. I’ve always felt that instead of pushing a recommendation (which won’t work anyway), a solution should include an “attractive enough” alternative to steer them away from the worst solution.

If the sentinel value were a built-in language feature, like null in many languages, the solution would be to revive PEP 505 (None-aware operators) against the sentinel instead. But the sentinel value proposed here is only an add-on type, so alas this is not possible. Maybe we could revisit the possibility if the sentinel value catches on, or if PEP 638 (Syntactic Macros) goes anywhere, so we can have something like:

x = resolve!(x, get_default_x())
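
Without macros, a rough approximation is possible today as a plain helper function. This is only a sketch: resolve, NotGiven and get_default_x are hypothetical names, and unlike a macro the function has to take the default as a callable so that it is only evaluated when needed.

```python
NotGiven = object()  # stand-in sentinel for this sketch


def resolve(value, default_factory, sentinel=NotGiven):
    """Return value unless it is the sentinel; then call default_factory()."""
    return default_factory() if value is sentinel else value


def get_default_x():
    return 42


def f(x=NotGiven):
    x = resolve(x, get_default_x)
    return x


print(f(), f(0), f(None))  # 42 0 None
```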

steven.daprano · October 4, 2021, 5:38am

Professional programmers deliberately writing code they know is buggy
because “it looks better”?

That explains a lot.

encukou · October 4, 2021, 8:49am

Is this really worth adding more complexity for the user?
Any of the proposed defaults (true, false, or error) would work for nearly all use cases, and the option won’t matter if singletons are used with is, as they’re meant to be.

taleinat · October 4, 2021, 9:06am

“This isn’t worth the additional complexity” was also my initial reaction (see earlier in this topic). However, more than one person has written here mentioning reasonable use cases where they’d need to control whether a sentinel is true or false in a boolean context. This proposal is for a tool for defining sentinels, which I hope to be useful for all such use cases; I wouldn’t want developers needing to hack it or use another implementation just because they can’t control “truthiness”.

encukou · October 4, 2021, 9:28am

Sure, but:

Would a fixed default actually prevent them from using it, or would it just be a minor annoyance?
Is the use case known at sentinel creation time, by the person/library that creates the sentinel? (And should they be expected to think about picking the correct default?)