PEP 661: Sentinel Values

Interesting, this makes the implementation easier. Is this a documented feature though? The documentation only mentions:

the pickle module searches the module namespace to determine the object’s module.

but it is unclear what module namespace refers to.

I have considered this. If writing x is not y twice is too much then one could check a container instead:

def foo(x: MyClass | None | NULL):
    if x not in (None, NULL):
        return x.y
    return x

Or do truth testing after ruling out the sentinel, in case a falsy MyClass instance was meant to be handled that way:

def foo(x: MyClass | None | NULL):
    if x is not NULL and x:
        return x.y
    return x

Or, because attributes of MyClass are being accessed, check for that type directly instead of excluding everything else:

def foo(x: MyClass | None | NULL):
    if isinstance(x, MyClass):
        return x.y
    return x

Pattern matching also works:

def foo(x: MyClass | None | NULL):
    match x:
        case MyClass(y=y):
            return y
        case _:
            return x

As much as I want TypeError to be the default I’m not opposed to the truth value being set explicitly:

NULL = Sentinel("NULL", bool=False)

In theory, being able to set truthiness is required to interact with external libraries in obscure situations. But the example parameter type of MyClass | None | NULL leaves no room for downstream code to add a new sentinel, and any code replacing a sentinel could fix its abuse of truthiness at the same time, perhaps with the assistance of a type checker.

Custom truthiness might put some burden on type checkers to figure out which sentinels are narrowed by truth checks. I’m unsure how much that matters.

No, I’m talking about sentinels being provided by libraries as part of their documented interfaces. That’s why I said that changing this would require those libraries to make breaking changes, which requires that they go through whatever their process for that is, just to change their implementations.

To put a finer point on this, for some use cases this means never changing, since the benefit of nicer type annotation behavior will never offset breaking code which has worked for a decade.

Also, philosophically, calling it an “abuse of truthiness” seems unduly hostile towards prior art. We’re talking about making use of truthiness, a property exposed by the language and leveraged by a library and its consumers. “I would personally do it differently” isn’t the same as it being misuse of the language. __bool__ is a part of the data model and users get to define and use it as they see fit.

Not directly from the pickle docs, but __module__ itself is documented in multiple places.

I assume it’s checking the module namespace referred to by __module__, but if __module__ doesn’t exist then pickle might fall back to other methods of determining the module namespace? The search is a sanity check to ensure that the singleton currently being reduced exists at the place it says it does before it is serialized.

This was something I ended up looking at Python’s internals to figure out. That is how I found __module__ in the first place. Would be nice if the pickle docs were more specific about this.

__module__ is documented, but as being a class attribute (see definition.__module__ and type.__module__).

This works because pickle has an undocumented whichmodule() function that happens to fetch the __module__ attribute on the instance (see the Python impl. for instance). But as it is undocumented, at any point it could use type(obj).__module__ instead. If others can confirm that relying on this is OK for the proposal, I can open an issue to clarify it (perhaps by explicitly documenting the behavior, though I’m not sure that will be accepted).
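
As a rough illustration of the behaviour being discussed, the sketch below pickles a singleton by returning its name from __reduce__. The Sentinel class, its module_name parameter, and the sentinel_demo module are all made up for this example. It leans on pickle reading __module__ from the instance, which (as noted above) is not documented for instances and could change.

```python
import pickle
import sys
import types


class Sentinel:
    """Hypothetical sentinel class; not the PEP 661 reference implementation."""

    def __init__(self, name, module_name):
        self._name = name
        # Instance attribute: this is what pickle's module lookup currently reads.
        self.__module__ = module_name

    def __repr__(self):
        return self._name

    def __reduce__(self):
        # Returning a string tells pickle to treat the object as a global
        # with this name, to be looked up in its module on load.
        return self._name


# Stand-in for the library module that would define the sentinel.
demo = types.ModuleType("sentinel_demo")
sys.modules["sentinel_demo"] = demo
demo.NULL = Sentinel("NULL", "sentinel_demo")

restored = pickle.loads(pickle.dumps(demo.NULL))
print(restored is demo.NULL)
```

If whichmodule() ever switched to type(obj).__module__, the lookup here would search the module that defines the class instead, and pickling would fail unless the sentinel also lives there.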

Reading your post with more comprehension, it sounds like your suggestion is to preserve custom sentinel behavior from existing libraries as-is while letting them benefit from the new type-hinting support added by official sentinels. In that case the ability to configure the truthiness of a sentinel is not even remotely optional, it must be part of the new standard. I’d agree with this.

I have theories of how to implement this, but it’s only from the perspective of a user of type-hints and library writer and not as someone who actually develops type linters.

We’ll need to decide on the name and values for a boolean parameter. I suggested bool= but I’ve seen many others. The values of True and False are obvious, but what about the value for raising an error on truth tests? I suggested NotImplemented but there might be other options. I’ve considered letting the value be the exception to raise, but I dismissed that idea.
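
To make the options concrete, here is a minimal sketch of what such a parameter could look like. The class below is hypothetical and is not the PEP’s API; the bool= name, the NotImplemented default, and the choice of TypeError are assumptions taken from the suggestions in this thread.

```python
class Sentinel:
    """Hypothetical sketch of a sentinel with a static truthiness setting."""

    def __init__(self, name, bool=NotImplemented):
        self._name = name
        # True / False fix the truth value; NotImplemented means
        # "truth testing is an error".
        self._bool = bool

    def __repr__(self):
        return self._name

    def __bool__(self):
        if self._bool is NotImplemented:
            raise TypeError(
                f"{self._name} has no truth value; compare with 'is' instead"
            )
        return self._bool


MISSING = Sentinel("MISSING")            # truth tests raise TypeError
EMPTY = Sentinel("EMPTY", bool=False)    # falsy, like many existing sentinels
```

With this shape, `if EMPTY:` behaves like existing falsy sentinels, while `if MISSING:` fails loudly and pushes callers towards `x is MISSING`.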

I consider it an error that whichmodule() looks up __module__ on the instance, not the type. I haven’t looked at the code yet, but I think the rule is that all dunders are looked up on the type, not instance.
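
For readers less familiar with that rule, a small demonstration using __bool__: implicit special method lookup goes through the type, so a dunder stored on the instance is ignored. The class names are made up for the example. Note that whichmodule() uses ordinary attribute access rather than implicit lookup, which is why it does see instance attributes.

```python
class Plain:
    pass


obj = Plain()
# Stored on the instance: bool() never consults this.
obj.__bool__ = lambda: False


class Falsy:
    # Defined on the type: bool() uses this.
    def __bool__(self):
        return False


print(bool(obj))      # True
print(bool(Falsy()))  # False
```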

I did not realize how much undefined behavior I’ve tapped into. How is one actually supposed to reduce a custom singleton?

As far as I understand, the second case doesn’t actually work. NULL, unlike None, is a valid capture pattern, which results in SyntaxError:

>>> match x:
...     case None | NULL:
...         print(NULL)
...         
  File "<python-input-1>", line 2
    case None | NULL:
         ^^^^^^^^^^^
SyntaxError: alternative patterns bind different names

(and the PEP does not mention altering match-case behavior with regards to that in any way)

Mistake or not, I believe this behavior is widely relied upon in practice by the language core (e.g., for pickling function and class objects that are present in the module scope), so it seems perfectly fine to rely on this behavior for sentinels.

Nearly all dunders, but not all. For some dunders, it’s important that the dunder needs to be looked up on the instance, or the hook wouldn’t work as intended. __mro_entries__ and __class_getitem__ are two examples. They’re definitely the exception rather than the rule, though.

I would go with bool=True|False|None|<exc> for a standard exception to be raised when bool is None (such as NotImplementedError), or a custom exception.

Seems like a clean solution to me.

I think reusing/misusing NotImplemented instead of None makes more sense here and leads to less confusion when reading the code.

The SC’s message said it was fine to have all sentinels be truthy. I’m wary of adding too much complexity to the feature. It doesn’t need to cover all sentinel-like use cases that someone might have somewhere: if you need something complicated, you can always roll your own.

I’m sympathetic to @sirosen’s need for sentinels that are falsy so I’m OK with adding a parameter bool=True|False. But now there are also suggestions that we need a way to have a sentinel that is not convertible to bool, and even that we need to allow a custom exception to be specified. I feel that’s too much for now, though we can always add additional features later.

Something as basic as this should have simple well defined and easily understood behaviour. Making the bool behaviour configurable seems like a sort of design by committee outcome where we end up with something overly complex but really what is needed is for someone to just make a decision.

It is true that any particular behaviour may not match all current uses of sentinel-like things but it is better just to define the use cases that are intended for this in future and choose suitable behaviour that matches those. If that behaviour does not match some current use then there are plenty of other ways to make a sentinel that do.

The only thing that is special about these sentinels is that they are proposed to be special cased by type checkers. That should not be a reason to extend or complicate them beyond the simplest thing that makes sense for the basic use case of sentinels.

It is already possible to do almost all of what you would want from sentinels using enums with this one exception:

from __future__ import annotations
from typing import Literal, reveal_type
from enum import Enum

class Null(Enum):
    NULL = 0
    def __bool__(self) -> Literal[False]:
        return False

tNULL = Literal[Null.NULL]
oNULL: tNULL = Null.NULL

def f(x: int | tNULL) -> None:
    if x is oNULL:
        reveal_type(x) # tNULL
    else:
        reveal_type(x) # int

As far as I can tell, the one thing the proposed sentinels can do that enums cannot is make tNULL and oNULL the same object NULL, so that someone could do:

from foo import NULL

def g(x: int | NULL):
    if x is NULL:
        ...

(Maybe this is already possible and I just don’t know how to do it?)

I think it would be reasonable to propose that type checkers be able to handle enums like that and then people can have all of the possible behaviours that have been bikeshedded over many years above. The stdlib enum type is infinitely configurable and vastly more complicated than anything I have ever wanted when reaching for something enum-like so all of the possibilities are there.

The purpose of sentinels is to be something very simple and basic and well defined. Configurable behaviour is overkill that goes beyond what the purpose of these objects is.

I firmly believe that if (as the SC suggested) configurable truthiness of a sentinel is too much complexity for this proposal then __bool__ should raise an exception. As has been stated several times earlier in this thread, raising an error by default leaves us free to change this later without backwards compatibility issues.

I’m also sympathetic to the need for custom truthiness, but I’d want to see that as a future discussion and decision separate from this proposal, to ensure its implementation gets a reasonable amount of discussion without delaying the completion of this proposal.

Edit: @MegaIng points out below that we could just make the current behaviour the default behaviour of any future change. Although I think it would be pretty awkward to have to teach people later on that something that used to raise an Exception now can be boolean sometimes, I stand corrected and this comment doesn’t make much sense.

I don’t agree that letting __bool__ raise an exception avoids backwards compatibility issues later on.

Although contrived, the following example shows how the exception could be relied upon. UNSET is a Sentinel and UserWarning an Exception defined by the module/package.

class MyClass:
    def __init__(self) -> None:
        self.value = UNSET

    def set_value(self, value: bool) -> None:
        self.value = value

    def get_value_as_str(self) -> str:
        if self.value:
            return "Yes"
        return "No"

    def something_more_complex(self) -> None:
        ...

        try:
            str_repr = self.get_value_as_str()
        except NotImplementedError:
            raise UserWarning("Called complex method before setting value")
        
        ...

I have definitely seen patterns like this, where a certain method needs to be called, before calling another method is allowed/safe.

I don’t have a strong opinion on what Sentinel.__bool__ should do, but believe strongly that we should pick something that we are happy to stick with indefinitely.

There is no backwards compatibility concern; whatever behavior we choose now can be the default later on.

I and a few others are of the opinion that making it an exception is the correct behavior for most people most of the time. Relying on truthiness is most likely just a bug magnet if you already have a sentinel that is explicitly distinct from None.
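
A small sketch of the kind of bug meant here, assuming a falsy sentinel similar to what some libraries ship today. The _Unset class, DEFAULT_TIMEOUT, and both functions are hypothetical names for illustration only.

```python
class _Unset:
    """Hypothetical falsy sentinel."""

    def __bool__(self):
        return False

    def __repr__(self):
        return "UNSET"


UNSET = _Unset()
DEFAULT_TIMEOUT = 30


def timeout_by_truthiness(timeout=UNSET):
    # Bug: an explicit 0 is falsy too, so it is silently replaced.
    return timeout or DEFAULT_TIMEOUT


def timeout_by_identity(timeout=UNSET):
    # Only the sentinel itself triggers the default.
    return DEFAULT_TIMEOUT if timeout is UNSET else timeout


print(timeout_by_truthiness(0))  # 30
print(timeout_by_identity(0))    # 0
```

If __bool__ raised instead, the first function would fail as soon as it was called without an argument, rather than quietly mishandling 0.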

I’m pretty sure you aren’t missing anything and that it’s not possible. If x: NULL is used, that means NULL is the type of x. So x is NULL can only be true if NULL is its own type, which isn’t something the type system allows.

I’ve used enums in order to get x is NULL to narrow under type checking. You can even use a bespoke sentinel class and then declare the enum under if TYPE_CHECKING to get the type-checking goodness but still be compatible with existing code at runtime. But you still end up writing

def f(x: int | NullType) -> None:
    if x is NULL: ...

which is fine, but would be nice to simplify to one symbol instead of two.

This I agree with, although I think I draw a different boundary in terms of what counts as “configurable behavior”.

I don’t think a static value for __bool__ is in the same category of configurable behaviors as having arbitrary method definitions attached to sentinels. IMO, this sentinel still satisfies the “no configurable behaviors” requirement; it just has a static setting:

UNSET = Sentinel("UNSET", bool=False)

I would also be open to more static settings in theory, if well-motivated, e.g.:

UNSET = Sentinel("UNSET", bool=False, repr="<foolib.UNSET>")

And yes, I can see the argument that “behavior vs setting” is a distinction without a difference. But the proposal already supports setting a name for the sentinel, so I don’t think it’s so clear cut.


My reason for wanting bool=False is specifically to let a variety of use cases which are currently implemented with custom sentinels adopt this more easily. It is purely practical and not based on any theoretical desire for “configurability”.

If it’s rejected, I’d prefer for the rationale to be “those libraries can continue to use their custom sentinels, which have very minor defects but mostly work”, and not “we can help libraries later”. A clear-cut “No” is much easier to plan and work around.

Gosh this is a long thread. :sweat_smile:

I wonder how many people here agree with the PEP 8 guidance that comparisons to None should always use is rather than equality? Given that it’s in PEP 8, I think it has permeated most linters and they default to complaining about checking the truthiness of None. This seems to be generally accepted.

Disallowing __bool__ for PEP 661 sentinels is enforcing that rule in code, rather than in a linter. That seems reasonable to me. If None were to be introduced today, it could be given the same behavior–it’s just historical that we have to clean it up later.
