FYI, especially @Jelle @barry: I’ve created a PR with a few small changes to this PEP following suggestions in this discussion thread. Hopefully this is early enough to not complicate the SC review. If possible, please treat the PEP proposal as incorporating the changes in the PR.
The changes:
Simplify the suggestion further by making it more opinionated: remove the options to customize the repr and the truthiness.
Use “sentinellib” rather than “sentinels” for the proposed module name, to avoid clashing with an existing popular package on PyPI.
This could also be solved by requiring the Sentinel to be a top-level module attribute.
I’m not sure about the benefits of a parallel, sentinel-only namespace. Is it to avoid collisions with non-Sentinel variables? Or to make it easier to work with non-identifier names?
No, the registry is not intended to avoid collisions with non-sentinel variables or to work with non-identifier names. It is mainly intended to enable the simple and robust handling of unpickling/copying, without the requirement for the variable to be defined on the top level of a module or class definition. I specifically would like it to be possible to define sentinels in a function or method, too.
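To make the registry idea concrete, here is a minimal sketch of how a registry can keep copying and unpickling identity-preserving even for a sentinel created inside a function. It is not the PEP’s reference implementation: the `_registry` dict, the `(module_name, name)` key and the explicit `module_name` parameter are all assumptions made for illustration.

```python
import copy
import pickle

# Hypothetical module-level registry: (module_name, name) -> sentinel object.
_registry = {}


class Sentinel:
    """Illustrative sketch only; not the PEP's reference implementation."""

    def __new__(cls, name, module_name=__name__):
        key = (module_name, name)
        existing = _registry.get(key)
        if existing is not None:
            # Same name in the same module always yields the same object.
            return existing
        self = super().__new__(cls)
        self._name = name
        self._module_name = module_name
        _registry[key] = self
        return self

    def __repr__(self):
        return self._name

    def __reduce__(self):
        # Copying and unpickling call Sentinel(name, module_name) again,
        # which finds the registered object instead of creating a new one.
        return (self.__class__, (self._name, self._module_name))


def make_local_sentinel():
    # Created inside a function, not as a top-level module attribute.
    return Sentinel("MISSING")


MISSING = make_local_sentinel()

if __name__ == "__main__":
    # Identity survives a pickle round trip and a deep copy.
    assert pickle.loads(pickle.dumps(MISSING)) is MISSING
    assert copy.deepcopy(MISSING) is MISSING
```

Because reconstruction goes through the registry lookup rather than through an attribute lookup on a module, nothing here depends on where the sentinel variable was defined.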
I realize that it is unusual; the number of questions about it has been surprisingly large. I also realize that it is different than how unpickling/copying support is done for other built-in and stdlib utils such as namedtuple. If this break with current convention is considered too large, it can be changed; the registry is, after all, an implementation detail.
I still currently prefer this proposal as it currently is, though.
As someone who has actively used various PyPI-available implementations of sentinel values/objects for years now… I’ve got more thoughts as the conversation has gone on.
Having a built in callable that didn’t need to be imported would be great.
But other than that I’m kinda looking at this PEP as it stands and thinking it’s a step back over what I have now. If you combine the forced truthiness of the new draft and the lack of subclassing which has always been there… for about 50% of my use cases I’m basically going to have to import a different library instead of the standard library or implement my own…
One of the main points of sentinels is to guard against improper value passing, and while I get that some people have reasons for wanting implicitly True or False like behaviour as we currently have from object() and None… having the sentinel objects be comparison unique like None while True for boolean comparisons like object()… robs them of the utility and safety benefits I’d expect of a good sentinel.
I get that it’s odd to have a value that isn’t True or False, but that’s kind of a sentinel value’s job… To be odd and unique and require explicit handling around cases where the value is a Sentinel.
The proposed implementation will require significantly more Look Before You Leap style coding: up front you have to check whether you were passed a sentinel and make decisions about the rest of the function or module logic, or preface all potential comparisons with code like if argument is not sentinel_value and .... This is needed to prevent passing on values that later code or called functions could misinterpret as True for data-processing decisions, or any of the other ways accidentally passing a truthy value in the wrong argument can cause misadventures.
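A small sketch of the up-front guard described above, assuming an always-truthy sentinel as in the current draft. The `send` function and its `urgent` parameter are hypothetical, and a plain `object()` stands in for the proposed sentinel type.

```python
# Stand-in for a sentinel; like the current draft, it is always truthy.
MISSING = object()


def send(message, urgent=MISSING):
    # Look-before-you-leap guard: without it, the `if urgent` test below
    # would treat the truthy sentinel as if the caller had passed True.
    if urgent is MISSING:
        urgent = False
    prefix = "[URGENT] " if urgent else ""
    return prefix + message
```

Dropping the identity check would silently mark every message sent without an explicit `urgent` argument as urgent.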
It really feels to me like it should either not compare to True or False, or allow the behaviour to be turned on.
How would this be implemented? Having the sentinel’s __bool__ raise an exception? That seems far too finicky to safely use. I think it’s fairly agreed upon that a particular sentinel should always be truthy or falsy. Which one is ideal depends on what you’re doing with it, but there are also advantages to having uniform behaviour across all sentinels. I think both the current truthiness and None-like falsiness are reasonable choices and both will need some amount of special handling in code.
I think that ultimately truthiness is just too broad a concept to have perfect ergonomics. Ideally, we’d want a single if some_value: ... guard at the top of our code to filter only values we actually care about, but which ones those are is too context dependent to be broadly capturable. I really like the current proposal’s handling of this. It solves the really annoying and common cases where you just want something that’s different from other stuff, and has reasonable uses in more specialized cases. In my personal experience the cases where this proposal wouldn’t properly replace current sentinels are very rare.
I think it’s valuable to make __bool__ raise an exception. I think it can be quite safe.
    from typing import NoReturn

    class S:
        def __bool__(self) -> NoReturn:
            raise Exception()

    def foo(x: int | S):
        if x:  # My editor reports a problem here, because it might raise an exception.
            print(x)
That’s much safer than a bug that comes from accidentally evaluating the sentinel as a boolean.
Thanks for your inputs Sam, I find them very valuable!
Several people have mentioned that recently; it’s good to get your “+1” on this.
Perhaps that’s because the goals of the sentinels proposed in this PEP are different from those of e.g. the “sentinels” package on PyPI. This is trying to be as simple as possible, both in implementation and in use, while being useful for re-implementing existing sentinel values in the stdlib and elsewhere.
I imagine that to support those other 50% of your use cases, you would need something much more generic, allowing for more customization. That goes against the above-stated goal of simplicity. I’m very intentionally not aiming for an ultimate solution for all Python sentinels.
You’re right, and I had considered that, it certainly makes sense in a way. But there’s nothing else built-in or in the stdlib that behaves this way (except for one edge-case I know of, NotImplemented, which has a very specific and unrelated reason for that.) I don’t consider this to be important enough to justify introducing something so unusual and potentially confusing into the Python stdlib/builtins.
The argument for sentinels to be falsy, so that e.g. if some_value: can be used, makes sense too. But, since this proposal currently strongly suggests using e.g. if some_value is not MISSING instead, I’m not sure that’s actually a good argument.
I am considering bringing back the option to set the truthiness to True or False, and perhaps the repr. That would enable using the new tool to better implement more existing sentinel objects. The added complexity of those is really quite minor. I wonder how much of the extra 50% of your use cases that would cover?
Overall I’m quite torn about truthiness. There are good arguments for all options. But I think always being True wins out:
It is very simple.
It is consistent with existing sentinels other than None, such as object() and Ellipsis.
It avoids an issue with falsy values enabling subtle bugs due to using if some_value instead of if some_value is not MISSING, when other falsy values are also possible.
It doesn’t introduce “radical” new behavior.
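The falsy-value issue in the list above can be illustrated with a short sketch. The `_Falsy` class, the `retries` parameter and `DEFAULT_RETRIES` are hypothetical names chosen for this example; they model a None-like sentinel that evaluates to False.

```python
class _Falsy:
    """Hypothetical None-like sentinel type that evaluates to False."""

    def __bool__(self):
        return False

    def __repr__(self):
        return "MISSING"


MISSING = _Falsy()
DEFAULT_RETRIES = 3


def retries_buggy(retries=MISSING):
    # Subtle bug: 0 is falsy too, so an explicit 0 is replaced by the default.
    if not retries:
        return DEFAULT_RETRIES
    return retries


def retries_explicit(retries=MISSING):
    # The identity check distinguishes "not given" from every real value.
    if retries is MISSING:
        return DEFAULT_RETRIES
    return retries
```

With a falsy sentinel the shorter truthiness test looks natural but conflates "not given" with 0; with a truthy sentinel it would not work at all, which pushes callers towards the identity check.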
Yes. But for the foremost use case of default/special values for function/method arguments, that logic will normally be confined to that function/method rather than the code calling it. So this would be one small bit of more verbose code, and in my opinion that is a small price to pay for the extra clarity and robustness.
And I should probably mention that using sentinels other than None is not common: I’m certain that far fewer than 1% of Python functions/methods use them, or even would use them given something like in this proposal. This is really intended for edge-cases, rare but just common enough that having a utility as proposed would likely be a net benefit. Hence, I don’t think that requiring slightly more verbose code to handle them is a significant consideration.
I think it is worth the very minor added complexity to make these customizable.
For libraries which already expose their defined sentinels, it will ease the transition. Otherwise, marshmallow.missing (for example) cannot possibly change implementations until there is a major release. Depending on the library, that could potentially add years of transition time before we see any benefit from this PEP. I think it’s a significant downside to only-true behavior.
Why are you dropping NotImplemented from this listing? It is after all a core Python sentinel, probably the second most used after None.
IMO being able to raise an error on __bool__ is quite useful and not exactly unusual, just rare in terms of situations where it’s appropriate. Core Python does it, numpy does it (for arrays).
Nor need you. For example, even though there’s an enum module in the stdlib, I am still supporting my flufl.enum package because I think it’s simpler and more narrowly focused in its use cases.
And I think None is special here, so making these sentinels truthy consistent with all other sentinels makes sense to me. I strongly agree with preferring explicit identity checks[1]. I don’t think __bool__ needs to raise an exception, but it’s your PEP so you decide!
[1] Interestingly enough, that’s the same with enums.
I would add that usually I use object() as a sentinel if I can’t use None. And if Sentinel is added, I’ll use it instead of object(). So they seem to me more related to object() than to None, and IMO their truthiness should be True for this reason.
I’ve read the thread, and this consistency argument seems to be the main justification for returning true.
To me, leaving the Boolean method undefined is more useful.
As many people have noted, explicit identity checks are preferred.
Therefore, it comes down to the preference of consistency with other objects versus the utility of raising an error for a “weird” way of using sentinels.
I prefer small interfaces, and consider such errors useful in protecting developers from errors. Therefore, I like the error.
I think the truthiness of Python objects makes sense for things like containers, numbers, and match objects.
Another advantage to raising is that if you ever change your mind, it’s easy to go from raising to returning true or false. It’s hard to go the other way without breaking code.
Does anyone have an example of where the truthiness of a sentinel is even useful? The primary comparison should obviously be identity. In what context would it be sensible to check the truthiness?
I don’t think people are explicitly advocating for truthiness in the sense that they want to use the fact that it’s truthy/falsy, but rather that so far every similar object has had a defined truthiness and the intuitive behaviour would be for this to mirror that. I think there also is broad consensus that good code shouldn’t really use the sentinel’s truth value or equality comparison. The question is how to best support the non ideal cases where people still do that.
Having read through the responses and thinking about it more, I’m not sure if I still think forcing them to be truthy is best. My initial thought process was that there are too many parts of Python where something will implicitly call __bool__ for it to reasonably raise an exception. But thinking about it more, I’m not sure how true that is. In the easy case where you use it as a default value in an argument, that certainly isn’t the case. You want to catch that option as soon as possible with an identity check and then only further process the non-sentinel values. But what about something where you have a bunch of computations that can return sentinel values and you throw them into some collection? I can’t really think of a reason why the collection should check the truth value of things in it, but I also can’t really guarantee that it doesn’t. It could be something as simple as whoever implemented it having had some application-specific optimization in mind and thinking that it couldn’t do any harm since surely no one would write a __bool__ that raises exceptions.
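To give a rough sense of where __bool__ does and does not get called implicitly, here is a sketch using built-in lists and helpers only; third-party collections may behave differently. The `Raising` class and the `raises_type_error` helper are hypothetical names for this illustration.

```python
class Raising:
    """Hypothetical sentinel type whose truth value is undefined."""

    def __bool__(self):
        raise TypeError("the truth value of a sentinel is undefined")

    def __repr__(self):
        return "MISSING"


MISSING = Raising()
results = [1, MISSING, 0]


def raises_type_error(func):
    """Return True if calling func() raises TypeError."""
    try:
        func()
    except TypeError:
        return True
    return False


# Storing, membership tests and identity filtering never call __bool__.
assert MISSING in results
assert [r for r in results if r is not MISSING] == [1, 0]

# Helpers that test truthiness do call it, so they raise on the sentinel.
assert raises_type_error(lambda: all(results))
assert raises_type_error(lambda: list(filter(None, results)))
```

So a plain list is safe, but anything that filters or short-circuits on truthiness would surface the exception.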
Is that a use case that sentinels should actively support? Is that more concerning than the certainly far more common case where someone accidentally writes if something: ... rather than if something is not SENTINEL: ...? Personally, the safest option feels like having a customizable truthiness with truthy/falsy/exceptiony all supported and it defaulting to exceptions. Then the common error cases get caught quickly and people are nudged to idiomatically writing identity comparisons. But if the error raising does lead to issues in downstream code you can make the intentional choice to opt into less safe but more traditional semantics. Having said that, I can see the arguments that that would create unnecessary complexity and/or is too much of a deviation from other standard library behaviour.
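One way the three-mode idea could look, as a sketch only: the `truthy` keyword, the `_UNSET` marker and the choice of TypeError are assumptions for illustration, not part of the PEP.

```python
# Private marker meaning "no truth value was chosen"; itself a sentinel.
_UNSET = object()


class Sentinel:
    """Hypothetical sketch: truthy=True, truthy=False, or unset (raises)."""

    def __init__(self, name, truthy=_UNSET):
        self._name = name
        self._truthy = truthy if truthy is _UNSET else bool(truthy)

    def __repr__(self):
        return self._name

    def __bool__(self):
        if self._truthy is _UNSET:
            # Default mode: nudge callers towards an identity check.
            raise TypeError(
                f"{self._name} has no truth value; compare with 'is' instead"
            )
        return self._truthy


STRICT = Sentinel("STRICT")                       # raises on bool()
LEGACY_FALSY = Sentinel("LEGACY", truthy=False)   # None-like behaviour
LEGACY_TRUTHY = Sentinel("OBJECT", truthy=True)   # object()-like behaviour
```

A library migrating an existing falsy sentinel could opt into `truthy=False` to keep its behaviour, while new code would get the exception by default.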