PEP 661: Sentinel Values

Well, of course, until type checkers implement a particular feature, it cannot be used to type code.
My reasoning is that, since this PEP doesn’t at the moment specify anything regarding type checking, there is nothing for typing tools to implement. So before supporting Literal[MISSING] (or any similar syntax to type individual Sentinels), a new PEP would need to be created and approved.


Gotcha, thanks!

After additional feedback (here and elsewhere) I’m going to make another effort at finding a way to have type signatures better than just Sentinel.

Here’s my current thinking:

  1. It would be desirable for type checkers to be able to recognize that if a value isn’t a sentinel, they can deduce that it is of a more specific type. For example, if a function argument’s type annotation is value: int | MISSING, and the function includes an if value is MISSING: statement, then in the else: we can be sure the type of value is int. This works now with None and likely also when using Literal.

  2. I’d like type checkers to have to do as little as possible to specifically support these sentinels. If they need to implement specific support, it should be reasonably simple and straightforward.

  3. The problem with using Literal (e.g. Literal[MISSING]) is that sentinel values are not literals (unlike string literals). It appears that type checkers would have to implement very different logic to support this than what is currently in Literal. I suspect that there would also be nuanced ways in which sentinel literal type annotations would work differently than those for strings, which worries me.

  4. Having a dedicated type for each sentinel is possible in various ways, as suggested in previous drafts of this PEP and the implementation using a single-value Enum, for example. However, the type signatures would be non-obvious, and these wouldn’t achieve #1 above without special-casing by type checkers.

  5. Making sentinel values be their own types (e.g. using the method mentioned here by @njs and @BlckKnght) avoids needing a separate, dedicated class, which is fantastic! But using them for type signatures doesn’t work, because the magic involved apparently doesn’t work for static analysis. (Again, this could be solved by special-casing by type checkers.) And this also doesn’t support #1.

  6. All of which leads me to think about a new typing construct. It seems we’ll need special casing by type checkers anyways, so we may as well make it explicit. Also, I feel that re-using Literal for values which are not literals would be a mistake.
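
As a point of comparison for #1 and #4, here is a minimal sketch of the narrowing that works today with None and with the single-member Enum idiom described in PEP 484. The names `_Missing`, `describe_optional`, and `describe` are made up for illustration and are not part of the PEP.

```python
from enum import Enum
from typing import Optional, Union


class _Missing(Enum):
    """Single-member enum used as a stand-in sentinel (illustrative only)."""
    MISSING = "MISSING"


MISSING = _Missing.MISSING


def describe_optional(value: Optional[int]) -> str:
    if value is None:
        return "nothing"
    # Type checkers narrow `value` to `int` on this branch.
    return f"int {value + 1}"


def describe(value: Union[int, _Missing] = MISSING) -> str:
    if value is MISSING:
        return "missing"
    # Per the PEP 484 idiom, `value` is expected to be narrowed to `int` here.
    return f"int {value + 1}"
```

Note how the annotation has to mention `_Missing` rather than `MISSING`, which is the non-obvious signature that #4 refers to.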

So, how about Sentinel[MISSING] or Sentinel['MISSING']? With Sentinel being the actual class, this makes sense: it’s a specialization of Sentinel. It parallels Literal, but avoids overloading it. It would be the most straightforward for type checkers to implement, since it would be a separate thing. And I’m hoping it would be obvious enough for readers of code to understand the meaning without needing to look it up.

Thoughts on this?


Sentinel['MISSING'] (assuming it means that the annotated value is Sentinel('MISSING')) seems problematic across modules, since you couldn’t say that you actually mean the sentinel created in another module. I think that could be common. One example that comes to mind is REST APIs that distinguish between None and not passing a field, where you might want a single MISSING sentinel representing "field not passed" across a whole, potentially large, library. This could be solved with a second argument for cross-module use, but it doesn’t seem like it would be trivial to use.

I think Sentinel[MISSING] makes more sense of the two, but I’m a bit unsure why it would be preferred over using MISSING directly. MISSING already is the specialization of the Sentinel class, so it would obviously (in my mind; I can’t speak for other people) refer to that specialization. Maybe there’s a reason type checker authors would prefer it, in which case I don’t much mind writing it that way. It would mean importing Sentinel in addition to MISSING everywhere I want to type hint something with MISSING, though, which I could avoid if I could use MISSING directly.

It seems to me that, regardless of the chosen method, type checkers will have to special-case it, so avoiding special-casing entirely probably isn’t a realistic goal. Type checkers already need a way to detect whether something is an enum member, so detecting whether something is a specific sentinel should work similarly. (You have to detect a call to Sentinel rather than a class definition subclassing Enum, so that part is somewhat different; there may be other areas where that kind of detection is already performed, though.)

The (entirely?) new thing is the global registry of sentinels: type checkers may now need to detect that x is Sentinel("MISSING") is the same as x is MISSING if MISSING is defined in the current module (and, in general, track module_name). Ideally we would want that to work with type checkers as well, but I’m not sure how often it will occur compared with people writing x is MISSING. I imagine that tracking Sentinel instances across codebases because of the global registry will make things quite a bit harder for type checkers, so I don’t think it’s essential for them to support it, even if ideally type checkers should support new features in full.

I realized after typing it out that you made a nice numbered list of your thoughts and instead of addressing these directly I just went on about the Sentinel[MISSING] vs Sentinel['MISSING'] part 🙂

So in short, I think that #1 is essential and I agree with everything else you said on that numbered list. For #2, I already said in a previous post that I don’t think it’s realistically possible without any special-casing, but I support the idea of going with something that requires a reasonably simple implementation on the type checker’s side (keeping in mind the kind of things they already need to implement to support other popular Python constructs such as enums).

And thank you for reconsidering type checker support.

I assume Sentinel[MISSING] is supposed to be used like this:

MISSING = Sentinel("MISSING")

def func(value: int, *, flag: str | Sentinel[MISSING] = MISSING) -> None:
    if flag is MISSING:
        ...

But I don’t quite understand what the string form means. Is it equivalent to Sentinel[MISSING]? This would not be a parallel to Literal, and why is it needed anyway (I can’t think of a use case for forward-declaring a sentinel)? Or is it supposed to mean using the literal string "MISSING" as a sentinel? That seems to make even less sense since the use case should be covered by Literal["MISSING"].

(These are my personal opinions for your consideration. This is not a Steering Council verdict; you don’t need to convince me.)

The implementation still has a lot of “magic” which makes me worried about freezing it in the stdlib. Could it be simplified? Perhaps by introducing limitations that can be lifted in the future, as we have more experience with sentinels?

Are there use cases for pickling sentinels that aren’t module globals?

It looks like if __reduce__ checked that getattr(sys.modules[module], name) is self, and set a flag parameter for __new__ to use that expression, the registry could be avoided. (Well, replaced by sys.modules.)

Did you think about passing the module name explicitly? e.g.:

NotGiven = Sentinel('NotGiven', __name__)

Flask uses a similar pattern quite successfully.
IMO, pickle-ability is better determined by passing (or not) the module name, rather than by the Python implementation. But perhaps that view is too extreme.
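
A rough sketch of how these two suggestions could combine: the module name is passed explicitly, and `__reduce__` only allows reduction when the sentinel really is the module global of that name. The helper `_load_sentinel` and the attribute names are hypothetical, and this is not how the PEP's implementation works.

```python
import sys


def _load_sentinel(module_name, name):
    # Hypothetical helper: look the sentinel up again as a module global.
    return getattr(sys.modules[module_name], name)


class Sentinel:
    def __init__(self, name, module_name):
        self._name = name
        self._module_name = module_name

    def __repr__(self):
        return self._name

    def __reduce__(self):
        # Only reduce if this object is the global `name` in `module_name`,
        # so sys.modules plays the role of the registry.
        module = sys.modules.get(self._module_name)
        if getattr(module, self._name, None) is not self:
            raise TypeError(
                f"{self._name!r} is not a global of module {self._module_name!r}"
            )
        return _load_sentinel, (self._module_name, self._name)
```

With `NotGiven = Sentinel('NotGiven', __name__)` at module level, copying and a pickle round trip should hand back the same object, while a sentinel that is not bound to a matching module global refuses to be reduced.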

The new module would break third-party modules named sentinels, like sentinels on PyPI (which looks maintained even though the last change was in 2016).


One advantage of the “class pattern” is that it also works with pattern matching; for example:

from typing import NamedTuple, TypeAlias

class Quit: pass
class Move(NamedTuple):
    x: int
    y: int
class Write(NamedTuple):
    text: str
Message: TypeAlias = Quit | Move | Write

def process(message: Message) -> str:
    match message:
        case Quit():
            return "quitting..."
        case Move(x, y):
            return f"moving to {x}/{y}"
        case Write(text):
            return f"writing '{text}'"

If I understand this proposal correctly, this wouldn’t work with sentinel values because you can’t pattern-match on a constant (it would get captured instead).

(The “single enum pattern” also works with pattern matching because it’s a dotted name.)

I guess my point is that it would be nice to use a light-weight sentinel instead of a full class for this use-case, but I don’t see a way to make it work, because of the pattern matching rules…


Hi @taleinat, how are you doing? I came looking for a way to do algebraic data types like in Rust, found your PEP, and got to this discussion.

I really like the use case of @tmk. I think that it’s important that the type signature when using sentinels would be something like Quit | Move | Write, and not Quit | Sentinel.

It’s nice that @tmk’s example works with current Python/mypy. One annoyance of using it is that you need to distinguish the value, Quit(), from the type, Quit. You don’t need this with None. Also, Quit() == Quit() is False.

Could we have a base class typing.Sentinel that would allow you to write:

class Quit(Sentinel): pass

And which would make Quit() return the class Quit? Then you would be able to use Quit instead of Quit() in the match block in the example. Quit would basically behave just like None.


Hi @tmk and @noamraph!

The major issue with allowing Quit | Move | Write in type signatures and matching is that the signatures are based on the types of the objects. To get what you’re asking for, Move would have to be its own type.

This works for None due to special casing, but that can be done for None as it’s a predefined single special case. I believe it would be much harder to do this for sentinels in general.

If anyone can help think of a nice way to achieve this, I’d definitely be interested!

Hi @encukou, thanks for the inputs, greatly appreciated!

I’ve tried to keep this as simple as possible. Suggestions that would simplify further would be welcome.

Yes, pickling (and copying in general) is useful for many things, such as serialization to disk, RPC and IPC. Having objects which don’t behave as expected after pickling+unpickling or copying leads to subtle, hard-to-debug bugs.

I’ll think about this some more and follow up. For now, note that getattr(sys.modules[module], name) requires that values are set at the module level, and the variable name used must match the name passed in to the constructor (similar to namedtuple).

Yes, but that is unexpected, and it is unnecessary in the most common cases. It’s more generic but makes the API more complex and readers of the code need to learn more. I’d rather avoid that.

I think that’s nearly unavoidable these days, except if we add this to an existing stdlib module, which seems reasonable to me.

I would optimize for the best user experience without rejecting solutions because they might be a little more work for type checkers. You are free to propose new type system features in your PEP, and people from the typing community will tell you if something is prohibitively hard to implement.


I agree that user experience should come first here. But it’s not clear that having e.g. Move act like its own type in signatures is the best experience; it could be confusing since it’s different from all other objects, except None.

Something like Literal[Move] or Sentinel[Move] would be more verbose, but seems clearer to me. What do you think?

Also note that @tmk and @noamraph brought this up in the context of match statements, which is directly about Python code execution (not only type signatures).


Personally my preference would be to use just Move, as that’s the most concise possible annotation and still unambiguous. Literal[Move] would be a reasonable extension of current uses of Literal, so that would be a good choice if there’s resistance to just using Move. I don’t see much reason to introduce a new primitive with Sentinel[Move].


I just saw this thread and PEP by chance. We do need an easy way to create what is essentially object() with a given name and optional repr:

NoValue = sentinel('NoValue', '<no value>')

Just give the result a distinct type that correctly handles copying and pickling, but otherwise behaves like object(). This would address all points raised in the PEP. For more complicated problems, there are more complicated solutions.
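
A minimal sketch of such a helper, under the assumption that each call should create a fresh single-instance type. The parameter name `repr_text` is invented here, and pickling support is deliberately left out.

```python
def sentinel(name, repr_text=None):
    """Create a unique object with its own type and a readable repr (sketch only)."""
    text = repr_text if repr_text is not None else f"<{name}>"
    namespace = {
        "__slots__": (),                           # no per-instance state
        "__repr__": lambda self: text,
        "__copy__": lambda self: self,             # copying returns the same object
        "__deepcopy__": lambda self, memo: self,
    }
    return type(name, (), namespace)()


NoValue = sentinel('NoValue', '<no value>')
```

Equality and hashing are inherited from object, so the result otherwise behaves like a plain `object()`.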

Hi @ntessore,

Indeed, that is the goal of this PEP. I am very much trying to keep things as simple as possible.

The discussion right now is about what the type signature for such sentinels would be, and also how they would be matched in match statements.

Indeed, that’s what I filed under “more complicated problems”. All the while we have to continue relying on object() or other hacks, when we could have had a solution to the problems the PEP ostensibly tries to solve in the standard library by now.

Besides, I suspect that if we had such a simple sentinel() with plain object() semantics, it would be so immediately useful that it wouldn’t take long for the type checkers to support it from their end.

What was the reason that people requested NoneType (e.g., python/typing issue #451, "`Type[None]` should return `NoneType`, not `Type[NoneType]`")? Would people also ask for the type Move (rather than the literal)?


I found out that it’s pretty easy to make a class behave as if it’s its own type - so isinstance(Quit, Quit) would return True:

>>> from abc import ABCMeta
>>> class QuitMeta(ABCMeta):
...     def __repr__(self):
...         return 'Quit'
...
>>> class Quit(metaclass=QuitMeta):
...     @classmethod
...     def __subclasshook__(cls, C):
...         # Treat the metaclass as a "subclass" so isinstance(Quit, Quit) succeeds.
...         if C is QuitMeta:
...             return True
...         return NotImplemented
...
>>> Quit
Quit
>>> isinstance(Quit, Quit)
True

For fun, I managed to make a class Sentinel which allows you to write:

class Quit(Sentinel): pass

and get the same behavior! Here it is: "Trying to make sentinels which seem like the type of themselves" (GitHub)


Does this actually work in a match case?