PEP 724: Stricter Type Guards

rchiodo · September 19, 2023, 4:06pm

PEP 724 - Stricter Type Guards was introduced to try and make TypeGuards align closer to how isinstance behaves.

Feedback appreciated

gandhis1 · September 19, 2023, 4:51pm

In the interest of a productive discussion, are there any areas of this PEP that you might eagerly highlight as having some semblance of controversy or ambiguity? Everything I have read to date across the PEP PR, typing repo discussions, etc. has reflected fairly broad consensus.

pf_moore · September 19, 2023, 4:54pm

Can you provide links, or ideally a summary, of previous discussions? Are they on Discourse or the mailing list and I missed them?

rchiodo · September 19, 2023, 4:56pm

The PEP has a link to previous discussions but here’s some links:

Introduction of the idea:
Mailman 3 Type narrowing for TypeGuard in the negative case - Typing-sig - python.org

More discussion:

github.com/python/typing

TypeGuard type narrowing on returning False

opened 03:44PM - 27 Dec 21 UTC

ikamensh

topic: feature

I've faced the limitation of types not being narrowed when trying to add types in https:/…/github.com/openai/gym/blob/8e5ae02ab13a89c976ce7b1278c21f755dfa4bd2/gym/spaces/box.py#L33 - np.isscalar (https://numpy.org/doc/stable/reference/generated/numpy.isscalar.html) returning False rules out the `SupportsFloat` type, and given input type `SupportsFloat | np.ndarray` I now know it was narrowed down to np.ndarray. Numpy has not added a type guard yet in its stub, but this is the logical type for this function once all supported Python versions have it. Yet with the current syntax it's not possible to describe the function narrowing the type by returning False.

Proposal: give TypeGuard a second optional argument: TypeGuard[A, B] should narrow argument type A | B to A if returning True, and to B if returning False. The old syntax should still work, where TypeGuard[A] works as it does currently.

I didn't find this in the rejected ideas of https://www.python.org/dev/peps/pep-0647/, so wanted to share. I'm not sure how much of the stdlib could use this, but it definitely helps to be able to write functions in the most semantically convenient way. I can express both "is_x" and "is_not_x" functions in the type system if this is added. This could have synergy with NonType currently being discussed.

Unclear yet: how best to describe the situation where no type narrowing occurs on a True return, but only on False. Maybe `TypeGuard[Any, B]` could have these semantics.

pf_moore · September 19, 2023, 5:44pm

Thanks. It all seems a bit abstract to me. I don’t have any objections as such (my use of typing isn’t even close to this level of complexity), but the lack of any real world examples of how this would be useful in actual code bothers me a little. This isn’t specifically about this PEP, it seems like a general tone with typing discussions that I’ve encountered - they focus a lot more on theory and abstract examples than “what real-world usages will this enable?”

The example code given in the github issue (gym/spaces/box.py) doesn’t even have type annotations, so I’m confused as to why it’s relevant. Surely a PEP like this should be making existing, somewhat clumsy and/or unnecessarily broad annotations a little more accurate, not making the difference between having annotations or not annotating types at all?

rchiodo · September 19, 2023, 6:14pm

We’re meeting today actually for a typing sig meetup. I hope there’s a place to put the slides? I’ll have to ask around.

My guess is the most controversial part of the PEP is that we’re breaking backwards compatibility for TypeGuard.

Specifically a case like this:

from typing import TypeGuard

def is_int(val: int | str) -> TypeGuard[int]:
    return isinstance(val, int)

def func(val: int | str):
    if is_int(val):
        reveal_type(val)  # "int"
    else:
        reveal_type(val)  # Previously "int | str", now "str"

rchiodo · September 19, 2023, 6:27pm

I think the goal of the PEP is to make writing TypeGuards (and using them) less ambiguous.

Previously a TypeGuard only described what happened when it returned True. But it was unclear what happened on the False case:

def is_int(val: int | str) -> TypeGuard[int]:
    return isinstance(val, int)

def func(val: int | str):
    if is_int(val):
        reveal_type(val)  # "int"
    else:
        reveal_type(val)  # What would you expect here?

I work on Pylance/Pyright and we get a lot of bugs where people expect ‘str’ for the else case above. This PEP is hopefully making TypeGuards just match people’s expectations.

I think that was the original problem that the user had in that gym/spaces/box.py example. They wanted to add typing to it but found using the TypeGuards in numpy confusing.

NeilGirdhar · September 19, 2023, 7:52pm

Really happy to see this!! The narrowing-on-True (limitation 2) behavior in this PEP is something I’ve wanted for a while. It may be worth linking that long and thoughtful discussion in the PEP?

rchiodo · September 19, 2023, 7:59pm

Sounds like a good idea. I’ll submit a change to put it into the post history

NeilGirdhar · September 19, 2023, 8:04pm

It may also be worth motivating this PEP by pointing out that it will solve this issue with dataclasses.is_dataclass. Long story short for those that don’t want to read through the whole thread, is_dataclass has this annotation:

@overload
def is_dataclass(obj: DataclassInstance | type[DataclassInstance]) -> Literal[True]: ...
@overload
def is_dataclass(obj: type) -> TypeGuard[type[DataclassInstance]]: ...
@overload
def is_dataclass(obj: object) -> TypeGuard[DataclassInstance | type[DataclassInstance]]: ...

The first overload is explained here.

With the narrowing behavior, we could eliminate the first overload (as Eric Traut originally recommended here) and type checkers could do the appropriate intersection between DataclassInstance and X’s type.
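As a rough illustration of why the quoted stub distinguishes the `type` and `object` cases, the sketch below shows the runtime behavior of `dataclasses.is_dataclass`. The `Point` class and `kind` helper are hypothetical names used only for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, is_dataclass


@dataclass
class Point:
    x: int
    y: int


def kind(obj: object) -> str:
    """Hypothetical helper: classify obj using is_dataclass."""
    # is_dataclass is True for both dataclass types and their instances,
    # so a second check is needed to tell the two apart.
    if is_dataclass(obj):
        if isinstance(obj, type):
            return "dataclass type"
        return "dataclass instance"
    return "not a dataclass"
```

Calling `kind(Point)` and `kind(Point(1, 2))` exercises the two branches that the second and third overloads describe.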

gandhis1 · September 19, 2023, 8:40pm

There are a couple of concrete examples linked from the Github issue:

Incorrect type narrowing for `inspect.isawaitable` · Issue #15520 · python/mypy · GitHub
Should `asyncio.iscoroutinefunction` return some kind of `TypeGuard`? · Issue #8009 · python/typeshed · GitHub

Another example I have mentioned on the preceding threads was annotating pandas.isna and pandas.notna as a TypeGuard, both of which should be able to narrow in the negative case.

Finally, there is another discussion I don’t think is linked anywhere and that is this one: Type narrowing for TypeGuard in the negative case · python/typing · Discussion #1013 · GitHub
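A minimal sketch of the `inspect.isawaitable` case, assuming a stub that annotates it as a TypeGuard for awaitables. The `resolve`, `_make`, and `main` functions are hypothetical and exist only to show where negative narrowing would matter.

```python
from __future__ import annotations

import asyncio
import inspect
from collections.abc import Awaitable


async def resolve(value: Awaitable[int] | int) -> int:
    """Hypothetical helper: accept either a plain int or an awaitable int."""
    if inspect.isawaitable(value):
        return await value
    # Negative narrowing is what would let a checker treat `value` as a
    # plain int here instead of `Awaitable[int] | int`.
    return value


async def _make(n: int) -> int:
    return n


async def main() -> tuple[int, int]:
    return await resolve(_make(2)), await resolve(3)
```

Running `asyncio.run(main())` drives both branches at runtime; the typing question is only about what the checker infers in the fall-through path.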

pf_moore · September 19, 2023, 9:45pm

Rich Chiodo:

Previously a TypeGuard only described what happened when it returned True. But it was unclear what happened on the False case:

def is_int(val: int | str) -> TypeGuard[int]:
    return isinstance(val, int)

def func(val: int | str):
    if is_int(val):
        reveal_type(val)  # "int"
    else:
        reveal_type(val)  # What would you expect here?

I would expect the type to be str. And if that’s not the case, my immediate reaction is that TypeGuard is just not doing its job. It shouldn’t need a new PEP to fix this, and in all honesty, I struggle to understand how the existing definition of TypeGuard could imply anything else.

I guess the idea is that the function is_int could actually return False for some integers, and that would still conform to the definition of TypeGuard. But that seems like a pretty bad definition of a function called is_int.

I guess I find all of this incredibly theoretical. That’s something I’ve been frustrated with over typing discussions in the past, and I think now that typing is becoming more mainstream, it’s something that needs to be dealt with.

In an attempt to be more constructive, I’ve tried to read through PEP 724. Please understand my perspective here. I have never used type guards myself, but I’m concerned that I may encounter them in code I support, or am asked to review, and I need to understand them in order to do a proper job. More broadly, I want to understand typing so that I can decide how and when to use it in my code - in the same way that I know async, or XML, to a level that lets me understand its applicability and how to use it if I need to.

The abstract of PEP 724 makes sense to me. I don’t know much about TypeGuard, so I read the Python documentation to get an overview of it. I can see why it has the potential to be useful, although I’d rather hope that it’s only needed in rare cases when type checkers can’t work out the types for themselves. My position here is that if I have to do a lot of work to help the type checker, that’s a bad trade, because I end up just as likely to introduce bugs in complex type definitions as I am to introduce bugs in the code I’m trying to protect with the type checks. I’d be more likely to leave type annotations over-broad than use complicated constructs I don’t fully understand.

The motivation section is more of a struggle. I got sidetracked into reading the linked PEP 647, and that has its own issues for me. It mentions “Limitation 1” of PEP 724 (the one about returning False) but doesn’t, as far as I can tell, justify not requiring “both positive and negative tests” in the same way that builtin tests do. So I’m left not understanding why it needs a new PEP to correct that. And as far as I can tell, “Limitation 2” is precisely the rejected item “Conditionally Applying TypeGuard Type”.

From PEP 724:

PEP 647 imposed these limitations so it could support use cases where the return TypeGuard type was not a subtype of the input type. Refer to PEP 647 for examples.

Maybe I don’t understand well enough, but I couldn’t clearly establish what those examples were or how they justified the limitations - but equally, I couldn’t see anything in PEP 724 that explained why those examples were no longer compelling.

Moving on to the specification section, this makes sense to me, with the one exception that the term “is consistent” is only explained via a link to yet another PEP that involves lots of words and technicalities, but which is frankly pretty impenetrable to an “interested bystander” like me. I can go with an intuitive idea of what it means, but in doing that I lose any sense of why all of this isn’t “obvious” and even needs a PEP to define. But overall, I’m fine with the specification. It’s not your fault that there isn’t a better definition of “is consistent” that you could link to.

The backwards compatibility section is interesting. I was particularly struck by the statement “Type checkers often improve narrowing logic or fix existing bugs in such logic, so users of static typing will be used to this type of behavioral change”. To me that says “type checkers don’t really have good backward compatibility behaviour anyway, so it’s not important that this proposal does either”. Yes, I’m being unkind here, but I think we need to consider when we can start applying the normal criteria for backward compatibility to typing proposals. As more Python users become interested in typing, and therefore start to take more of an interest in typing proposals and PEPs, a message of “normal backward compatibility rules don’t apply to us” isn’t really going to be acceptable for much longer. The testing done for this PEP actually demonstrates a much better compatibility in this case than the above statement suggests, so the problem may be mainly one of attitude and presentation rather than actual compatibility, but it’s something that probably should be addressed by the typing community in general.

On a positive note, “How to teach this” is awesome - “we’ve made it work like everyone expected anyway” is a great message. The only sour note is the implied question “why didn’t you get it right the first time?”

Actually, thinking further, I’d like it if “how to teach this” made some comment about how the Python stdlib documentation would be updated to reflect the new semantics. One frustration I have with typing is that it’s often hard to find out where to look for information - being explicit in new PEPs about “where would users find this information” would be a great improvement.

Overall, I think the PEP is a worthwhile improvement to the behaviour of TypeGuard. I don’t have any objection to it on a technical level. But I do think there’s a number of presentational issues and a bunch of assumed knowledge that makes the PEP unnecessarily inaccessible to non-specialists^[1]. I think you’d get better (and broader) feedback if those were addressed. But you may think it’s more work than is needed for a relatively small PEP like this - that’s your call.

Sorry - I suspect this isn’t exactly what you would have wanted in terms of feedback on the PEP. There’s a lot of general stuff that’s only peripherally related to the specific proposal here. But I hope it’s useful in the broader sense of improving typing discussions in general, and making them more accessible to the average Python developer.

[1] That’s likely common to many typing PEPs; I’m not trying to single this PEP out as particularly bad in this regard.

sirosen · September 20, 2023, 7:17am

I think the backwards compatibility contract for typing is nuanced and complex.

Typing PEPs are an incomplete specification for how a type checker should behave. There are gaps and undefined cases – sometimes rather fundamental ones, like how to infer the type of a variable in relatively simple cases. So various implementations have differing behaviors.

The relationship of typing to mypy is very much like packaging to setuptools (or maybe pip, but I think setuptools is more apt because there are mainstream alternatives). If a packaging PEP defines previously undefined behavior, such that it conflicts with current setuptools behavior, is it backwards incompatible?
I ask that question genuinely. My understanding of the situation is that it depends on how impactful the change is in practice. Whatever the answer, the typing disposition should probably match the packaging one, IMO.

This PEP closes a gap by specifying previously unspecified behavior, if I’ve understood correctly.

As for how the behavior might matter, consider this (currently good) guard:

def is_even_int(x) -> TypeGuard[int]:
    return isinstance(x, int) and x % 2 == 0

Which leads to my major question about this PEP. Is there some way to be confident that guards like the above are very rare? To me, it looks like a very reasonable guard to write. But if narrowing occurs on the False branch, it’s wrong for (counts on fingers) close to half of all integers.

pf_moore · September 20, 2023, 8:16am

Yes, 100%. And I say that as packaging PEP delegate.

It’s fine to break backwards compatibility, but an acknowledgement that this is what you are doing, a transition plan, and a clear explanation of how to handle legacy data is essential. For packaging proposals at least, and I agree that typing PEPs should probably follow a similar model.

Jelle · September 20, 2023, 12:35pm

This is not correct; in fact it changes the behavior for an area that is already specified precisely (in PEP 647), but where we’ve convinced ourselves that the specified behavior is not useful.

This is also my major worry with the PEP. What we have done is use the mypy_primer tool (GitHub - hauntsaninja/mypy_primer: Run mypy and pyright over millions of lines of code) to look for code in selected open-source projects that would change in behavior with the PEP’s new semantics. (This is mentioned in the PEP.) This check found no cases where the PEP caused new errors.

Of course, that’s just some projects, and there is a lot more Python code out there. I would welcome suggestions for how to better handle the incompatible change.

The PEP does acknowledge that it breaks compatibility. It could perhaps do more to set out a transition plan, but that’s largely going to be up to individual type checkers. For the most part, users will simply see a slightly different type inference behavior when they upgrade their type checker, which is a fairly routine effect of upgrading your type checker.

pf_moore · September 20, 2023, 1:17pm

Drawing from my experience with packaging, that doesn’t mean the PEP can’t describe that transition. We have a behaviour A in PEP 647, and a behaviour B in PEP 724. Type checkers must implement behaviour A currently in order to be compliant. They are now going to have to implement behaviour B instead. While we could say “get there however you please”, that’s not typically how PEPs (packaging PEPs in particular) have worked - the route to take (including things like how to warn the user of the behaviour change, how long the transition should be, etc) should be part of the PEP as it’s a big part of the whole experience for users.

I’m not 100% clear I understand the details of the incompatibility. I assume the fundamental problem is that with

def is_even_int(x) -> TypeGuard[int]:
    return isinstance(x, int) and x % 2 == 0

if is_even_int returns False, the type checker currently can’t assume anything new about x, whereas under PEP 724 it’s allowed to assume x isn’t an int?

If that is the issue, then this use of TypeGuard is now wrong, and will have to be replaced by a simple bool. Is that correct? If so, then I think the normal process would be to require this type of usage to trigger a deprecation warning for “a period” (usually a release or two, but as we can’t know how long type checker release cycles are, I’d suggest a minimum time period, maybe 6 months?). I’d also say that the PEP could state that type checkers MAY offer an option to restore the old behaviour for a period of time after the new behaviour becomes the default, but that such an option must be removed after a (further) transition period.

And yes, this will be a PITA to users who rely on is_even_int implying that x is int in the True case - but that’s precisely the choice the PEP is making, to say that such usage isn’t important enough to warrant annotation support (as opposed to the “two-way” narrowing, which the PEP is replacing it with). So being explicit about the impact is not only acceptable, it’s honestly pretty much required (IMO).
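Assuming the reading above holds, a minimal sketch of what the "simple bool" replacement might look like: the predicate no longer narrows, so the caller pairs it with an explicit `isinstance` check. The `halve` function is a hypothetical caller used only for illustration.

```python
from __future__ import annotations


def is_even_int(x: object) -> bool:
    # Plain bool: the checker learns nothing about x from the result.
    return isinstance(x, int) and x % 2 == 0


def halve(x: int | str) -> int | str:
    """Hypothetical caller: halve even ints, pass everything else through."""
    # The explicit isinstance does the narrowing that the TypeGuard used to do.
    if isinstance(x, int) and is_even_int(x):
        return x // 2
    return x
```

The cost is the redundant `isinstance` at each call site, which is the kind of impact the post argues the PEP should spell out.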

Jelle · September 20, 2023, 1:38pm

Paul Moore:

I’m not 100% clear I understand the details of the incompatibility. I assume the fundamental problem is that with

def is_even_int(x) -> TypeGuard[int]:
    return isinstance(x, int) and x % 2 == 0

if is_even_int returns False, the type checker currently can’t assume anything new about x, whereas under PEP 724 it’s allowed to assume x isn’t an int?

That’s not quite right. The incompatibility comes up in a case like this:

def some_func(x: int | str):
    if is_even_int(x):
        reveal_type(x)  # int (under both 647 and 724)
    else:
        reveal_type(x)  # int | str under 647, str under 724

pf_moore · September 20, 2023, 2:56pm

Sorry, I thought that’s what I was saying (although I worded it clumsily because I wanted to describe it in text, like the documentation would, rather than as a code example that only shows one specific case).

Regardless, isn’t the implication the same - that is_even_int is incorrectly typed under PEP 724 because it allows things that are int to return False?

I’m actually starting to think that the real answer here is that typing is now very much a first class part of the Python language and stdlib, and as such it has to accept the stability implications involved in that. I think that because the semantics are described in the stdlib documentation, and TypeGuard is part of a stdlib module (typing), it’s unreasonable to change things so that the behaviour of TypeGuard in (say) Python 3.12 fails to match the documentation of the typing module in Python 3.12.

Luckily, the stdlib docs for TypeGuard are careful not to make any statement about what happens when a type guard returns False. So PEP 724 is, in at least that sense, compatible with the current behaviour. However, the implication here is clearly that PEP 724 intends to change the documentation to modify the part that says:

Using -> TypeGuard tells the static type checker that for a given function:

The return value is a boolean.

If the return value is True, the type of its argument is the type inside TypeGuard.

If there’s no change made to these statements, PEP 647 still applies. So the question becomes, in which Python version will this documentation change be made, and how will the change be rolled out in such a way that the Python stdlib compatibility rules are followed?

I don’t have good answers here - packaging standards are deliberately documented outside of the Python core documentation so that we have control of their status and change control policies. Typing went a different route, and has a stdlib module and docs in the core. That choice, for better or worse, had consequences which are starting to become more problematic as typing becomes mainstream.

Look at it this way, I guess. Which is worse - having users raise bug reports on mypy/pyright because their is_even_int function no longer works even though the stdlib docs say it should, or having to get the SC to agree to a backward compatibility exception for PEP 724, to allow it to make a retroactive change to the TypeGuard docs, effective all the way back to Python 3.10?

Maybe the SC is still OK with letting typing do its own thing here. I don’t know their views - I’m just a Python user, who is interested in, but not a heavy user of, typing. My reluctance to use typing more than I currently do is precisely because of this sort of uncertainty about what’s stable and what could change under me with little notice. So I’m probably making a bigger meal of this than maybe I should.

Sorry if I’ve hijacked this PEP discussion to make a point of my personal peeve. Hopefully what I’ve said is useful at least in the broader context, but I’ll try to leave it at this. I think anything further I might add would just be repeating myself.

ntessore · September 20, 2023, 3:12pm

Would it not in any case make more sense to introduce a binary alternative (e.g. IsType) instead of modifying the existing TypeGuard? The few times I have implemented a TypeGuard, I have specifically depended on the existing behaviour.

Jelle · September 20, 2023, 3:26pm

Could you give concrete examples of code where you relied on the existing behavior?