PEP 724: Stricter Type Guards

PEP 724 - Stricter Type Guards was introduced to try to make TypeGuard narrowing align more closely with how isinstance behaves.

Feedback appreciated

9 Likes

In the interest of a productive discussion, are there any areas of this PEP that you would highlight as having some semblance of controversy or ambiguity? Everything I have read to date across the PEP PR, typing repo discussions, etc. has reflected fairly broad consensus.

Can you provide links, or ideally a summary, of previous discussions? Are they on Discourse or the mailing list and I missed them?

The PEP has a link to previous discussions, but here are some links:

Introduction of the idea:
Mailman 3 Type narrowing for TypeGuard in the negative case - Typing-sig - python.org

More discussion:

Thanks. It all seems a bit abstract to me. I don't have any objections as such (my use of typing isn't even close to this level of complexity), but the lack of any real-world examples of how this would be useful in actual code bothers me a little. This isn't specific to this PEP; it seems like a general tone I've encountered in typing discussions - they focus a lot more on theory and abstract examples than on "what real-world usages will this enable?"

The example code given in the github issue (gym/spaces/box.py) doesn't even have type annotations, so I'm confused as to why it's relevant. Surely a PEP like this should be making existing, somewhat clumsy and/or unnecessarily broad annotations a little more accurate, not making the difference between having annotations or not annotating types at all?

2 Likes

We're meeting today actually for a typing sig meetup. I hope there's a place to put the slides? I'll have to ask around.

My guess is the most controversial part of the PEP is that we're breaking backwards compatibility for TypeGuard.

Specifically a case like this:

def is_int(val: int | str) -> TypeGuard[int]:
    return isinstance(val, int)

def func(val: int | str):
    if is_int(val):
        reveal_type(val)  # "int"
    else:
        reveal_type(val)  # Previously "int | str", now "str"
1 Like

I think the goal of the PEP is to make writing TypeGuards (and using them) less ambiguous.

Previously, a TypeGuard only described what happened when it returned True. It was unclear what happened in the False case:

def is_int(val: int | str) -> TypeGuard[int]:
    return isinstance(val, int)

def func(val: int | str):
    if is_int(val):
        reveal_type(val)  # "int"
    else:
        reveal_type(val)  # What would you expect here?

I work on Pylance/Pyright, and we get a lot of bugs where people expect 'str' for the else case above. This PEP hopefully makes TypeGuards match people's expectations.

I think that was the original problem that the user had in that gym/spaces/box.py example. They wanted to add typing to it but found using the TypeGuards in numpy confusing.

3 Likes

Really happy to see this!! The narrowing-on-True (limitation 2) behavior in this PEP is something I've wanted for a while. It may be worth linking that long and thoughtful discussion in the PEP?

1 Like

Sounds like a good idea. I'll submit a change to put it into the post history.

It may also be worth motivating this PEP by pointing out that it will solve this issue with dataclasses.is_dataclass. Long story short, for those who don't want to read through the whole thread: is_dataclass has this annotation:

@overload
def is_dataclass(obj: DataclassInstance | type[DataclassInstance]) -> Literal[True]: ...
@overload
def is_dataclass(obj: type) -> TypeGuard[type[DataclassInstance]]: ...
@overload
def is_dataclass(obj: object) -> TypeGuard[DataclassInstance | type[DataclassInstance]]: ...

The first overload is explained here.

With the narrowing behavior, we could eliminate the first overload (as Eric Traut originally recommended here) and type checkers could do the appropriate intersection between DataclassInstance and X's type.
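As a runtime illustration of what those overloads describe (the class/instance distinctions matter only to the type checker; the hypothetical Point class is just for demonstration):

```python
from dataclasses import dataclass, is_dataclass

@dataclass
class Point:
    x: int
    y: int

# is_dataclass accepts both classes and instances, which is why the
# stub needs separate overloads; per the thread above, the
# Literal[True] overload could be dropped under PEP 724's narrowing.
assert is_dataclass(Point)        # the class itself
assert is_dataclass(Point(1, 2))  # an instance
assert not is_dataclass(42)       # an unrelated object
```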

2 Likes

There are a couple of concrete examples linked from the Github issue:

Another example I have mentioned on the preceding threads was annotating pandas.isna and pandas.notna as a TypeGuard, both of which should be able to narrow in the negative case.

Finally, there is another discussion I don't think is linked anywhere, and that is this one: Type narrowing for TypeGuard in the negative case · python/typing · Discussion #1013 · GitHub

1 Like

I would expect the type to be str. And if that's not the case, my immediate reaction is that TypeGuard is just not doing its job. It shouldn't need a new PEP to fix this, and in all honesty, I struggle to understand how the existing definition of TypeGuard could imply anything else.

I guess the idea is that the function is_int could actually return False for some integers, and that would still conform to the definition of TypeGuard. But that seems like a pretty bad definition of a function called is_int.

I guess I find all of this incredibly theoretical. That's something I've been frustrated with over typing discussions in the past, and I think now that typing is becoming more mainstream, it's something that needs to be dealt with.

In an attempt to be more constructive, I've tried to read through PEP 724. Please understand my perspective here. I have never used type guards myself, but I'm concerned that I may encounter them in code I support, or am asked to review, and I need to understand them in order to do a proper job. More broadly, I want to understand typing so that I can decide how and when to use it in my code - in the same way that I know async, or XML, to a level that lets me understand its applicability and how to use it if I need to.

The abstract of PEP 724 makes sense to me. I don't know much about TypeGuard, so I read the Python documentation to get an overview of it. I can see why it has the potential to be useful, although I'd rather hope that it's only needed in rare cases when type checkers can't work out the types for themselves. My position here is that if I have to do a lot of work to help the type checker, that's a bad trade, because I end up just as likely to introduce bugs in complex type definitions as I am to introduce bugs in the code I'm trying to protect with the type checks. I'd be more likely to leave type annotations over-broad than use complicated constructs I don't fully understand.

The motivation section is more of a struggle. I got sidetracked into reading the linked PEP 647, and that has its own issues for me. It mentions "Limitation 1" of PEP 724 (the one about returning False) but doesn't, as far as I can tell, justify not requiring "both positive and negative tests" in the same way that builtin tests do. So I'm left not understanding why it needs a new PEP to correct that. And as far as I can tell, "Limitation 2" is precisely the rejected item "Conditionally Applying TypeGuard Type".

From PEP 724:

PEP 647 imposed these limitations so it could support use cases where the return TypeGuard type was not a subtype of the input type. Refer to PEP 647 for examples.

Maybe I don't understand well enough, but I couldn't clearly establish what those examples were or how they justified the limitations - but equally, I couldn't see anything in PEP 724 that explained why those examples were no longer compelling.

Moving on to the specification section, this makes sense to me, with the one exception that the term "is consistent" is only explained via a link to yet another PEP that involves lots of words and technicalities, but which is frankly pretty impenetrable to an "interested bystander" like me. I can go with an intuitive idea of what it means, but in doing that I lose any sense of why all of this isn't "obvious" and even needs a PEP to define. But overall, I'm fine with the specification. It's not your fault that there isn't a better definition of "is consistent" that you could link to.

The backwards compatibility section is interesting. I was particularly struck by the statement "Type checkers often improve narrowing logic or fix existing bugs in such logic, so users of static typing will be used to this type of behavioral change". To me that says "type checkers don't really have good backward compatibility behaviour anyway, so it's not important that this proposal does either". Yes, I'm being unkind here, but I think we need to consider when we can start applying the normal criteria for backward compatibility to typing proposals. As more Python users become interested in typing, and therefore start to take more of an interest in typing proposals and PEPs, a message of "normal backward compatibility rules don't apply to us" isn't really going to be acceptable for much longer. The testing done for this PEP actually demonstrates a much better compatibility in this case than the above statement suggests, so the problem may be mainly one of attitude and presentation rather than actual compatibility, but it's something that probably should be addressed by the typing community in general.

On a positive note, "How to teach this" is awesome - "we've made it work like everyone expected anyway" is a great message. The only sour note is the implied question "why didn't you get it right the first time?"

Actually, thinking further, I'd like it if "how to teach this" made some comment about how the Python stdlib documentation would be updated to reflect the new semantics. One frustration I have with typing is that it's often hard to find out where to look for information - being explicit in new PEPs about "where would users find this information" would be a great improvement.

Overall, I think the PEP is a worthwhile improvement to the behaviour of TypeGuard. I don't have any objection to it on a technical level. But I do think there are a number of presentational issues and a bunch of assumed knowledge that make the PEP unnecessarily inaccessible to non-specialists[1]. I think you'd get better (and broader) feedback if those were addressed. But you may think it's more work than is needed for a relatively small PEP like this - that's your call.

Sorry - I suspect this isn't exactly what you would have wanted in terms of feedback on the PEP. There's a lot of general stuff that's only peripherally related to the specific proposal here. But I hope it's useful in the broader sense of improving typing discussions in general, and making them more accessible to the average Python developer.


  1. That's likely common to many typing PEPs; I'm not trying to single this PEP out as particularly bad in this regard.

3 Likes

I think the backwards compatibility contract for typing is nuanced and complex.

Typing PEPs are an incomplete specification for how a type checker should behave. There are gaps and undefined cases, sometimes rather fundamental ones, like how to infer the type of a variable in relatively simple cases. So various implementations have differing behaviors.

The relationship of typing to mypy is very much like packaging to setuptools (or maybe pip, but I think setuptools is more apt because there are mainstream alternatives). If a packaging PEP defines previously undefined behavior, such that it conflicts with current setuptools behavior, is it backwards incompatible?
I ask that question genuinely. My understanding of the situation is that it depends on how impactful the change is in practice. Whatever the answer, the typing disposition should probably match the packaging one, IMO.

This PEP closes a gap by specifying previously unspecified behavior, if I've understood correctly.

As for how the behavior might matter, consider this (currently good) guard:

def is_even_int(x) -> TypeGuard[int]:
    return isinstance(x, int) and x % 2 == 0

Which leads to my major question about this PEP. Is there some way to be confident that guards like the above are very rare? To me, it looks like a very reasonable guard to write. But if narrowing occurs on the False branch, it's wrong on (counts on fingers) close to half of all integers. :wink:

1 Like

Yes, 100%. And I say that as packaging PEP delegate.

It's fine to break backwards compatibility, but an acknowledgement that this is what you are doing, a transition plan, and a clear explanation of how to handle legacy data are essential. For packaging proposals at least, and I agree that typing PEPs should probably follow a similar model.

3 Likes

This is not correct; in fact it changes the behavior for an area that is already specified precisely (in PEP 647), but where we've convinced ourselves that the specified behavior is not useful.

This is also my major worry with the PEP. What we have done is use the mypy_primer tool (GitHub: hauntsaninja/mypy_primer), which runs mypy and pyright over millions of lines of code, to look for code in selected open-source projects that would change in behavior with the PEP's new semantics. (This is mentioned in the PEP.) This check found no cases where the PEP caused new errors.

Of course, that's just some projects, and there is a lot more Python code out there. I would welcome suggestions for how to better handle the incompatible change.

The PEP does acknowledge that it breaks compatibility. It could perhaps do more to set out a transition plan, but that's largely going to be up to individual type checkers. For the most part, users will simply see a slightly different type inference behavior when they upgrade their type checker, which is a fairly routine effect of upgrading your type checker.

4 Likes

Drawing from my experience with packaging, that doesn't mean the PEP can't describe that transition. We have a behaviour A in PEP 647, and a behaviour B in PEP 724. Type checkers must implement behaviour A currently in order to be compliant. They are now going to have to implement behaviour B instead. While we could say "get there however you please", that's not typically how PEPs (packaging PEPs in particular) have worked - the route to take (including things like how to warn the user of the behaviour change, how long the transition should be, etc.) should be part of the PEP, as it's a big part of the whole experience for users.

I'm not 100% clear I understand the details of the incompatibility. I assume the fundamental problem is that with

def is_even_int(x) -> TypeGuard[int]:
    return isinstance(x, int) and x % 2 == 0

if is_even_int returns False, the type checker currently can't assume anything new about x, whereas under PEP 724 it's allowed to assume x isn't an int?

If that is the issue, then this use of TypeGuard is now wrong, and will have to be replaced by a simple bool. Is that correct? If so, then I think the normal process would be to require this type of usage to trigger a deprecation warning for "a period" (usually a release or two, but as we can't know how long type checker release cycles are, I'd suggest a minimum time period, maybe 6 months?). I'd also say that the PEP could state that type checkers MAY offer an option to restore the old behaviour for a period of time after the new behaviour becomes the default, but that such an option must be removed after a (further) transition period.

And yes, this will be a PITA to users who rely on is_even_int implying that x is int in the True case - but that's precisely the choice the PEP is making, to say that such usage isn't important enough to warrant annotation support (as opposed to the "two-way" narrowing, which the PEP is replacing it with). So being explicit about the impact is not only acceptable, it's honestly pretty much required (IMO).

That's not quite right. The incompatibility comes up in a case like this:

def some_func(x: int | str):
    if is_even_int(x):
        reveal_type(x)  # "int" under both 647 and 724
    else:
        reveal_type(x)  # "int | str" under 647, "str" under 724
1 Like

Sorry, I thought that's what I was saying (although I worded it clumsily because I wanted to describe it in text, like the documentation would, rather than as a code example that only shows one specific case).

Regardless, isn't the implication the same - that is_even_int is incorrectly typed under PEP 724, because it can return False for values that are int?

I'm actually starting to think that the real answer here is that typing is now very much a first-class part of the Python language and stdlib, and as such it has to accept the stability implications involved in that. I think that because the semantics are described in the stdlib documentation, and TypeGuard is part of a stdlib module (typing), it's unreasonable to change things so that the behaviour of TypeGuard in (say) Python 3.12 fails to match the documentation of the typing module in Python 3.12.

Luckily, the stdlib docs for TypeGuard are careful not to make any statement about what happens when a type guard returns False. So PEP 724 is, in at least that sense, compatible with the current behaviour. However, the implication here is clearly that PEP 724 intends to change the documentation to modify the part that says:

Using -> TypeGuard tells the static type checker that for a given function:

  1. The return value is a boolean.
  2. If the return value is True, the type of its argument is the type inside TypeGuard.

If there's no change made to these statements, PEP 647 still applies. So the question becomes, in which Python version will this documentation change be made, and how will the change be rolled out in such a way that the Python stdlib compatibility rules are followed?

I don't have good answers here - packaging standards are deliberately documented outside of the Python core documentation so that we have control of their status and change control policies. Typing went a different route, and has a stdlib module and docs in the core. That choice, for better or worse, had consequences which are starting to become more problematic as typing becomes mainstream.

Look at it this way, I guess. Which is worse - having users raise bug reports on mypy/pyright because their is_even_int function no longer works even though the stdlib docs say it should, or having to get the SC to agree to a backward compatibility exception for PEP 724, to allow it to make a retroactive change to the TypeGuard docs, effective all the way back to Python 3.10?

Maybe the SC is still OK with letting typing do its own thing here. I don't know their views - I'm just a Python user, who is interested in, but not a heavy user of, typing. My reluctance to use typing more than I currently do is precisely because of this sort of uncertainty about what's stable and what could change under me with little notice. So I'm probably making a bigger meal of this than maybe I should.

Sorry if I've hijacked this PEP discussion to make a point about my personal peeve. Hopefully what I've said is useful at least in the broader context, but I'll try to leave it at this. I think anything further I might add would just be repeating myself.

3 Likes

Would it not in any case make more sense to introduce a binary alternative (e.g. IsType) instead of modifying the existing TypeGuard? The few times I have implemented a TypeGuard, I have specifically depended on the existing behaviour.

Could you give concrete examples of code where you relied on the existing behavior?