PEP 724: Stricter Type Guards

I have a different viewpoint. A few people have objected to changing the behavior of TypeGuard, but I find their arguments unconvincing. The objections mostly come from those who admit to not using static type checking. This feature is targeted at users of static type checkers, and I think we should make a decision based on what’s best for the target audience. Developers who rarely use static typing don’t appreciate how frequently bug fixes and improvements in type checkers result in “breaking changes”. Users of type checkers generally accept and appreciate improvements and readily adjust their code bases accordingly.

I don’t think we should be cavalier about making changes to the typing standard, but I also don’t think we should use the same backward compatibility standards as runtime behavior. The current PEP makes a well-reasoned argument for making this change.

My opinion is that any pain incurred by a change in TypeGuard behavior will be minor (affecting very few users) compared to the significant pain and confusion that will result from adding another construct like StrictTypeGuard. Let’s take that into consideration when making this decision.

Type checkers can mitigate any minor backward compatibility issues by offering a configuration switch that preserves the older behavior. This will give developers an opportunity to choose when they want to make any potential changes to their code bases if required. In the vast majority of cases, developers will see no difference.

22 Likes

As a user of type checking I completely agree with this argument.

Backwards compatibility is not a goal in itself. As I understand it, backwards compatibility is meant to allow users to easily upgrade to a new version without too many breaking changes. It’s about setting expectations and providing ease of use for users.

As long as the name isn’t changed (and therefore there is no real runtime change), the only upgrade path to consider is that of upgrading your type checker. When upgrading a type checker, a user should already expect breaking changes. That’s the status quo, and I’m glad it is, because it means we constantly get better type checkers that warn about issues they previously didn’t see. I don’t see how this PEP changes that status quo.

If we want to start enforcing the same backwards compatibility policy to the typing behaviour as to the runtime behaviour somebody should first check whether this is really what users want. From my personal experience and talking with colleagues I have not seen anybody that would think that is better.

Of course it is annoying that a type checker upgrade is a little bit more involved, but that’s just a side effect of getting all the improvements of an upgrade.

5 Likes

I’m going to pick at this a little, although I agree with almost all of the surrounding context.

This is not what type checker changes look like for libraries. For a library, you basically put your annotations out into the world and cross your fingers – you don’t control the type checker version or flags at all.

This echoes a lot of what’s in this thread – not a ton of super strong objections to the PEP change itself, but a lot of subthreads pushing people to broaden their perception of impact when talking about typing changes.

I think several of us jumped in on the naming enthusiastically in part because Jelle as the PEP sponsor seemed open to it. But, also worth noting, he didn’t suggest going with StrictTypeGuard. I’ll presume, though I don’t want to put words in his mouth, that he sees the same kind of future confusion arising from that name that you do. I don’t or didn’t see it as so bad, but having another type checker maintainer saying that it’s a major issue is rapidly convincing me that I’ve underrated how many users are likely to be confused.

Mehdi earlier suggested adding LaxTypeGuard for the current behavior and changing the meaning of TypeGuard. Perhaps that didn’t get enough time to be discussed as a possibility. It solves one of the issues which I think are central here: If you have code which relies on the current TypeGuard behavior including the false branch subtlety, what should you do?

Discussion of that “user story” – someone who actually wants the current behavior – has been interwoven with other discussion of backwards compatibility. Maybe it’s considered too niche to be worth addressing? But there should be some, at least cursory, consideration before making that judgement.

That’s fine, but there’s been a lot of pushback this time because it’s hard to tell what the standard or contract here is, and this touches something already documented in a “more stable” place (stdlib). It’s not so much bending the rules as raising the question of what the rules are.

2 Likes

Fair enough, but the proposed changes to type guard are stricter than the old type guard—both in the positive and the negative case. So a user who refuses to upgrade their type checker would simply not get the benefits of the new type guard, but shouldn’t typically see new errors.

1 Like

FWIW, I’d like for TypeGuard to change its behaviour, it’s not unprecedented territory for a PEP to change stuff previously defined in a different PEP, and TypeGuard is new enough that the churn should be minimal. IMO right now most people using TypeGuard are power users who would be fine with updating any annotations if needed.

1 Like

The draft PEP doesn’t propose to change all TypeGuard behavior from “lax narrowing behavior” to “strict narrowing behavior”. It proposes to apply strict narrowing behavior when it’s safe to do so, and it spells out the conditions under which strict behavior can be safely applied. If this isn’t clear from the current PEP, let us know and we’ll work on clarifying this point.

I also want to underscore the fact that the proposed change has zero impact at runtime. It affects only type checking behavior. Changes that affect runtime behavior should have a much higher bar. We should not apply this same bar to features of the type system.

@sirosen, you said “This is not what type checker changes look like for libraries.” For context, are you a maintainer of one or more libraries? If so, do you use TypeGuard in your library interfaces (or are you familiar with any libraries that do)? If so, would the semantic meaning change under the draft PEP in a way that would negatively impact the consumers of your library? I’m just trying to get a sense for whether your concern is purely theoretical or if you have a more grounded reason for concern. I contend that the number of published libraries whose public interfaces would be affected by this change is zero. If we can find a single counterexample, it would be useful to know about it. I’d be very surprised if we could.

The compatibility impact of the draft PEP for users of type checkers will be less than many bug fixes I make every week in pyright. As an example, here’s a bug report logged today that involves the isinstance type narrowing behavior in pyright. The fix for this bug will likely have more compatibility impact to pyright users than the proposed change in this PEP. Changes of this sort happen very frequently in type checkers.

Likewise, the mypy bug database currently has >1300 active bugs, and many of these, when fixed, will have a greater impact on backward compatibility than the change proposed in this PEP. Some of these bug fixes will affect library interfaces as well.

To be clear, I’m not expressing a concern about the name of an alternate form of TypeGuard. My concern is about the addition of a second form of TypeGuard.

For comparison purposes, let’s consider two different TypeGuard proposals and how they might roll out over time.

Proposal 1 is described in the current draft PEP, where the existing TypeGuard form becomes “smarter” and applies strict narrowing behaviors when it’s safe to do so.

  • Jan 2024: PEP is ratified
  • Jan 2024: A new version of pyright is released that defaults to the new behavior. No changes are required in typeshed stubs. No changes are required in any published libraries. A small percentage of pyright users notice minor differences in their type checking results, and they either modify their code or temporarily make use of a configuration option in pyright that retains the old behavior of TypeGuard.
  • July 2024: A new version of mypy incorporates the changes. Mypy users now enjoy the new functionality. Some small percentage of mypy users notice minor differences and either modify their code or use a backward-compatibility switch.
  • Long term, there is only one TypeGuard form, and it applies strict narrowing behaviors when it’s safe to do so and “lax” narrowing behaviors when it’s not.

Proposal 2 introduces a new TypeGuard form with strict narrowing behavior, and the current TypeGuard behavior remains unchanged.

  • Jan 2024: PEP is ratified
  • Jan 2024: A new version of pyright is released that includes support for the new form. It doesn’t provide any benefit to users of the existing TypeGuard form (whether they use it directly in their code or via type guard functions exported by typeshed or other libraries).
  • Mar 2024: Pyright users learn about the new form and start to complain to maintainers of typeshed and other libraries and request updates to existing stubs to incorporate the new form. They are told “we can’t do that until all major type checkers support it”.
  • July 2024: A new version of mypy incorporates the changes, but mypy users don’t gain any benefit from it unless they know about the new form and switch to it.
  • Aug 2024: Mypy users start to complain to maintainers of typeshed and other libraries. They’re likewise told “we can’t incorporate the new form until all major type checkers support it; we need to wait for pyre and pytype”.
  • Jan 2025: Pyre and pytype release new versions that support the new form.
  • Jan 2025: Typeshed updates the stdlib stubs to use the new form, and users start to benefit from the new functionality.
  • Feb 2025: A new version of pyright is introduced that incorporates the updated typeshed stubs. A small percentage of pyright users notice minor differences in their type checking results and adjust their code accordingly.
  • Apr 2025: A new version of mypy is introduced that incorporates the updated typeshed stubs. A small percentage of mypy users notice minor differences in their type checking results and adjust their code accordingly.
  • Throughout 2025 and into 2026: Other library authors who use the old TypeGuard form in their libraries release updates that use the new form. As these trickle in, some users see minor differences in their type checking results and adjust their code accordingly.
  • Long term, both forms of TypeGuard remain, and users need to be educated about the difference between the two.

Proposal 2 creates more pain and inconvenience for everyone involved, it delays the benefits provided by this PEP for one to two years, and it creates confusion in the long term. I don’t see how proposal 2 can be the preferable answer here.

12 Likes

I do maintain a handful of libraries. Most are public pypi things, and a few are internal things at my workplace.

None of them use TypeGuard in their interfaces; only internally. And I’m not aware of any libraries which use TypeGuard in that way. So my concern is theoretical – I cannot point to an example.

Researching my own codebases and the codebases we have at work, including applications, I found things which would be unaffected by this PEP or which are already of questionable validity.

I’m not sure that leads me to conclude that there are zero cases or even “likely zero”. If the general feeling is that the burden of proof is on me to find an example, that’s fine. I’ll try to set aside some time to do some research.

I’ll also need to look at the PEP again to try to understand the point you’re making about only applying strict narrowing in specific cases. I’m not sure I understood it.

“This is not what type checker changes look like for libraries. For a library, you basically put your annotations out into the world and cross your fingers – you don’t control the type checker version or flags at all.”

This is true of course and something to consider indeed. I also maintain some libraries and can see how this could lead to problems where a “downstream” package uses a type checker that can’t recognise the concepts/behaviour as used by an “upstream” dependency of it.

However, to me this argument is of slightly less importance, as the type checker of the “downstream” package already isn’t perfect: as stated many times before, the current type checkers all have their flaws and incompatibilities. Incompatibility between the type checker (and thus the typing) of package A and package B is already a possibility. This PEP does not change the status quo.

I understand that people want “A more useful and less divisive future for typing?”, but I don’t think we should arbitrarily apply a stricter interpretation of breaking changes to one particular PEP. It’s clear that Proposal 1 as mentioned by Eric is much better for the future users of Python and we shouldn’t let some theoretical discussion with very few actual real world issues prevent that future from materialising.

5 Likes

Since this has come up a few times in this thread, here’s an example of a large project (100k lines of Python, though only partly type annotated) upgrading their type checker. In this case from MyPy 1.5.1 to 1.6 (about 2 months apart).

Three changes were necessary:

  • Updating the CI,
  • Adding one type annotation that was previously ambiguous,
  • Removing one type: ignore that is no longer necessary.

There are also multiple examples where a similar bump took more time. I faced this recently at work.

I think you could even make the argument that the fact that many projects pin their type checker version explicitly shows that they expect breaking changes to occur between versions. Of course there are other reasons to do this, but this is certainly a consideration many projects make from my experience.

“Three changes were necessary”

That seems an entirely acceptable level of change.

My concern is that the proposed change would cause a lot more changes for those people that use TypeGuard with its current specification.

That’s been measured and included in the PEP. We can estimate the amount of change caused by a type checker change through the mypy/pyright primer, which compares the type checker messages produced on master against those produced on the PR. The number of errors reported or changed in the primer run for this PEP is smaller than for several regular PRs that type checkers merge today as normal bug fixes or behavior changes without a PEP. I can think of occasional intentional type checker behavior changes, not covered by any PEP, that created dozens of error messages.

It’s even possible to check how many errors this PEP would produce on your codebase today. Pyright already implements it under a flag that’s false by default (it would become true after the PEP). You can test it now by setting "enableExperimentalFeatures": true in your pyright configuration. Testing it right now on a large codebase I work with, most of the new errors are Unnecessary "# type: ignore" comment, in places where the behavior I want is what this PEP specifies and I have been working around or ignoring the current behavior. Checking one largeish typed codebase I work with that uses TypeGuard a fair amount, for ~30k lines of code the error count changed by 9, of which 6 are Unnecessary "# type: ignore" comment, a desirable change. One of them looks like a minor bug report, and the last 2 are me doing something weird where I think the new error is correct. So 8 fixes, 1 small bug, and 0 false positive errors.

3 Likes

I’m concerned that using a tool like MyPy Primer to measure how big a change this is might be misleading.

The primer is a good tool for detecting the impact of a change that makes typing rules stricter, because it asks how much code that type checks today doesn’t type check.

But PEP 724 actually makes typing rules less strict, which means that code that shouldn’t type check today starts type checking. So the problems it causes will show up later, if people are using TypeGuards that are unsound under the current semantics. In most cases MyPy Primer wouldn’t catch the problem.

Let me explain this in terms of a little example

Consider this code:

from typing import TypeGuard


def is_even_int(x: int | str) -> TypeGuard[int]:
    # True only for even ints, so a False result does not imply "x is a str".
    return isinstance(x, int) and x % 2 == 0


def boom_if_not_int(x: int) -> None:
    # something bad, for argument's sake maybe corrupt my database
    ...


def end_user_function(x: int | str) -> None:
    if is_even_int(x):
        boom_if_not_int(x)  # this is safe today
    else:
        # I can't do this, or the type checker complains.
        # As a result, the type checker is helping me not break my database!
        # boom_if_not_int(x)
        print(x)

Why will mypy primer not complain on this?

This code type checks just fine today:

  • the first block refines x to an int and everything is as intended
  • the type checker verifies that int | str is a subtype of object in the second branch. Yay!

After PEP 724, it still type checks just fine:

  • the first block is exactly the same
  • in the second block, the type checker verifies that str is a subtype of object

Importantly, this is true for every example. PEP 724 will introduce more refinement, meaning the type of x will be “smaller” (a subtype of more things), and therefore any code that typechecks before will type check after.

So any changes in MyPy Primer pretty much have to be unusual edge cases. The only case I can think of off the top of my head is when the guarded type isn’t a subtype of the preexisting type:

  • for example I could have guarded on int | bytes, which would cause a “jump” if the type checker doesn’t try to intersect the guarded type with the preexisting type since int | bytes isn’t a subtype of int | str
  • using this kind of type guard is legal, but I would guess not super common

Does that mean my example is safe under the change?

No, shipping PEP 724 could be very dangerous to this code.

The type guard is semantically incorrect under 724. But the type checker has no way of knowing this, so it will go ahead and refine the type of x to str.

This type checks just fine. So far so good, I don’t introduce churn by bumping type checker versions.

But what happens if I uncomment the boom_if_not_int(x) in the else branch? It still type checks just fine, but I’m going to corrupt my database!

Since the whole reason I wanted typing was to avoid this kind of bug, I’m going to be an unhappy user.

Conclusion

Recapping my intro: PEP 724 is making type checkers less strict, not more strict. This is confusing because the title says “stricter type guards”, but it is the guards themselves that become stricter, and they are contravariant in the guarded type, so the type checks become less strict, not more strict.

But the MyPy Primer is only going to do a good job at measuring issues caused by changes that make type checkers stricter. It will tend to show very few problems when we make type checkers less strict.

The problem is not helped by pinning type checker versions, because you aren’t going to notice the problem (that your type guards are now unsound) when you bump versions, you’ll notice it later when you ship a bug that the type checker should have caught.

To be clear, I’m not saying we shouldn’t accept this PEP (I don’t have a firm opinion at this point), I just think that MyPy Primer results aren’t going to give us a clear picture.

There’s at least a case to be made that in this scenario we should always make a new construct, like TypePredicate[X], and then gradually deprecate TypeGuard[X].

1 Like

After a careful reading, I think the PEP authors actually never claimed that existing TypeGuards that are unsound today would be caught by the MyPy Primer run; the two sections in the PEP are separate.

I like this idea (other than questions of backward compatibility) and the writing is very clear.

I just wanted to call out explicitly in this discussion thread that if we’re worried about breaking code that makes use of preexisting TypeGuards which are unsound under the new spec, the Primer results aren’t going to tell us anything.

Agreed. I find myself increasingly resenting forced code refactoring, often to a more verbose and not necessarily more explicit style, just to please mypy. Yes, I can add type: ignores, but there’s a threshold where the existence of those starts to look like a code smell when the code wouldn’t look that way without them.

1 Like

I thought I’d try and assess how much of a problem the backward incompatibility is in practice. As stroxler mentions, mypy_primer is limited in its ability to assess whether switching to PEP 724 creates soundness holes. With that in mind, I went through the first ten projects in Code search results · GitHub

(bonus, since it was mentioned in this thread)

Summary:

I was hoping for clarity, but I didn’t really get it. I did find real-world type guard functions where 724 semantics introduces potential for unsoundness where there previously wasn’t any. At the same time, none of these cases seem like particularly bad breakage: they weren’t public APIs and, from what I could tell, didn’t seem to be used in ways where the holes opened up really mattered (although there’s survivorship bias when looking for buggy code).

4 Likes

The Typing Council was unable to come to unanimous agreement on PEP 724. The TC members are in agreement that the “strict semantics” described in PEP 724 are useful and address feedback from developers about the original TypeGuard (introduced in PEP 647). There was disagreement about whether to modify the existing TypeGuard semantics (as proposed in PEP 724) or introduce a new special form that offers different semantics from the original TypeGuard. The majority of TC members are concerned about the backward compatibility implications of modifying the existing TypeGuard semantics and therefore recommend an approach that differs from the current PEP’s proposal. The authors of the PEP have consequently decided to withdraw PEP 724. This opens the door for alternative proposals to be put forth.

1 Like

I posted a draft of an alternative proposal, which will be PEP 742, at PEP 742: TypeNarrower by JelleZijlstra · Pull Request #3649 · python/peps · GitHub. A rendered draft is available at PEP 742 – Narrowing types with TypeNarrower | peps.python.org.

This proposal implements the suggestion that came up in this thread, which is to create a new special form with the “strict” semantics specified in PEP 724. I chose the name “TypeNarrower”, but I’m open to using a different name if a better contender comes up.

Early feedback on the draft is appreciated. Within the next few days, I’ll merge the PR and create a dedicated new discussion thread.

I am proposing this PEP by myself, not on behalf of the rest of the Typing Council.

5 Likes

I like the new name versus the other previously proposed alternatives as it is fairly symmetrical with the existing TypeGuard. At the same time, I do reiterate that this divergence will now make the terminology of Python inconsistent with TypeScript. Users who come into Python expecting TypeGuard to behave like TypeScript type guards will perhaps be surprised to find out that TypeNarrower has the behavior they actually want. But if this is what will get the necessary consensus for a highly-useful feature, so be it.

1 Like

For what it’s worth, the TypeScript documentation currently uses the term “type predicate” for these (https://www.typescriptlang.org/docs/handbook/2/narrowing.html#using-type-predicates).

Edit: But they also use “type guard”. I think the function as a whole is supposed to be a type guard, and the expression in the return annotation is a type predicate.