PEP 742: Narrowing types with TypeIs

:100:

I think it’s highly desirable to name this after existing runtime functionality to that effect;

def is_str(x: object) -> IsInstance[str]:
    return isinstance(x, str)

is much more self-explanatory, and reuses existing naming/concepts that people are familiar with.

Also, this would leave the door open – as previously for typing.List -> list, typing.Tuple -> tuple, etc. – to eventually teach Python to do the typing-side of things on top of the regular syntax, i.e. enable writing

# in the future?
def f(x: str | int) -> None:
    if isinstance(x, str):  # inferred to be the same as `is_str` above
        assert_type(x, str)
    else:
        assert_type(x, int)
4 Likes

I have a question about the semantics.

  • type NP should be narrowed to A∧R
  • type NN should be narrowed to A∧R

This makes sense to me. However, then the logic of reducing (Awaitable[int] | int) ∧ Awaitable[Any] to Awaitable[int] is unsound.

It is pefectly fine for a class to inherit int and be Awaitable[Any] at the same time, without being Awaitable[int]

Now the PEP does specify that the behaviour should match that of isinstance and it seems that most common type checkers behave the same with isinstance(x, Awaitable), but it does feel weird to me to have an example in the PEP that is technically unsound.

2 Likes

In this case it doesn’t matter, because Awaitable[Any] will already be accepted for Awaitable[int], so you get the same amount of unsoundness either way. If you want to proof unsoundness you will have to provide an example that doesn’t use Any and is still unsound. Of course you can construct unsound examples in the way the PEP describes (i.e. only return True for a subset of the possible values for that type instead of the entire value space)

Good point. I posted a PR to the PEP to clarify that type checkers may infer the more precise intersection type:

When we write the conformance test suite, we should do it in a careful way that doesn’t close the door to type checkers implementing a more precise interpretation of the intersection type.

I just asked the Typing Council for a decision. Perhaps the biggest sticking point remains the name, but none of the other proposed names seem clearly better to me, and @carljm’s comment approving of TypeIs is still the most liked comment in the thread, so I’m keeping TypeIs. However, the Typing Council can recommend an alternative name instead.

1 Like

If TypeIs is chosen as the eventual name, would it be possible to specify (broadly) in the PEP why the typing council felt that concerns about TypeIs as a name are minor? This thread indicated objections to TypeCheck and TypeIs due to their ambiguity with other phrases/vocabulary and idioms commonly used in Python, and neither of the objections are resolved; in contrast, I don’t see any objections against TypeNarrow[er] or IsInstance, which are also both popular choices.

I think it’s especially worthy to emphasise the result of choosing the name TypeIs to compare between the two type-narrowing guards we’ll have. The following will be idiomatic, minimal examples of type-narrowing functions, where the guard unambiguously performs the correct runtime checks and fulfills the appropriate expectations:

from typing import TypeGuard, TypeIs

def is_x_an_integer_1(x: int | str) -> TypeGuard[int]:
    return type(x) is int

def is_x_an_integer_2(x: int | str) -> TypeIs[int]:
    return isinstance(x, int)

class Int(int): ...

x: int | str = Int()
if not is_x_an_integer_1(x):
    reveal_type(x)  # int | str
if not is_x_an_integer_2(x):
    reveal_type(x)  # str

We can put this example in the context of the PEP’s rationale, where it is stated that

We believe that users are more likely to want the behavior of TypeIs, the new form proposed in this PEP, and therefore we recommend that documentation emphasize TypeIs over TypeGuard as a more commonly applicable tool.

Note that this mirrors the runtime advice for choosing isinstance(x, ...) over type(x) is ... as a more commonly applicable way to check whether the type of an object is a .... So, why are users more likely to want TypeIs over TypeGuard? Confusingly, it is because users are more likely to want the analogous type-narrowing behaviour of isinstance(x, int) over type(x) is int.

EDIT: isinstance vs type(...) is is a rather popular point of confusion at runtime (see e.g. What's the canonical way to check for type in Python? - Stack Overflow) - clarifying the static typing analogy of them correct would be great.

6 Likes

As one of the people that clicked like on that comment, I don’t like that being interpreted as a vote for TypeIs.

I clicked like on that comment because I liked the reasoning addressing the previous objections.

But then other good objections were also raised (and that does nothing to negate the fact that carljm’s reasoning was good).

as @xmw pointed out, it’s more important to look at what objections there are and whether they have been addressed.

2 Likes

I do object to TypeNarrow(er) – I think it is too awkward and jargon-y, and doesn’t clearly look “simpler” than TypeGuard.

I don’t find the objections raised against TypeIs strongly persuasive. In my experience, it is very common when discussing Python types to say something like “the type of x is int,” where the obvious meaning is the same as x: int, that is “x is an instance of int or a subtype.” I’ve never seen this wording lead someone to believe that the type of x must be exactly int; a concept that isn’t even representable in Python type annotations.

But if the name question is reopened, I also don’t have any objection to IsInstance. There is a nice parallel there. I think the annotation of isinstance in typeshed could literally become something like def isinstance[T](obj: object, typ: T) -> IsInstance[T]: ...

5 Likes

One I hadn’t seen suggested before (but alludes Guido’s preference for TypeCheck) — is CheckType

it’s short… a verb… and hints that it is used with functions that return booleans at runtime.
(IsInstance and TypeIs are both good too)

I noticed this when glancing over PEP 742 (emphasis mine):

When a TypeGuard function returns True, type checkers narrow the type of the variable to exactly the TypeGuard type.

This was a succinct way to describe TypeGuard (for me, it was rather illuminating, and clarified the entire intention of PEP 742 and any existing confusions about how TypeGuard behaved), and was the inspiration behind the type(x) is int guard function example. So in fact, there is a way to annotate exact types, as appropriate for a check which involves type identity - by using TypeGuard.

In my experience, it’s more common for to hear x is an int rather than the type of x is int. The difference is that the former is generic OOP phrasing (cats are animals, or a cat is an animal, vs. the type of cat is animal), while the latter corresponds to a program in some language which can be run, and can the program can churn out results which do not reflect intention (as with the class Int(int) example). Different programming languages may even have different meanings for is/Is and ==[=] compared with Python.

2 Likes

I deleted my previous post because I didn’t want to make this thread too noisy with points that have already been discussed in some way. That said, I’m not sure one of the points I raised has been adequately addressed to date.

The rationalization for this name is what it does in the positive case; however the key distinction between TypeIs and TypeGuard is what happens in the negative case, and it’s not clear to me that the name TypeIs sufficiently covers that. TypeGuard[int] likewise says that a particular value is an int or a sub-type, but it does nothing in the negative case.

Given that the current PEP was a deliberate, explicit departure from the original PEP which sought to modify the behavior of TypeGuard (which remains my personal preference, despite the community moving in another direction) the concern stands that the name does not clearly convey what the difference is versus TypeGuard.

1 Like

This is a mis-reading of the quoted text from PEP 742. The word “exactly” does not have the meaning in that sentence that you are ascribing to it. This sentence, which immediately follows PEP 742’s first use of “exactly” in describing TypeGuard, clarifies the intended meaning:

They cannot use pre-existing knowledge about the type of the variable.

When narrowing after a successful TypeGuard[int] check, the type checker cannot use its pre-existing knowledge that the variable was known to be e.g. of type MyInt (a subclass of int); the type-checker must give it type int from that point, even if that is actually a widening of the known type. But int here still means “int or subtype,” just as it does everywhere else in the type system, it doesn’t mean “must be exactly int and not a subtype.” There’s no suggestion anywhere in PEP 647 of the latter reading. If it did mean “exactly int only”, TypeGuard would only be correct to use on functions that perform a type identity check (which are rightly very rare), and every example given in PEP 647 itself would be a wrong usage, as none of those examples do a type identity check; all are fine with subtypes.

IMO this should not be a goal for the naming of the new special form. The differences with TypeGuard are several and subtle, far too much so to capture in a good name. And the name TypeGuard doesn’t help us out by being in any way specific to its behavior, relative to the new form.

If we attempt to capture the difference in the name, it will lead to an overly-long, overly-specific name, leading people to assume that TypeGuard is probably the thing they want, and the very-specifically-named thing is probably for an edge case. Rather, the goal should be to pick a name that looks more general than TypeGuard, give it precedence over TypeGuard in the docs, and leave the docs to describe the specific behavior differences.

If we had a time machine, we would avoid the creation of TypeGuard, or give it the more-specific niche name instead. But that ship has sailed.

1 Like

The Typing Council has voted in favor of PEP 742 in its current form. All TC members support the TypeIs name. It has been a bit of a struggle to get here but we are now ready to recommend PEP 742 to the Steering Council, in preference of PEP 724.

(As you may remember, PEP 724 proposed to change the existing TypeGuard special form (introduced by PEP 647) to implement more desirable semantics for type guards, creating some edge cases where the semantics would change for existing code (causing type checkers to occasionally produce new errors). There was all round agreement that the new semantics are in most cases better, but a majority of the TC found the backward incompatibility in edge cases unacceptable. Therefore Jelle drafted PEP 742, which proposes a new name for the new semantics (without proposing to deprecate TypeGuard, as its different semantics are occasionally useful). There was considerable bikeshedding on the name of the new special form, but we ultimately landed and agreed on TypeIs. Details of the breaking edge cases are in PEP 724’s Backwards Compatibility section.)

20 Likes

The SC has accepted this PEP!

17 Likes