Problems with TypeIs

Liz · June 10, 2024, 5:46pm

Some recent discussions about type negation have brought up problems. I don’t know if rehashing all of them would be appropriate or not here, but TypeIs allows negations that are unsafe, while it was added to be a type safe version of typeguard which negation was supposed to be safe on.

@mikeshardmind brought up what was theoretically safe in the typing thread about adding Not on github

Here’s an example I came up with when discussing a different issue that I later noticed would have been prevented if this had been explored prior to accepting TypeIs, as the issue with structural type negations were noted
Code sample in pyright playground

import asyncio
from typing import Sequence
from typing_extensions import TypeIs

class A:  pass
class B(A):
    def __init__(self, use_count: int = 0):
        self.use_count = use_count

def is_seq_b(x: Sequence) -> TypeIs[Sequence[B]]:
    return all(isinstance(i, B) for i in x)

async def takes_seq(x: Sequence[A]):
    if is_seq_b(x):
        await asyncio.sleep(10)
        users = sum(i.use_count for i in x)

async def takes_list(x: list[A]):
    t = asyncio.create_task(takes_seq(x))
    await asyncio.sleep(1)
    x.append(A())
    await t  # so that the error shows up if someone copy pastes this.


if __name__ == "__main__":
    lst: list[A] = [B(i) for i in range(10)]
    asyncio.run(takes_list(lst))

example modified to show that takes_list is callable without error, thanks @Sachaa-Thanasius

This has already been accepted for 3.13 and the feature freeze has happened already, what’s the process to have this fixed?

mikeshardmind · June 10, 2024, 6:07pm

The negation problem isn’t clearly shown by this example, I did recently edit the referenced post to have a negation example with structural types. Structural types don’t play nicely with TypeIs at all here though, this also has a bad interaction with use of a structural type to erase variance (that this example does show), something which is common with list vs Sequence with how people write code.

stroxler · June 10, 2024, 11:54pm

I think this example is interesting, but:

(a) as far as I can tell the problem is identical if we replace TypeIs with TypeGuard so I don’t think it’s clear that the specific example is connected to Not[...].

(b) it seems to me this is a Type{Guard,Is}-specific special case of a more general problem with refinement of mutable data that isn’t specific to type guard operators

For (b), the Pyre team has often debated whether to accept this code:

class X:
    y: int | None
    def __init__(self):
        self.x = 42


def f(x: X):
    if isinstance(x.y, int):
        g()  # call some other function
        y = x.y + 1

because we cannot prove at the type level that there is no way to reference x and mutate x.y from g, for example

global_x: X = X()

def g():
    x.y = None

f(global_x)

Issues become even more likely with async, but at the root of the problem is that the type system offers no mechanism to prove that just because my current reference to some mutable data is read-only, there exists no other reference that is read-only.

As a result, Pyre currently will erase the refinement when it sees the call to g(), or if it sees any await expressions whatsoever.

With free-threaded (no GIL) Python coming, the problem becomes even more significant - it’s actually not sound to refine potentially mutable data for even a single operation in the presence of free-executing threads, because your refinement could be invalidated instantly by some other thread with a writeable reference to that data.

mikeshardmind · June 11, 2024, 12:04am

While this is the root problem, the type system also is supposed to calculate variance, and variance is discarded in this case. If variance wasn’t discarded, this example wouldn’t work.

This example fails to type check with MutableSequence, and that failure point is at TypeIs (Specifically, it doesn’t narrow). It’s clear to me that there isn’t a way to achieve the incorrect narrowing without TypeIs allowing more than it should here. You can view this here: Pyright Playground, but the error is a little useless in the current state. You can observe it’s the same inferred type as other failures with disallowed use of TypeIs.

This isn’t actually new with free threading. Any pure python statement boundary is a candidate for a threading context switch currently (there are more besides this)

mikeshardmind · June 11, 2024, 12:08am

The example is also more dangerous than standard mutable vs non mutable containers, as this type checks in a way that allows an incompatible type to have been inserted in a way that passes type checks, not just mutation while iterating or something.

Liz · June 11, 2024, 12:29am

Not and protocols doesn’t work because assignment to structural types is lossy. TypeIs and TypeGuard should not allow protocols for the same reason. The lost infomormation is important to variance, variance is important to subtyping calculation. MutableSequence was pointed out to work (As in type checking work, and error here), but that’s a coincidence of having the same variance. We also can’t special case Sequence to not allow this, user defined structural types could have similar issues.

stroxler · June 11, 2024, 12:38am

Okay yeah this jives with what I was thinking.

I would guess that Pyright is actually not doing anything very special about protocols or variance here, it probably is just observing that MutableSequence[A] and MutableSequence[B] are disjoint, and deciding that it’s better to ignore the refinement than try to use it.

MyPy will (somewhat defensibly, although Pyright’s behavior seems better) happily tell you that in the MutableSequence version, the type of x is Never because any sound implementation of the the TypeIs actually has to return False uniformly and the code is unreachable:

My guess is that Pyright is just doing the same calculation and then falling back to more useful behavior.

Assuming I’m right about why the MutableSequence example behaves as it does in Pyright, it’s not going to be easy to make the Sequence example behave the same way.

It might be hard to get agreement about where exactly the problem is though:

My initial reaction is that the problem is really allowing refinements that effectively apply to nested potentially-mutable data; if I have a variable of type Sequence[A], it could still have a mutable handle, and any refinements on contained data are suspect (refining x[0] for a Sequence x is, in my mind, not very different from refining x.y in my y: int | None example)
Another potential argument we might run into is that TypeIs and TypeGuard are power features whose soundness isn’t guaranteed and that it’s the responsibility of authors to not implement an unsound refinement. I think this may actually be more or less what the PEPs currently suggest.

mikeshardmind · June 11, 2024, 12:53am

TypeIs was marketed as a typesafe replacement for TypeGuard where if the check was exhaustive, it was safe. Including in the pep itself. TypeGuard was already in a banned API lint for that reason, I’d hate to need to add the supposedly type-safe alternative as well.

I explored this back in 2023 for Not. and came up with a list of things where it is/isn’t safe. The original post actually includes a direct link to it, but structural types can remain safe for this kind of narrowing if and only if type checkers have to ensure total consistent use throughout the entire code graph. pyre and pytype both have models that could support that, but mypy and pyright do not. In the absence of that being a requirement, I think structural types need to be banned here.

Jelle · June 11, 2024, 4:03am

Good find! It is possible to construct similar examples with many other type-narrowing behaviors:

import asyncio

class A:
    def __init__(self, use_count: int = 0) -> None:
        self.use_count = use_count

async def takes_seq(x: list[A | None]) -> None:
    if x[0] is not None:
        await asyncio.sleep(10)
        print(x[0].use_count)  # boom

async def takes_list(x: list[A | None]) -> None:
    t = asyncio.create_task(takes_seq(x))
    await asyncio.sleep(1)
    x[0] = None
    await t  # so that the error shows up if someone copy pastes this.


if __name__ == "__main__":
    lst: list[A | None] = [A(1)]
    asyncio.run(takes_list(lst))

If there is a way to fix this without throwing away a lot of practically useful type narrowing behavior, I would encourage type checkers to implement it. If not, though, I think the best way forward is to document this soundness hole and let users judge whether it is of practical significance to them.

We’re still in the beta phase and we could revise TypeIs in some way, including removing the feature entirely. I would oppose the last option; I don’t think this soundness hole seriously detracts from the usefulness of the feature. I doubt anything else we do would change the runtime implementation of TypeIs. If we want to make some tweaks to how type checkers should implement TypeIs, that can be done through the standard procedure for updating the spec.

From a type checker’s perspective, there are no structural types involved here; Sequence is not a protocol, and to type checkers it is a nominal superclass of list.

Liz · June 11, 2024, 5:23am

Can you help me find where that would be? Runtime actually doesn’t register list to sequence: cpython/Lib/_collections_abc.py at 6efe3460693c4f39de198a64cebeeee8b1d4e8b6 · python/cpython · GitHub and the typeshed just imports this directly. The data model also states that sequences are distinguished by mutability, and directly avoids calling mutable sequences just sequences 3. Data model — Python 3.12.4 documentation

This appears to be a special case in type checkers that may be incorrect then. Even so, this does effect structural typing even if it also effects a special case. I’ve modified the example above to show that what @mikeshardmind and I both said about special casing Sequence being insufficient here is accurate. you’re welcome to come up with other protocols of your own.

Code sample in pyright playground

import asyncio
from typing import Protocol, TypeVar, Iterator
from typing_extensions import TypeIs

T = TypeVar("T", covariant=True)

class Sequence(Protocol[T]):
    def __iter__(self) -> Iterator[T]:
        ...

class A:  pass
class B(A):
    def __init__(self, use_count: int = 0):
        self.use_count = use_count

def is_seq_b(x: Sequence) -> TypeIs[Sequence[B]]:
    return all(isinstance(i, B) for i in x)

async def takes_seq(x: Sequence[A]):
    if is_seq_b(x):
        await asyncio.sleep(10)
        users = sum(i.use_count for i in x)

async def takes_list(x: list[A]):
    t = asyncio.create_task(takes_seq(x))
    await asyncio.sleep(1)
    x.append(A())
    await t  # so that the error shows up if someone copy pastes this.


if __name__ == "__main__":
    lst: list[A] = [B(i) for i in range(10)]
    asyncio.run(takes_list(lst))

Additionally, pyright actually catches your example with list, that’s part of how this issue was found: `TypeIs` narrowing for certain parameterized containers doesn't make sense · Issue #8109 · microsoft/pyright · GitHub

This needs something to change. It’s unsafe to use as-is and as is documented. I don’t know what the most appropriate change would be, but this bled into intersections a bit, I’d defer to @mikeshardmind on the actual impact to that.

Without considering the impact on intersections since we can paint out of any box we can paint into, I’d ask if there’s a good reason to allow structural types to be narrowed with TypeIs. I can’t think of one, so maybe we should just review for classes of types that aren’t safe to narrow in this manner and ban them.

Jelle · June 11, 2024, 5:28am

In typeshed. In builtins.pyi, list inherits from MutableSequence, which is ultimately defined in typing.pyi and inherits from Sequence.

Does it? I see no errors: Pyright Playground

Liz · June 11, 2024, 5:39am

Sorry, you’re right that it doesn’t, the caught case with list was closer to the original. I do think these are meaningfully different cases. I think your case is solvable by saying that to narrow something by index isn’t safe and you have to have a reference to the exact object you are narrowing. This is different.

Okay, so for all intents and purposes, it’s a problematic but convenient lie in the typeshed causing this problem. I find myself exasperated by how many issues I’ve run into, either myself or through others, that can be described as such.

mikeshardmind · June 11, 2024, 5:53am

I haven’t deeply explored the impact of it, but I had already ruled out protocols and gradual types from being negatable safely, with a very big asterisk on protocols, and an out for gradual types that isn’t safe, but would be consistent with the gradual guarantee.

The selling point for TypeIs over TypeGuard was supposed to be that it was safe to narrow with, both in the positive and the negative. I don’t think it’s living up to it’s purpose on that, and I’d rather remove it than document a soundness hole where there wasn’t supposed to be one.

The fact that this isn’t limited to Sequence, which is an ABC in the standard library and could be special-cased, but also impacts user-defined structural types, including protocols and TypedDicts, leads me to believe there are three options here.

Ban Structural types from being used here
Remove the feature
Require type checkers consider total consistent use throughout the entire visible code graph, not just at and upto function boundaries.

options 2 and 3 require no other changes, option 1 requires that Sequence is also special cased here to not be allowed to narrow.

This PEP proposes a new special form, TypeIs , to allow annotating functions that can be used to narrow the type of a value, similar to the builtin isinstance(). Unlike the existing typing.TypeGuard special form, TypeIs can narrow the type in both the if and else branches of a conditional.

Writing a safe TypeIs function

A TypeIs function allows you to override your type checker’s type narrowing behavior. This is a powerful tool, but it can be dangerous because an incorrectly written TypeIs function can lead to unsound type checking, and type checkers cannot detect such errors.

For a function returning TypeIs[T] to be safe, it must return True if and only if the argument is compatible with type T, and False otherwise. If this condition is not met, the type checker may infer incorrect types.

The examples above break the documented case for what is safe, and this was the selling point of TypeIs.

mdrissi · June 11, 2024, 7:28am

I don’t really follow emphasis on TypeIs in this discussion given there are examples already here that do not involve TypeIs and involve simpler isinstance/other narrowing rules. In general type narrowing is unsafe in the presence of mutations in separate context. That’s true both for TypeIs, isinstance, and many native type narrowing rules here. TypeIs I view as intended to offer safety comparable to other native type narrowing rules while TypeGuard allows additional unsafety.

I also don’t see a good way to allow useful type narrowing if async/free threaded code that may mutate same variable elsewhere causes narrowing to be considered unsafe.

mikeshardmind · June 11, 2024, 9:06am

The differences have already been covered here already, but to put why a focus on TypeIs matters here succinctly:

TypeIs was added to be the type-safe alternative to TypeGuard
TypeIs claims to be type safe in this case
TypeIs allows an unsafe narrowing in this case
This same exact unsafe narrowing can’t be replicated without TypeIs

Either the documentation of what is safe needs updating or the unsafety needs to be fixed or caught. If we simply document this, that removes the motivation for this having even been accepted.

If you’d like additional reasons, this allows a different kind of unsafety than many others that can’t be fixed with the below…

The other cases that don’t come from TypeIs with concurrent access are solvable as a separate issue, which was also pointed out separately, see:

In otherwords, you can’t safely narrow a value in a list by index, or dict by key, you can assign a local variable from the value of that object, and narrow that.

Liz · June 11, 2024, 12:09pm

narrowing from the result of isinstance(x, T) is never incorrect. Narrowing from TypeIs that is only composed of all, iterating, and isinstance is incorrect currently. There’s a level of difference in unsafety here

There’s also the linked post that nobody has addressed yet. Negation is unsound on several categories of types. TypeIs does not restrict these and claims to be sound to negate. I can’t copy the full original and retain all the formatting, references, and an image showing a specific formulation from an academic paper on the matter. TypeIs is more unsound than just the easily demonstrated problems.

stroxler · June 11, 2024, 4:52pm

Narrowing from TypeIs that is only composed of all, iterating, and isinstance is incorrect currently.

I agree that the implementation looks reasonable, and so it would be nice if a type checker warned you that your guard is almost certainly going to be unsound.

I still think it’s not clear this is showing that the type system is broken, since the implementation of the TypeIs, while reasonable, is incorrect (for example with identical reasoning, I could try to narrow an empty sequence to Sequence[Never] which is more obviously incorrect - the underlying issue is the same).

And it’s not clearly showing that TypeIs is worse than TypeGuard. In this example the bug is identical using TypeGuard, so negation isn’t playing much of a role here, even the positive narrowing is not correct.

There’s also the linked post that nobody has addressed yet.

I’m am interested in the idea that Not is unsound on a variety of types, and how we think the TypeIs operator should behave in these cases.

But I’m not succeeding in figuring out exactly what the claim and evidence is.

I can see a linked paper that says Not is invalid on gradual types, which makes sense to me (if we were to adopt the notion that Any is basically an unbound type variable representing any one static type, it’s pretty clear you can’t take the complement of that).

I don’t see an argument explaining why it cannot be applied to static types. I do see a claim that it doesn’t work with an example

type MutableSequence[T] = Sequence[T] & SequenceMutationMethods
type NeverMutableSequence[T] = Sequence[T] & ~SequenceMutationMethods
x: list = []
y: Sequence = x
z: NeverMutableSequence = y
x.append(1)

but I’m not sure what this example is demonstrating. I am unclear why a type checker should accept that y is a valid instance of NeverMutableSequence, that seems like a type error to me because I need “evidence” in the type checker that y is immutable, and I don’t have that (which is a good thing because it is very much mutable!)

As a result, I don’t see why this example is demonstrating a problem with Not and structural types.

I’m not trying to argue here that Not is okay for structural types, just communicating that I am not finding an argument I can follow about the issue.

I do think that the TypeIs example shows that Not might actually be tricky in the presence of generic types, although the problem seems more related to type narrowing than to Not itself - in fact it must be an issue with narrowing rather than Not, because the problem is exactly the same using TypeGuard which has no negative case.

So I still don’t feel like I have a clear grasp of the implications here.

I’m still trying to follow also the claim that TypeIs is supposed to be more type-safe than TypeGuard. It is a narrowing tool with similar behavior, and it is actually more powerful which means it is strictly harder to implement correctly (any valid TypeIs is a valid TypeGuard but not vice-versa), I don’t think any existing docs contradict this.

From the PEP, I think this line is technically correct but a little problematic:

For a function returning TypeIs[T] to be safe, it must return True if and only if the argument is compatible with type T , and False otherwise. If this condition is not met, the type checker may infer incorrect types.

It is technically correct because the argument is a python value with live references, and meeting this requirement means your type guard needs to be able to guarantee that the type narrowing is valid across all references. If that’s provably impossible for the type you are narrowing, then your type guard is provably unsound (and this is true for both TypeGuard and TypeIs).

I do think it’s a little misleading because people tend to substitute that “the argument value could be consistent with the type” which is a different claim and is only usable when all data is immutable (so not very often in Python).

In the case of a covariant type where we are narrowing the type parameter, it is probably provably unsound in every scenario if soundness includes being sound across all possible subclasses, since I can always subclass with an invariant (as we do for Sequence → MutableSequence).

@mikeshardmind

TypeIs claims to be type safe in this case

I’m pretty sure this function does not meet the stated requirements to be sound for TypeIs (or TypeGuard for that matter). So I’m not seeing this.

I definitely agree it looks reasonable, but type narrowing is tricky and in fact this is just not valid.

This same exact unsafe narrowing can’t be replicated without TypeIs

It’s true that the negative case gives the type checker even more room to do something weird here, but since the positive case is already unsound I still don’t understand why we think this is a TypeIs-specific problem.

Sorry for the long post with lots of questions.

I know folks in the intersection examples group have thought a lot about the type system and you probably have insights I don’t yet

I’m trying to learn, and I’m having a hard time following.

It seems like there are very precise claims being made based on examples where it isn’t clear that the root cause of the issues we see aren’t something related but different (e.g. this example is broken without negation, and without any structural types, so it’s hard to see how it could be demonstrating that negation doesn’t work on structural types).

I’m not trying to say that the claims themselves are wrong, just that I’m having trouble seeing the argument and evidence for the claims. The problem may be partly that I’m not very smart or knowledgable about type theory, but I think my background is actually pretty typical of Python typing folks.

mikeshardmind · June 11, 2024, 5:37pm

Here’s the actual function in a vacuum:

def is_seq_b(x: Sequence) -> TypeIs[Sequence[B]]:
    return all(isinstance(i, B) for i in x)

it only returns True when x is consistent with the type Sequence[B] and always does so in that case, it always returns False if the type is not consistent with Sequence[B] it also only takes a sequence, therefore the variance involved is covariant, not invariant. This function is as specified, a safe typeguard. This holds true when replaced with structural protocols as well.

The lack of the type system keeping track of variance across function boundaries and throughout the entire function flow makes it impossible for this to be safe, as there is nowhere to detect that something that was once invariant had it’s variance erased. That’s also the problem of structural types. information can be erased across function boundaries.

I apologize, but I haven’t worked through a more complex example of the negation issue in practical terms. When Liz brought up the simpler issue with protocols on the positive side, I thought arguing the more concrete case would be easier than working up a hypothetical case involving type variables, intersections, and type negation. The paper linked is concerned with set-theoretic types in a compiled language (where the whole code graph is visible to a compiler) The pep for TypeIs also specifically says that this should approximate the theoretic types that aren’t yet denotable, so I think referencing the academic literature is appropriate here.

Some of the paper’s conclusions about what is and isn’t safe are very hard to reason about in an accessible manner when the paper is essentially concerned with a future we’re working towards, not the system we have now, and some of it’s assumptions have to be re-reasoned about, such as compiled with full view of the entire application’s code graph vs the fact that some existing python type checkers don’t have a means to do that.

It doesn’t, but in all the discussion with people trying to find ways that this somehow wasn’t an issue with TypeIs, the discussion veered away from that. The linked post touches on it, but the example here is a separate unsafe issue with TypeIs, I said this in my first response , though I don’t blame anyone for losing track of that, there’s a lot to digest here.

I don’t either, and that’s actually what bothers me most about this I can’t predict how much of a problem this will actually be, yet the problems I am aware of seem very easy for novice typing users to fall into accidentally. They are the type of things that Jelle even went so far as to describe as “useful”. All sorts of answers online tell people to just lie about variance and use sequence to bypass problems with it. This worked safely enough before TypeIs because you could never further narrow the generic type of the Sequence and have a violation.

TypeIs was meant to be a typesafe narrowing tool, but it’s allowing things in negation we know aren’t safe from theory (gradual types) and it’s allowing things in even the positive case that are described to be type safe, that can be shown that they aren’t. While it does say “misuse” of TypeIs can lead to wrong types, I would argue that the function above is within the description of a valid TypeIs function.

I’m concerned that people just want to keep it because they like what they see as a replacement for the much-maligned TypeGuard, but that it’s going to have just as many flaws, while having far further reaching impacts because the pep also defines that negation is now part of the type system and doesn’t restrict that negation to what we know to be theoretically sound.

mikeshardmind · June 11, 2024, 5:48pm

History starts around here, while there were other scattered prior discussions, this is where it started to centralize, and the reason it started out with the name “Stricter TypeGuard” was very much in line with actually requiring it to be type safe narrowing: PEP 724: Stricter Type Guards

stroxler · June 11, 2024, 6:18pm

Thanks for the clarifications, it seems I’m less out of step with what you’re saying than I thought.

I think this is maybe a the heart of the more immediate question about whether this is a bug in the existing type system or just a case where it isn’t obvious that the TypeIs implementation is unsound.

What exactly does it mean for x to be consistent with Sequence[B]?

It could mean that the value of x “right now” is consistent with Sequence[B]
But I think in the type system it actually means the reference x, which represents a value that can change over time, is consistent with Sequence[B], which it is in fact not.
In some sense the heart of the issue is that __iter__ is not deterministic and so you can’t soundly narrow based on it’s output. But that’s actually true of any method in Python, the type system can never ensure determinism which makes any type narrowing using methods iffy.

I’m very much in agreement with the idea that ideally a type checker ideally should be able to reason about at least some cases where we know a type narrowing is unsound and produce an error.

Unsafe narrowing can happen with both TypeGuard and TypeIs, although I agree there are presumably more cases where TypeIs is unsound, since negative narrowing is making strictly more assumptions.

To implement such warnings, it would be pretty helpful to know about cases where type narrowing is never sound, and a possibly bigger set of cases where negative narrowing in particular isn’t. I’m pretty sure we can make progress on this over time, and I also don’t think it breaks anything in the existing spec to add such warnings.

I’m still fuzzy on exactly how a type checker would ideally figure out that this specific case is unsound:

I do think the underlying reasoning would be related to variance and generics
I don’t think it is specifically related to Protocols, since there is no protocol here
I do think it might hint at a generic type behaving, in some sense, like a structural type
- this may be where the disconnect we’ve had is: Sequence[A] is not a protocol, but is it in some sense behaving like a “structural type” due to generic behavior and type erasure?

The type checker could then choose how to behave downstream:

it could refuse to apply the type guard (as Pyright seems to already when it ignores narrowing if there’s a variance mismatch)
or it could silently accept it on the grounds that we allow users to explicitly request unsound behavior when useful (as MyPy seems to when it narrows to Never given a variance mismatch)

I don’t think any of this contradicts the existing spec. In these cases TypeIs / TypeGuard is already unsound, we just don’t have the tooling (and we maybe don’t yet fully have even the theory to build such tooling) warning people about it. This isn’t ideal because it’s pretty easy implement a guard that looks reasonable but is unsound.