typing.Cast as an annotation alternative to typing.cast

UltimateLobster · November 10, 2022, 7:50pm

typing.cast is very helpful and I find myself using it quite a lot when trying to be type-safe (especially when converting old code).
However, it always bothered me that the only way to cast types is during runtime by calling an empty function which hurts performance for the sake of type-checkers that do everything else statically using annotations.

My suggestion is to add a way to cast types using annotations.
Instead of:

from typing import cast

a = cast(str, b)

We could do:

from typing import Cast
a: Cast[str] = b

While the name Cast could be improved, I think the concept of type casting using annotations would be quite intuitive, since every other typing feature is expressed through annotations (or some sort of decorator that has a runtime cost only for the initial setup).
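For reference, here is a minimal sketch of what the existing cast() does at runtime: it hands back its second argument unchanged, so the call exists purely for the type checker. The variable names are only illustrative.

```python
from typing import cast

b: object = "hello"

# At runtime cast() performs no conversion and no check;
# it returns the very same object it was given.
a = cast(str, b)
same_object = a is b

# No validation happens either: "casting" to int still yields the original str.
not_really_int = cast(int, b)
```

So the only runtime effect of the call is the function-call overhead itself.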

Jelle · November 10, 2022, 8:07pm

Your proposed alternative wouldn’t replace all use cases for cast, because it’s frequently used deep inside an expression, not in an assignment. So we’d have to maintain two ways to do the same thing.

It’s going to be rare that a cast() call is really a performance bottleneck: calling a simple Python function is pretty fast in 3.11 and if we believe Mark Shannon it will be even faster in the future. However, if you’re really worried about the performance cost, you can use # type: ignore instead of a cast.
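A rough sketch of both points, assuming nothing beyond the standard library: the cast below sits inside a generator expression where there is no assignment target to annotate, and the second function shows the # type: ignore alternative. The function names are made up for illustration, and the timings are machine-dependent, so no particular result is claimed.

```python
import timeit
from typing import cast


def total_length(items: list[object]) -> int:
    # The cast is buried inside an expression; an annotation on an
    # assignment could not express this without splitting the line.
    return sum(len(cast(str, item)) for item in items)


def total_length_ignore(items: list[object]) -> int:
    # Same runtime behaviour without the extra call; the checker's
    # complaint is suppressed instead of being resolved.
    return sum(len(item) for item in items)  # type: ignore


data: list[object] = ["a", "bb", "ccc"]

# Measure both variants; compare the numbers on your own machine.
with_cast = timeit.timeit(lambda: total_length(data), number=10_000)
without_cast = timeit.timeit(lambda: total_length_ignore(data), number=10_000)
```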

UltimateLobster · November 11, 2022, 10:38am

As I was writing that post I came to a similar conclusion. Having two ways to cast types would go against the Zen of Python. I also have some further thoughts; they are not well organized in my head yet, but I feel I need to lay them out, in no particular order.

Just to be clear: while this means supporting two ways of casting types, no code would be duplicated. The cast function would be implemented using the Cast type, so type checkers would only need to support one mechanism.
While the runtime cost is minimal, the fact that it is paid for static type checking (which could theoretically be zero-cost) is heartbreaking. It’s not critical by any means, but it really feels like there should be an easy solution somewhere.
Perhaps generalizing the problem would be worthwhile: the development cost would be higher, but the result would also be useful in other situations.

a. One way could be to find a way to use annotations in expressions. Something like:
```
from typing import Cast

range(foo as Cast[int])
```
However, this would require a syntax change that programmers would need to learn.
Even though other languages (such as TypeScript) use a similar syntax, it is a big change to the
language that all developers would need to know about, for the sake of a feature that has
comparatively little value.

b. Your quote of Mark Shannon may suggest a different approach. Maybe we can take all of these
“empty functions” that exist merely for the sake of “marking” an object, and have an optimization
that applies to all of them. Admittedly I’m not very familiar with the underlying C-API, but
perhaps a special object that’s optimized for these specific cases? Maybe specializing the new
adaptive interpreter to handle such cases by skipping the CALL instruction (or some kind of
special instruction specific to these kinds of objects)?

I have no idea how hard or easy it would be to implement such changes, but if it all happens behind the
scenes and programmers never need to know about it, it would be easier to maintain and would not
raise backwards-compatibility concerns.

Even if none of these “solutions” are applicable here I still would love to hear your thoughts about them to learn about the approach taken when writing the language.

vbrozik · November 11, 2022, 9:35pm

Cannot the optimization go further so that cast() will not cause any function call at runtime? Like if cast(TypeX, expression) was replaced by (expression).

Note: Should side effects of evaluating TypeX be guaranteed? I hope they should not.
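To make the question concrete, here is a small sketch of today's behaviour: cast() is an ordinary call, so its type argument is evaluated like any other argument and any side effects do happen. The helper make_type is hypothetical and exists only to make the evaluation visible.

```python
from typing import cast

evaluated: list[str] = []


def make_type() -> type:
    # Hypothetical helper with a visible side effect.
    evaluated.append("make_type called")
    return int


# The first argument is evaluated before cast() runs, as with any call.
value = cast(make_type(), "42")
```

Rewriting cast(TypeX, expression) to (expression) would silently drop that evaluation, which is exactly the guarantee being asked about.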

UltimateLobster · November 12, 2022, 5:44am

From my (very-very-limited) understanding it would be a bit more complicated than that. cast can be any arbitrary variable since only at run time is it evaluated to be the function “typing.cast”. If you wanted the optimizations to be on the interpreter side it can be really complicated to “know” that “typing.cast” is the actual function being called. Also, if the optimization is specific for this one function it will be harder to maintain and will most likely spaghettify the code.
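A small sketch of why this is hard, as far as I understand it: the compiled code only records the name cast, and what that name refers to is decided when the code runs. The two namespaces below are contrived for illustration.

```python
import typing

# Compile once; the code object refers to "cast" only by name.
code = compile("result = cast(str, 1)", "<example>", "exec")
names_used = code.co_names

# Run the same code object with the real typing.cast ...
ns_real = {"cast": typing.cast}
exec(code, ns_real)

# ... and with an unrelated function that happens to be bound to "cast".
ns_other = {"cast": lambda typ, val: typ(val)}
exec(code, ns_other)
```

The same bytecode leaves the value untouched in the first namespace and converts it in the second, so a compiler cannot simply assume that a name spelled cast is typing.cast.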

This is why in my reply, I tried to have some solution that’s more generic and doesn’t revolve only around “cast” itself.

However, as someone who doesn’t understand Python’s low-level code, I might be speaking out of my rear right now.

As for your side note, it would be really weird for someone to rely on side effects in things like annotations. It would also be weird to rely on side effects in something that exists basically exclusively for type checkers. Should they be guaranteed because of that? I’ll join you in that “I hope not” sentiment.

tungol · January 7, 2024, 3:45am

A couple possible advantages that I can see:

This idea also came up in https://github.com/python/typing/issues/496, where the idea was to minimize how often typing needed to be imported. Rather than typing.Cast, that proposal was to make cast() supported by default in annotation contexts, without adding it as a built-in function. That could be nice, but I don’t know what the expected behavior would be when cast is defined in the surrounding scope, or what the impact on runtime users might be. This also might be confusing for newer Python developers.

Performance: right now the performance cost of cast itself is low, but to get there best practice is to always cast to a quoted type: cast("list[int]", x) avoids needing to evaluate the target at runtime, and is faster even for basic builtins like cast("int", x) (see https://github.com/snok/flake8-type-checking/issues/119#issuecomment-1205405504). That’s kind of unfortunate: It’s not intuitive and a bit ugly. Moving cast to the annotation combined with PEP 649 would remove the need to do that.
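A minimal sketch of the quoting trick described above: when the target is a string, nothing in it is evaluated at runtime, so the name inside does not even have to exist in the module. The name SomeTypeOnlyImportedForChecking is made up; a type checker would expect it to be a real name (typically imported under TYPE_CHECKING).

```python
from typing import cast

x: object = [1, 2, 3]

# Unquoted: list[int] is built at runtime every time this line runs.
unquoted = cast(list[int], x)

# Quoted: the string is passed through untouched and never evaluated.
quoted = cast("list[int]", x)

# Runtime-only illustration: the name in the string is not defined here,
# yet the call still succeeds because the string is never looked at.
undefined_name = cast("SomeTypeOnlyImportedForChecking", x)
```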

Another possibly interesting thing is to compare with the safer upcast() and downcast() functions proposed in https://github.com/python/typing/issues/565. If we’re considering one change to cast() it probably makes sense to consider how that would relate to other possible changes to cast(). Any such functions would also benefit, performance-wise, from quoting the type argument, and that probably wouldn’t be possible for the proposed version of downcast() which includes a runtime assert statement; moving it to the annotation after PEP 649 is in place would be the only way to handle that. But again, I don’t know what that would do to runtime annotation users.
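To illustrate why quoting would not work there, here is a hypothetical sketch of a downcast() with a runtime assert. This is my own guess at the shape of such a function, not the signature from the linked issue.

```python
from typing import TypeVar

T = TypeVar("T")


def downcast(typ: type[T], value: object) -> T:
    # Hypothetical sketch, not the API proposed in the linked issue.
    # isinstance() needs a real class object, so typ cannot be a quoted string.
    assert isinstance(value, typ), f"expected {typ!r}, got {type(value)!r}"
    return value


n = downcast(int, 3)

# With assertions enabled, a quoted type fails inside isinstance().
try:
    downcast("int", 3)  # type: ignore
    quoted_rejected = False
except TypeError:
    quoted_rejected = True
```

Because the check needs the actual class at runtime, the type argument has to be evaluated, which is the cost that quoting was meant to avoid.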

Using cast(T) instead of Cast[T] would avoid needing two implementations, but the implementation in that case would probably benefit from acceptance of PEP 661.

mikeshardmind · January 11, 2024, 3:57am

I’m very much in favor of removing as many runtime uses of cast as possible. I’m not sure an annotation works for enough cases though; perhaps a special type comment, such as type: cast TypeHere? This would allow removing a ton of assert isinstance uses in private code I maintain, and some in public code that uses this pattern where possible due to a (current) lack of support for associated types.

Daverball · January 11, 2024, 8:53am

I would love to have this, this would also avoid having to import cast to begin with. I sometimes avoid using cast, just because it’s quicker to type type: ignore.

But it’s also still not a silver bullet, since it forces you to break the expression into multiple lines if you want to cast e.g. a single parameter rather than the result of the whole expression.

In my ideal world Python would add a cast soft keyword; then the interpreter could strip the cast away entirely in the bytecode, including its type expression. But even if it doesn’t strip the cast away, it could at least always defer evaluating the type expression, just like with the type soft keyword, without having to wrap the whole thing in a string.

tungol · January 11, 2024, 10:31am

I don’t think a new form of type comment is a good idea. The ecosystem has been moving away from those in general for a while now, and that seems like a step backwards to me.