Statically typed variables enforcement

I find the motivation section very weak.

(I don’t know what Python means in this context; I’m assuming you mean Python functions and methods.)

If you run a type checker, this isn’t true. Maybe you can just run a type checker?

A small aside here. I’ve noticed that the Reddit brigade looks down on Python typing since it’s “optional”. But… As far as I know, statically typed languages don’t generally do runtime validation of function arguments either. One of the main points of types is that they lower overhead at runtime by doing some of the work at type-checking time. If you pass a Java function a list of a million integers, there’s no code that performs a loop checking whether each element is actually an integer.

Disregarding the fact that bool is a subclass of int and the proposed solutions using isinstance would definitely allow this, again, if you run a type checker, these errors will not be silently ignored. Why not run a type checker?

That’s neither here nor there.

Taking a step back - checking types statically, where possible, is strictly better than checking them at runtime. Situations where it’s not possible (edge validation) are already covered as well as they can be.

If this feature was added, who exactly would use it and why, and why would it be better than running a type checker? (You know, questions that would be answered by the Motivation section.)

6 Likes

I don’t think the problem is that you are not explaining it well. Rather, you are (mistakenly) presuming that others do not already know what you are saying.

Everything you have said here is also true of typing in Python if you use a type checker:

  • You only need to declare the types of things that can’t be deduced by the type checker.
  • In practice annotating function parameter and return types is usually sufficient.
  • Sometimes you need to annotate e.g. an empty list in the middle of a function.
  • An IDE can then tell you the types of all variables.

The difference is that in Python the code still runs if you don’t add any annotations, and consequently there are libraries that you will want to use that don’t have them. That can be awkward, but it is unavoidable: if static typing were enforced, you just wouldn’t be able to use those libraries at all. You already have the option to simply not use those libraries, and presumably you have already concluded that that is not a good option.

6 Likes

Python is not Rust (or any other language). PEP 484 explicitly states:

It should also be emphasized that Python will remain a dynamically typed language, and the authors have no desire to ever make type hints mandatory, even by convention.

While type hints can be useful, they come at a cost. If the interpreter were to be changed to do type checking, everyone would have to pay the cost of having the feature available.

6 Likes

Ned pointed this out earlier, and I think it might be worth a reminder:

Rust doesn’t do runtime type checking. Java doesn’t. C++ doesn’t. Most popular languages don’t.

In those languages, the type checker runs before runtime. And you can do the same thing in Python, if you want to. It’s just more optional in Python.

Adding runtime type checking would not make it more like Rust, it would be moving further away from Rust.

3 Likes

@oscarbenjamin : You seem to keep pivoting, and never admit you’ve been wrong. It may be rude to ignore you, but repeatedly pointing out all the points you’re wrong on seems futile too, especially when you just pivot to new wrong points.

^^^ This, specifically, is wrong. You quote my pointing out that this is incorrect, and continue with “Yes and…”

No, I don’t pressure libraries to provide type annotations. I do use type stub libraries.

Some libraries that I have incredible respect for (polars) try their very best to type hint well, but they can’t, because of the limitations of the current type system.

Reworking the type system to make it good would make it possible for all libraries to be fully typed, and would make it easy for libraries to do so. I have not (at any point) proposed simply making the typing system as it currently stands mandatory. The point of my “proposal”[1] is a reworking of the type system which would make typing easier and possible. If that were implemented, it would benefit me.

No, in current Python, even when I use only libraries that are polished to the very highest standards, the IDE cannot tell me the types of all variables, because it is impossible, for example, to specify that x is a polars Series of dtype Datetime. That cannot be done. Nor is it possible to specify the schema of a dataframe. So if you iterate over df["start_time"], you have to type hint the iteration variable yourself. This is a simple example with a relatively simple solution, but in real code more complicated problems turn up.
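
For illustration, a minimal sketch of that workaround (the column name and dtype here are just assumptions for the example, not the real code):

import datetime

import polars as pl

# Illustrative only: a tiny frame whose "start_time" column holds datetimes.
df = pl.DataFrame({"start_time": [datetime.datetime(2024, 1, 1)]})

# The column's schema is invisible to the static type checker, so the element
# type has to be restated by hand on the loop variable.
ts: datetime.datetime
for ts in df["start_time"]:
    print(ts.year)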

I’ve also got an issue where I am correctly type hinting an input variable of a function as
Callable[[Arg(datetime | None, 'start_execution_time'), Arg(int, 'offset'), Arg(int, 'limit')], Coroutine[Any, Any, list[InternalTradeDict]]] [2]
and I pass in a method which has signature

async def f(
    self,
    start_execution_time: datetime | None = None,
    offset: int = 0,
    page_size: int = 150,
) -> list["InternalTradeDict"]: ...

and the current typing system can’t recognize that it’s correct.
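
For concreteness, here is a rough sketch of what the Protocol-with-__call__ formulation from footnote [2] might look like; the class name and parameter types are illustrative, not the actual code:

from datetime import datetime
from typing import Any, Coroutine, Protocol


class InternalTradeDict(dict):
    """Placeholder standing in for the real type."""


class TradeFetcher(Protocol):
    # Each parameter carries both a name and a type, which is what the
    # Callable[[Arg(...), ...], ...] spelling above is trying to express.
    def __call__(
        self,
        start_execution_time: datetime | None,
        offset: int,
        limit: int,
    ) -> Coroutine[Any, Any, list[InternalTradeDict]]: ...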

The current typing system has problems. These problems are a necessary consequence of libraries being allowed to be untyped.

The cost of requiring libraries to be typed would be huge. The cost of throwing out our current typing system and building a new (more robust) typing system would be huge. I can see the perspective of people who think the cost is more than it’s worth. But you specifically are just repeatedly wrong.


@beauxq : I too am opposed to runtime type checking. Focusing on Rust like this [3] feels like you’re targeting me, which is a little weird since we agree :wink:


  1. which I posit as something we might want to think about if we ever start developing Python 4. ↩︎

  2. Using a Protocol with a __call__ method ↩︎

  3. in a thread in which I’m the only one who has repeatedly used Rust as a positive example ↩︎

Just passing by with a friendly reminder that I don’t see how this could possibly be needed!

Not outside specialized functions/classes that use type-annotation information at runtime to actually convert data.

Please note that no statically compiled language has “runtime enforcement” of typing:
The typing information is always used at compile time.
If you take a program in C, Java or Rust and feed it a pointer to an object other than the one expected at build time, the program just crashes. Hard.
(Say, you change a column type in a database from Integer to String and rerun the same code.)

So, if one is interested in type-annotation protection (at the expense of ease of writing and some dynamism), it is optionally there in Python. And if a project passes the strict typing rules without flaky workarounds “just to make it compile”, Python will be just as safe as any other static, compiled-to-optimized-binary language.

5 Likes

Python’s static type system is different from the type system of Rust/Java/C++, in that it is a gradual type system. This means that (unlike those other languages), not every term necessarily has a type that is statically known to the type checker. This means that the type checker cannot ensure soundness: the runtime types may belie the types “known” to the static type checker, even in fully typed code. Here’s a trivial example:

def fully_typed_takes_int(x: int) -> int:
    return x + 1

def untyped_takes_something(x):
    return fully_typed_takes_int(x)

untyped_takes_something("foo")  # boom

If it helps, you can imagine these functions living in separate modules; separate libraries, even.

The function fully_typed_takes_int is fully statically typed, but the potential existence of untyped code anywhere in the program means that even this is not sufficient to ensure soundness.

The way that this is typically handled in “sound gradual typing” is by having the type checker (which in this scenario must also be the compiler, unlike in Python today) insert runtime casts (“runtime type checking”) at those program points where a dynamically-typed value is assigned to a statically-typed destination. So in the above program, a runtime type check would be inserted at the call fully_typed_takes_int(x), where we are passing an untyped x to a typed parameter of fully_typed_takes_int. This doesn’t prevent the program from blowing up, but crucially it blows up slightly earlier, at the point where the untyped code does something unsafe. Which ensures that the type annotations in the program are sound: fully_typed_takes_int (or other fully-typed code) really can trust that its parameter is always an integer.
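
To make that concrete, here is a hand-written sketch of the kind of check such a compiler would insert at that boundary; Python itself does not do this today, and the check is shown explicitly only for illustration:

def fully_typed_takes_int(x: int) -> int:
    return x + 1

def untyped_takes_something(x):
    # Check inserted at the dynamic-to-static boundary: the untyped x flows
    # into a parameter annotated as int, so it is verified right here.
    if not isinstance(x, int):
        raise TypeError(f"expected int, got {type(x).__name__}")
    return fully_typed_takes_int(x)

untyped_takes_something("foo")  # still blows up, but inside the untyped code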

“Sound gradual typing” is something that historically has mostly only existed in academic projects, but not entirely. The “mypyc” compiler (which compiles typed Python code to C extension modules) implements sound gradual typing; it has to, because it wants to make use of the statically-known types to optimize the code it generates, which requires sound types (because you don’t want to segfault if called with wrong types).

I think that performing runtime type checks everywhere in Python is not very interesting, because it means that fully-typed code pays a massive performance penalty for tons of useless internal runtime type checks where the type checker can already verify correctness with no need for runtime checking.

Proper sound gradual typing, in contrast, requires that the type checker be integrated into the bytecode compiler, so that minimal runtime checks to ensure soundness can be inserted at only the necessary program points. I think that this is a very interesting direction to continue exploring in Python (though I don’t expect anything in this vein to make it into core CPython anytime soon.)

7 Likes

It has already been mentioned that verifying types at runtime is generally impossible. I’d like to give a few examples to illustrate that. I haven’t seen them in this thread; if I missed them, I apologize.

So, the examples:

  • checking at runtime whether a function argument is a list[T] takes O(n) time, which is not always acceptable, but at least it’s doable.
  • checking whether a function argument is an Iterator[T] requires consuming the iterator, which makes the iterator useless and might have side effects. The implementation might wrap the argument in itertools.tee or something equivalent, but that requires storing all the elements in memory at once, defeats the point of having an iterator, and still might have side effects. Worse, such a runtime check is not guaranteed to finish. A workaround would be to wrap the iterator in an object that performs a check every time __next__ is called (a minimal sketch of such a wrapper follows this list), but it would a) interfere with other kinds of runtime type checking, like isinstance or manually inspecting the return value of type(), and b) cause the TypeError to be raised far from the wrong annotation and only in some scenarios, as opposed to being raised consistently when calling a function with wrong arguments.
  • similarly to the previous example, checking whether a function argument is a Callable[[T], U] can only be solved with a wrapper. Running the callable is both dangerous and futile: just because it accepted a T instance once doesn’t mean it accepts T as opposed to a subtype of T, and just because it returned U once doesn’t mean it will always return U.
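
As mentioned in the iterator bullet above, here is a minimal sketch of such a lazily checking wrapper (a hypothetical helper, specialized to int for simplicity):

from collections.abc import Iterator

def checked_int_iterator(it: Iterator[object]) -> Iterator[int]:
    # Elements can only be checked as they are consumed, so a bad element
    # raises far from the offending call, and only if iteration gets that far.
    for item in it:
        if not isinstance(item, int):
            raise TypeError(f"expected int, got {type(item).__name__}")
        yield item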

I don’t think decorator-based enforcement can be implemented in a way that’s more consistent than what we have now.

5 Likes

To add onto this, there are some cases where it isn’t just utterly impractical to check whether a value is of a certain type, but genuinely impossible given the information the runtime has available. For example, what is the type of [False, True]? Intuitively, the answer is list[bool]. But actually, it could be list[int] or even list[object]! And this difference genuinely matters: it determines whether you can call .append(123) on it.
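
A small illustration, assuming a static checker such as mypy or pyright is run over it:

xs: list[bool] = [False, True]
ys: list[int] = [False, True]     # also valid: bool is a subclass of int
zs: list[object] = [False, True]  # also valid

ys.append(123)    # fine for list[int]
# xs.append(123)  # a type checker rejects this, yet it would run without error

# At runtime all three are plain lists; the element type is not recorded.
print(type(xs) is type(ys) is type(zs) is list)  # True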

From a type theory perspective, each generic type like list is actually a type-level function that maps an input type int to another type list[int]. We generally write that output type as though it were basically just list with some extra, unimportant decoration, but from that point of view, list[int] and list[str] have about as much to do with each other, or with list, as int, str, and MyClass do. But of course, this isn’t reflected in the Python runtime at all. The object created by [False, True] has absolutely no recollection of any type other than list. In many “simpler” type systems the generic type list doesn’t even have a constructor; the only way to actually make a list is to start with a concrete type like list[int] and then instantiate it. Of course, this doesn’t have to happen via an explicit type annotation; many languages like (afaik) Rust and C++ are happy to infer the type the vast majority of the time, but the compiler still has to pick one specific parametrized generic type and emit different code depending on that.

So you really can’t just add a bunch of isinstance calls in the right places. Python’s type system lets you represent types that are too complex for that, and its runtime is fundamentally missing information it would need.

6 Likes

many languages like (afaik) Rust and C++ are happy to infer the type the vast majority of the time

C++ is funny in this regard: if a variable is of type std::vector<int>, then assigning {} to it works, assigning {1,2,3} works, assigning std::vector{1,2,3} works, std::vector<int>{1,2,3} and std::vector<int>{} obviously work, too, but std::vector{} is not allowed.

2 Likes

I think this is very interesting.

A question: couldn’t optional type checking in the bytecode help the JIT too?

Probably not much. The JIT (via the specializing interpreter) rapidly acquires runtime type information, which is more directly suited to its needs anyway. It cares about concrete runtime types; the often more abstract static types mostly don’t help at all.