PEP 781: Make ``TYPE_CHECKING`` a built-in constant

That wasn’t my understanding of the current proposal. PEP 649 replaces a function’s annotations with a new function, including cell variables, that may be evaluated later (or not).

How many individual annotations do you think are needed before it takes longer to compile a source file (and uses more memory) than just importing typing once and resolving the types immediately?

My bet is that some libraries will flip the balance on their own, due to the large number of annotations they use. But in any case, once you import hundreds or thousands of source files, you’ll get tens of thousands of additional functions being allocated to avoid having to do a compile-time getattr(typing, NAME) or looking up names in builtins[1].

And still, it only takes one module to do an unprotected import typing and the benefit is gone anyway, and you’re left with only the overhead. I don’t think we can rationalise making TYPE_CHECKING a builtin based on performance - it has to be about convenience and more tightly integrating the concept of type checking into the Python language (which I’m also against, but I’ll argue against that when someone seriously proposes it as the motivation :wink: ).


  1. Which is entirely necessary for forward references, but simply overhead for the majority of cases. ↩︎

1 Like

There’s more overhead with 649 compared to the pre-649 annotations future, but importing typing doesn’t remove that overhead: 649 incurs it whether typing has been imported or not. And the deferral is still necessary for circular references that are only circular in the types, not at runtime (this can easily be the case with any AST-like library, or any library with a relational data representation whose types are defined across multiple files for maintainability reasons).
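
A minimal sketch of such a type-only cycle (node.py and tree.py are invented names): each module needs the other only in annotations, so guarding the import breaks the runtime cycle while keeping the types checkable.

# node.py - needs Tree only in annotations, never at runtime
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from tree import Tree  # importing this eagerly would complete a cycle

class Node:
    def __init__(self, owner: "Tree") -> None:
        self.owner = owner

# tree.py - imports node for real; the runtime dependency is one-way
from node import Node

class Tree:
    def __init__(self) -> None:
        self.root = Node(self)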

2 Likes

“Takes longer to compile” is not relevant – nobody who cares about startup performance is re-compiling from source to bytecode on every startup.

PEP 649 does have some overhead, sure. There are probably still things that can be done to reduce it further. Larry’s original prototype stored only a tuple with code object and a few other things, and only lazily created the function object on demand. AFAIK that optimization isn’t currently implemented in the 3.14 version.

But it’s really backwards to count the PEP 649 overhead against the performance motivation for if TYPE_CHECKING, as if that overhead were there because people want to use if TYPE_CHECKING. The people who want to minimize the runtime overhead of static typing were perfectly happy with stringified annotations and PEP 563. We have the additional overhead of PEP 649 in order to better support runtime use of annotations, and because annotations need to support forward references for simple cyclic cases, even with zero use of if TYPE_CHECKING.
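
To make the deferral concrete, here is a minimal sketch (Python 3.14, where the PEP 649/749 behaviour is the default and annotationlib is in the stdlib):

import annotationlib

def build(x: Widget) -> Widget:  # Widget doesn't exist yet: no NameError,
    return x                     # because annotations aren't evaluated here

class Widget: ...

# The annotation code only runs when someone actually asks for it:
print(annotationlib.get_annotations(build))
# {'x': <class '__main__.Widget'>, 'return': <class '__main__.Widget'>}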

So the PEP 649 overhead, whatever we measure it to be, is a sunk cost for entirely separate reasons, and irrelevant to the question of whether if TYPE_CHECKING is a useful performance tool.

6 Likes

I fundamentally disagree with designing Python as if it’s merely a bundle of independent features, but I’m not sufficiently motivated to argue it right now. Hopefully the steering council is thinking about the overall integration of all of these features.

3 Likes

I’m certainly not advocating for this, nor do I think it’s what is happening here. I think what is actually happening is that we are working very hard to balance a number of different competing use cases and interests, which sometimes inevitably conflict with one another. And the fact that we have to compromise on the performance of type annotations in order to serve one use case does not make it contradictory to also care about the performance use case.

9 Likes

This PEP doesn’t rationalize it based on performance; the motivation is unifying how TYPE_CHECKING is used.
That benefit doesn’t go away even when the typing module is imported.
It would only go away if no one needed an if TYPE_CHECKING: block any more.

3 Likes

It’s already unified, though? typing.TYPE_CHECKING is the single constant that represents this.

If third-party tools do things differently, that’s up to them, and we don’t have to add a second option[1] to accommodate them. But it sounds like they all support typing.TYPE_CHECKING, which is what was intended when it was added.
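
For reference, the intended idiom (expensive_library is a made-up stand-in for any heavy dependency):

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # only type checkers ever execute this import
    from expensive_library import Model

def load(raw: bytes) -> "Model":
    ...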

If developers are avoiding our unified constant because of perceived performance impact, do we need to dispel that notion (or demonstrate its futility)? Again, adding another way to get the constant doesn’t actually fix the problem they think they’re fixing.[2]

I guess I’m just not clear on what the problem is, and the motivation (at least in the original post) calls this a problem(/challenge) to be fixed(/avoided). The two problems I can read from the Motivation section are (1) importing typing takes time, and (2) some type checkers support other ways as well as typing.TYPE_CHECKING. As we seem to have agreed, the first problem won’t be solved, and I don’t believe the second is a problem for us to solve.

An alternative would be to frame it as a better way to integrate static typing into Python code (which I also don’t think it is, but the framing would be valid, and I might be convinced).


  1. Arguably, the opposite of “unifying”, at least during a transition period. ↩︎

  2. I am one of these developers, for the record. But when I care about it I just replace typing.py with a cheaper import. All types derived from it will be meaningless at runtime, but that’s fine for me - I’m not relying on any of them. ↩︎
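
A rough sketch of the kind of cheap stand-in meant in the second footnote, relying on PEP 562’s module-level __getattr__ (purely illustrative; anything that genuinely needs typing at runtime will break):

# typing.py - placed ahead of the stdlib on sys.path, shadowing the real module
TYPE_CHECKING = False

class _Anything:
    # one dummy object that absorbs subscripting, calls and attribute access
    def __getitem__(self, item):
        return self  # so Optional[int], Dict[str, int], ... all "work"
    def __call__(self, *args, **kwargs):
        return self
    def __getattr__(self, name):
        return self

_dummy = _Anything()

def __getattr__(name):  # PEP 562: serves every name typing would export
    return _dummy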

3 Likes

I think we do. I know I avoid using typing.TYPE_CHECKING because I take the view that type annotations should be zero-cost (or as near to it as is practical) at runtime. I wouldn’t tolerate a linter that caused a performance hit at runtime, and I view type annotations in precisely the same way.

Having said that, I acknowledge that many libraries freely import typing at runtime, and therefore trying to avoid the runtime cost of typing is pretty much a lost cause at this point. That’s a shame, but it’s where we are.

If we do acknowledge that the runtime cost of typing is inevitable, maybe what we should be doing is actually removing typing.TYPE_CHECKING (and the special case in type checkers for it), rather than trying to perpetuate the idea that it’s possible to avoid runtime costs? :slightly_frowning_face:

3 Likes

It’s not inevitable; there are fully typed libraries that never eagerly import it. Besides that, removing typing.TYPE_CHECKING would break all sorts of existing libraries for reasons unrelated to performance. The impact and churn would be catastrophic, with libraries having to restructure their code to work around type-only circular imports, or drop typing altogether.

6 Likes

Note that TYPE_CHECKING is not only needed because of the runtime cost of importing the typing module, nor is it needed only for imports in general. A deferred type import statement could cover most uses of TYPE_CHECKING, but it would not make it possible to remove it altogether. There is also one thing in particular that type import could not work for: importing TYPE_CHECKING itself, since that is the one thing in the typing module that is actually needed eagerly at runtime.

For import time there are different cases, but I think it is useful to think in terms of two extremes:

  • A simple CLI program that imports no libraries or only a few lightweight libraries.
  • A large library/application with thousands of modules.

In the simple CLI program case it is absolutely possible to avoid importing the typing module, especially if the author of the CLI program is also the author of the relevant libraries (maybe the CLI is an entry point coming from a library).

When there are thousands of modules, the number one thing to do for optimising startup time is to ensure that as few of those modules as possible are imported in order to perform some particular task. This needs TYPE_CHECKING to avoid the internal typing-only imports, regardless of the cost of importing the typing module itself. It also likely needs TYPE_CHECKING because of cyclic typing-only imports, and perhaps also because of optional dependencies.
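
For the optional-dependency case, a sketch (with pandas standing in for any optional extra):

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    import pandas as pd  # only type checkers import the optional extra

def summarise(frame: "pd.DataFrame") -> dict[str, float]:
    # this module imports fine even when pandas isn't installed;
    # only callers who actually build a DataFrame need it
    return {"rows": float(len(frame))}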

It is entirely possible to have a library with thousands of modules and still be able to import and use a function from that library without having to import all the thousands of modules. The library just needs to be designed so that it doesn’t import everything up front and organises its internal import graph reasonably.

Optimising the import graph is difficult to do retrospectively, but it is not difficult when designing a new library. The number one design choice that makes it hard to optimise later is providing functionality directly from the top-level package, like np.array, np.cos, np.linalg.det etc., since then all those things always have to be imported before anyone can use even the tiniest function from the library.

3 Likes

I’m a bit confused by this statement. What is the alternative you have in mind?
The only alternative I am aware of is to use indirect imports, like X.py imports np.array, and then Y.py imports X.array, but then Y.py still needs to import numpy. Just indirectly.

1 Like

A common pattern is to have a top-level __init__.py that imports practically everything from all submodules and packages so that people can do from pkg import foo rather than from pkg.bar.baz import foo. Nothing from the package can be imported without first executing the __init__.py, though, so this adds an import-time cost that must be paid by everyone, regardless of which part of the library they want to use.

A simple alternative is just that you rename numpy/__init__.py to numpy/everything.py and tell users:

  • Use import numpy.everything as np if you want a convenient way to import almost everything in the library.
  • Or otherwise use e.g. from numpy.fft import fft if you want a faster way to import a specific function.

This is, for example, what matplotlib does, although for different reasons:

import matplotlib.pyplot as plt

Having an __init__.py that imports loads of stuff from the subpackages and submodules makes it slower to import any one thing from the whole package. This applies recursively to every subpackage, but it is most important to avoid doing it in the top-level __init__.py, because then at least you reserve the possibility of optimising this in the future by refactoring everything into a new subpackage like numpy.api or something.
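
A sketch of what that layout could look like (pkg and its submodules are invented names):

# pkg/__init__.py - deliberately (almost) empty, so
# "from pkg.fft import fft" pays only for pkg.fft

# pkg/everything.py - the convenience namespace, imported only on request
from pkg.fft import fft, ifft
from pkg.linalg import det, inv

Users who want everything write import pkg.everything as pk; everyone else imports just the submodule they need.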

4 Likes

The decision about which functionality is available through the top-level namespace should primarily be driven by usability. Users should be able to do common tasks with only a single import. I would not want to burden users with longer or multiple imports by putting functionality into a separate namespace just because its type annotations are expensive to evaluate.

Anyway, you can’t simply change how everybody has to import functionality in existing libraries - neither import numpy as np nor import matplotlib.pyplot as plt can reasonably be changed. And just as a remark, the matplotlib import has specific historical reasons; we would not do it this way today if designing from scratch.

1 Like

You can have everything with a single import: import numpy.everything as np. You can come up with a shorter spelling for everything if you like, but it makes a big difference whether it is the top-level package or not.

There is not a single type of library “user”. It is easy to get carried away with the idea that users are novices who will use your library directly while typing commands interactively into a REPL. Optimising everything purely for that case is often in conflict with what would be best when providing functionality to other libraries or applications.

Agreed. I’m well aware that it is basically impossible to change this. That is why it is worth thinking carefully about this particular question at the beginning when creating a new library, though.

2 Likes