I only mention it because I did exactly this in one of my first internships, for a Java interpreter. That interpreter also had “an intricate graph of pointers”. The cache turned a two-minute startup time into a few seconds, so I have watched this technique work to great effect.
It would indeed be a better alternative to make typing imports instantaneous. Otherwise, having a __type_checking__ (or the other proposed name) would still be beneficial.
Heya, my first time commenting on a proposal.
I just have to ask, because I haven’t seen it in the “Rejected Ideas” section of the PEP. Nor have I seen it mentioned whilst quickly scrolling through the current discussion.
There have been various suggestions to add typing-only imports to Python (mainly through the new type keyword combined with import statements, à la TypeScript).
Here’s a single comment that links to multiple relevant topics: "import type" statement to replace typing.TYPE_CHECKING idiom and fix circular references - #2 by bschubert
Has such a proposal been considered as an alternative? Even if the reason for “rejecting the alternative” is simply that the proposal from PEP 781 is simpler and easier to land earlier, while the specifics of a lazy or type-only import syntax get hashed out (if ever), I think it’s worth at least acknowledging. (And the two can coexist in the future, even in the hypothetical case that TYPE_CHECKING/__type_checking__ becomes obsolete.)
In any case, thanks for this proposal, whatever form the constant takes in the end.
Agreed, we should make it instantaneous but the only true way to do that is by not importing it at all. There are various comments above that it would be better to make typing imports faster rather than avoid importing the typing module. These are not exclusive and it would be good to do both.
Many people like to use the runtime features of the typing module and will want to import it at runtime. I am sure that import typing can be made significantly faster, since there is no obvious reason why it should take 10ms. Bringing that time down will improve many Python programs.
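For anyone who wants to reproduce that kind of figure, here is a minimal, machine-dependent sketch (the ~10ms above is the number quoted in the thread, not from this snippet); python -X importtime -c "import typing" gives a more detailed per-module breakdown.

import time

# Rough check of a cold "import typing"; run it in a fresh interpreter,
# since a cached module makes the import nearly free.
start = time.perf_counter()
import typing  # noqa: F401
print(f"import typing took {(time.perf_counter() - start) * 1000:.1f} ms")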
On the other hand, in my own use of typing I do not want any runtime features from the typing module. It exists purely so that I can write x: Any etc. and then run a type checker separately from running the actual program. I don’t see why we should have to trade any measurable runtime overhead in the actual program in exchange for being able to use a static type checker, which by definition is not a runtime thing. No matter how much faster import typing is made, any nonzero time it takes will still be pure overhead at runtime that should ultimately be eliminated.
I don’t see why the goal should be anything other than zero cost static typing.
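For context, this is the idiom the thread keeps referring to: the guarded import is only seen by the type checker, so the annotation costs nothing when the program runs (the imported module here is just an example).

from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only the type checker "executes" this import; it never runs at runtime.
    from decimal import Decimal

def total(values: list[Decimal]) -> Decimal:
    # The __future__ import keeps annotations unevaluated, so Decimal does
    # not need to exist at runtime.
    ...

Note that the idiom itself still pays for from typing import TYPE_CHECKING, which is exactly the overhead being discussed.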
This PEP does nothing to block future work on making imports faster, enabling lazy type imports, and implementing other great features that we’ll hopefully see in the future. However, it does offer a very pragmatic solution to a real problem we have today. I’m strongly in favor of this. Hopefully, a day will come when if TYPE_CHECKING blocks are obsolete. When that happens, at worst, we’ll have a useless constant in builtins.
I didn’t put making imports faster in “Rejected Ideas” because the two are not mutually exclusive.
Python keeps type hints around at runtime, unlike TypeScript, so the runtime cost of type hints cannot be brought to zero by optimization alone. Programmers will need TYPE_CHECKING indefinitely to avoid the runtime cost of typing-only code; it won’t become obsolete.
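A minimal illustration of that difference: TypeScript erases annotations when it compiles, while Python evaluates and stores them (eagerly today, lazily under PEP 563/649, but either way they remain reachable at runtime).

def greet(name: str, times: int = 1) -> str:
    return ("Hello, " + name + "! ") * times

# Unlike TypeScript, the hints are still there when the program runs:
print(greet.__annotations__)
# e.g. {'name': <class 'str'>, 'times': <class 'int'>, 'return': <class 'str'>}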
Once importing typing takes only ~1ms, this PEP cannot make startup noticeably faster compared to from typing import TYPE_CHECKING.
But this PEP still solves the “key in the box” problem: today the constant for skipping typing-only code lives inside the very typing module that you may be trying to avoid importing. So we don’t need to worry about a totally unnecessary constant being left behind in builtins.
Lazy/type imports are not a competitor to this PEP. Lazy/type imports would replace some use cases for if TYPE_CHECKING:, but not all of them. And if TYPE_CHECKING: itself needs TYPE_CHECKING; lazy/type imports don’t help with that at all:
from typing (lazy|type) import TYPE_CHECKING

if TYPE_CHECKING: # !!!
    from typing import overload

    @overload
    def f(...): ...
    @overload
    def f(...): ...
    @overload
    def f(...): ...
    ... # heavy typed annotations...

def f(...):
    ...
I basically agree with you, but I think there’s a less obvious dimension to this that’s worth pointing out. In a world or language where it’s obvious that reducing startup time by 10ms is worth doing, it would also be obvious that a change that increases startup time by 10ms is much less likely to be accepted, and in many cases may be a nonstarter. That in turn means that even a change that increases startup time by 1ms must be subject to much greater scrutiny, because otherwise an accumulation of ten such changes can raise startup time by 10ms and cancel out the (hypothetical) effort to reduce startup time by 10ms. Extending this logic, although presumably there is some low increment that would be readily accepted, in general any increase in startup time would face strong headwinds.
In other words, if we want a “fast language”, then it’s not enough to think only about changes specifically intended to reduce startup time. We would need to adopt an explicit policy requiring all changes to be vetted more strictly from a performance perspective. Based on various PEPs and proposals I’ve seen, this is something that comes up, but my impression is the level of stringency that’s applied is not commensurate with the mindset of “a 10ms reduction in startup time is worth doing”. For instance, I would be surprised to see a discussion on this forum in which a serious objection to a proposed change was that it would increase startup time by 1ms.
In my mind this is one reason why it’s difficult to argue for such performance benefits in individual cases like this. It can seem futile to fight for a 10ms savings if an unrelated change (or a few changes) may wipe it out without much consideration. It is sort of like trying to remind your family members to turn out the lights to save a few bucks on the electric bill. A savings of $10 may be worthwhile, but it’s hard for people to see it as worth worrying about if you’re likely to splurge on a $100 dinner every couple weeks.
It may be worth taking performance improvements seriously, but if so, it has to also be worth taking performance declines just as seriously, and subjecting all proposed changes to a consistent level of performance-based scrutiny, not just ones that are proposed specifically for performance reasons. That would require a more global shift in perspective and decision-making process, which may be a bigger step than some people want to take.
I would like to see attitudes change on these things. In particular, I think that startup overheads are important and are often under-emphasised in typical benchmark measurements. Personally I don’t use dataclasses, for example, because it adds 1ms of import time per class. I don’t see why dataclasses was designed in such a way that those overheads are required. Lots of people are happy to use dataclasses though, so I guess they don’t care about that, or maybe they just don’t measure anything.
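A rough way to check that per-class figure yourself, using timeit (the exact numbers depend on the machine and Python version):

import timeit

plain = """
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
"""

decorated = """
@dataclass
class Point:
    x: int
    y: int
"""

# Each statement is the work a module pays once per class at import time.
n = 1000
print("plain class:", timeit.timeit(plain, number=n) / n)
print("@dataclass :", timeit.timeit(decorated,
                                    setup="from dataclasses import dataclass",
                                    number=n) / n)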
Someone suggested that SymPy could use dataclasses, but there are about 1000 classes, so it would slow import time by about 1 second (currently it is about 400ms on this slow computer, which is already too slow, but reducing that is hard without breaking compatibility). If you want to see a real example of how that looks for a somewhat similar package then try libcst. It takes 295ms to import libcst on this machine, and running under a profiler shows that this is because it has 200 dataclasses (one for each expression type, much like sympy). I don’t think any potential advantages of dataclasses are worth that very real and noticeable overhead, but apparently the libcst developers don’t care.
The difference between dataclasses and typing though is that if I don’t want to use dataclasses then I just don’t use it. Or if I want to use a faster alternative instead then I can. As I said above the typing module in particular is special because the type checkers don’t let you go make your own faster version. The other thing that is special about static typing in general is that it is obvious to all that there does not need to be any actual runtime overhead because the type annotations are all just there so that you can run a type checker and the type checker does not actually execute any of the code.
It seems to me that you are confusing import time and runtime overhead. It doesn’t matter how many dataclasses are being defined; importing the dataclasses module itself always takes the same time the first time (~12ms on my end). This doesn’t mean we shouldn’t care about the runtime overhead of defining a dataclass, but these are two different concepts.
I believe Oscar meant the import time of a module using several dataclasses, as class bodies (and class decorators) are executed on module load, at import time.
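In other words, the cost is paid while the importing module’s top level executes, not when instances are created. A tiny two-file sketch (the module names are made up):

# slowmod.py -- everything below runs when the module is imported
print("executing slowmod's top level")

class Widget:
    print("executing Widget's class body")  # also runs at import time

# main.py
import slowmod  # both messages print here, before any Widget() exists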
You seem to misunderstand:
LibCST is written in Rust, therefore it is fast (by definition). The import time has nothing to do with that, nor does any other measure of runtime cost or performance benchmark.
Is that an attempt at a joke, or are you saying this seriously?
The import time discussion is drifting off a little. To get the discussion back on topic, I will try to summarize my view of the key arguments for the PEP:
We need some variation of a TYPE_CHECKING flag for the foreseeable future, because it allows us to skip imports of any additional libraries that are only imported for typing purposes and to resolve possible import cycles.
The question is: is typing.TYPE_CHECKING good enough, or do we need something better? Possible improvements are:
- make this available without an import:
  - prevents the 10ms import cost of typing in some cases (though in practice typing will often still be imported for other reasons)
  - is it a semantic advantage that this is a builtin rather than imported from a module?
- reconsider naming: is __type_checking__ a better name than TYPE_CHECKING? I.e. does the dunder better convey the special semantics compared to a constant?
These possible improvements have to be weighed against the cost of change. Note: to me, the semantic arguments are stronger than the import-time argument, but I’m still not sure whether it’s worth it.
Consider me +1 for getting the type checking flag out of the typing module.
I haven’t seen this in the discussions to date: would it make sense to put the TYPE_CHECKING constant in another module, such as sys?
I’m suggesting sys because AFAIK it is essentially guaranteed to already be in memory, and one could plausibly consider whether the interpreter is type checking to be a property of the system. Other modules may be similarly appropriate.
It’s just a suggestion, but I wanted to at least raise it for consideration. (My apologies if this has been raised before and I missed it.)
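Purely as a sketch of what that would look like at a call site (sys has no such attribute today, so the getattr fallback is only there to keep the example importable):

import sys

# Hypothetical spelling; TYPE_CHECKING does not currently live in sys.
if getattr(sys, "TYPE_CHECKING", False):
    from collections.abc import Sequence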
I think so. I have updated the PEP about it:
Future optimizations may eliminate the need to avoid importing the typing module for startup time. Even with such optimizations, there will still be use cases where minimizing imports is beneficial, such as running Python on embedded systems or in browsers. Therefore, defining a constant for skipping type-checking-only code outside the typing module remains valuable.
I am waiting for the vote to finish.
We never discussed using the sys module.
Users can use a built-in constant more easily, but I think using the sys module is also a good idea to avoid increasing the number of built-in names. I don’t know which is better.
I initially proposed the unassignable built-in __type_checking__ to remove typing-only code, like __debug__ removes debug-only code.
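For readers who haven’t seen that mechanism: the compiler treats __debug__ as a constant, so an if __debug__: block is folded in under a normal interpreter and dropped from the bytecode entirely under python -O. The idea was that an unassignable __type_checking__ could be folded the same way. A quick way to see the existing behaviour:

import dis

def audit():
    if __debug__:
        print("debug-only work")
    return 1

# Default interpreter: no runtime test, the block is simply kept.
# python -O: the block disappears from the compiled bytecode.
dis.dis(audit)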
But I can’t demonstrate the effect of code removal with a simple experiment using SQLAlchemy.
I thought the benefit of being able to write if TYPE_CHECKING: as before was significant, so I discarded the idea of code removal along with the name __type_checking__.
However, it seems that few people support TYPE_CHECKING. The benefit of writing if TYPE_CHECKING: as before may be smaller than I thought.
If we don’t need to use if TYPE_CHECKING: as-is, I will reconsider the idea of code removal.
Since SQLAlchemy uses from __future__ import annotations a lot, the effect of code removal may increase when it is completely migrated to PEP 649. (PEP 649 generates a large number of code objects and function objects for lazy evaluation.)
FWIW I voted for __type_checking__ (with the implicit understanding that the fully-qualified name of this is builtins.__type_checking__) because currently the different type checkers behave differently in their recognition of typing.TYPE_CHECKING, so re-using the name TYPE_CHECKING may actually cause more behavioural confusion in a few circumstances.
Some inconsistent behaviour from different type-checkers currently:
- mypy & pyright don’t recognise this as the
TYPE_CHECKING
flag:from typing import TYPE_CHECKING as _TYPE_CHECKING
- pyre1 doesn’t recognise this as the
TYPE_CHECKING
flag:import typing as _t; _t.TYPE_CHECKING
Here’s the snippet I’m using to test on the various type checkers; IMO only pytype gives completely expected behaviour, if you think of typing.TYPE_CHECKING as just another module attribute which type checkers see as having type typing.Literal[True].
from typing import TYPE_CHECKING as _TYPE_CHECKING
import typing as _t

if _TYPE_CHECKING:
    a: int = []
else:
    b: str = []

if _t.TYPE_CHECKING:
    c: int = []
else:
    d: str = []

CHECKING: _t.Final = True
if CHECKING:
    e: int = []
else:
    f: str = []
Whatever the name chosen, it’d be great if the PEP additionally clarifies that either:
1. builtins.{TYPE_CHECKING, __type_checking__} and any import aliases of it are expected to be treated as True by type checkers, or
2. due to implementation details of the various type checkers, (1) is not possible, and the only guaranteed recognition of the flag is the name TYPE_CHECKING/__type_checking__.
I just noticed that I haven’t answered this. Reusing the name TYPE_CHECKING for a builtin means that from typing import TYPE_CHECKING – which is widely used – will start shadowing a builtin in a future Python version. It also means that the definition in the actual typing.py module will now shadow a builtin. While this is probably not a huge deal in practice, it means that linters which flag shadowing of builtins will need to special-case this, otherwise users would be presented with a lot of new warnings that are hard to work around.