We could raise an error when you try to use it, right?
I’m curious what the behavior you’re worried about is here such that an “error-on-reification-if-it-was-lazy” would help. In general, we think that a particular module itself isn’t ever “incompatible” with lazy. Consider that, today, someone using your library might have already put all imports of your library inline or otherwise done unusual things with the import ordering (or even used LazyLoader!), and you don’t currently get a say over that, as the module being imported.
The issues that we observe in practice typically are the “reverse,” where a module making an import statement has particular semantics/ordering in mind about the context in which that import will evaluate. That usually only shows up if you’ve globally enabled lazy imports, which (as said here) is not the use case we’re targeting, and the existing and proposed solutions would address that.
I was asked at the core dev sprints to comment due to spending way too much time thinking about this as the author of importlib and importlib.util.LazyLoader. Basically I like it and support it!
I honestly only have a couple of comments about flow in the PEP and one misunderstanding.
it is useful to provide a mechanism to activate or deactivate lazy imports at a global level.
This is the first mention of the global on/off mechanism, and it’s after emphasizing how lazy import is local. It probably could either use more details or just a “more later” comment.
will check the global lazy imports flag
First mention of the flag with no context.
the lazy imports filter
First mention of the filter with no context.
ImportError: deferred import
This is a terminology flip from “lazy” to “deferred”. For consistency it’s probably best to use the same term as the keyword in the exception message (I leave it to the PEP authors which term to choose).
No ongoing performance penalty unlike importlib.util.LazyLoader.
This assertion is made twice in the PEP and it’s actually incorrect. LazyLoader replaces __class__ for the module, and on first attribute access __class__ is reset to ModuleType (or whatever type loader.create_module() returns), so __getattribute__ goes back to the default and there is no perpetual performance penalty.
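To make the class swap concrete, here is a minimal sketch using the lazy-import recipe from the importlib documentation. The throwaway module name lazy_class_demo and the temp directory are assumptions made only for this demo, not anything from the PEP.

```python
import importlib
import importlib.util
import pathlib
import sys
import tempfile
import types


def lazy_import(name):
    """Lazy-import recipe from the importlib documentation."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # installs the lazy class; does not run the body
    return module


# Hypothetical throwaway module, so the demo cannot collide with a real import.
tmp_dir = tempfile.mkdtemp()
pathlib.Path(tmp_dir, "lazy_class_demo.py").write_text("VALUE = 42\n")
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

mod = lazy_import("lazy_class_demo")
class_before = type(mod)  # a ModuleType subclass with a custom __getattribute__
value = mod.VALUE         # first attribute access executes the module body
class_after = type(mod)   # the plain module type again
```

After the first access, attribute lookups go through the ordinary module type, which is why there is no per-access cost afterwards.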
The key benefits of this PEP over LazyLoader are:
It doesn’t require wrapping import machinery to make it work.
It makes finding the module lazy (LazyLoader is eager in finding the module but lazy for loading).
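The "eager in finding" half can be seen with importlib.util.find_spec alone, which is the step the LazyLoader recipe runs up front. This is only a sketch; the module names eager_find_demo and eager_find_demo_does_not_exist are made up for the demo.

```python
import importlib
import importlib.util
import pathlib
import sys
import tempfile

# Hypothetical module whose body would blow up if it were ever executed.
tmp_dir = tempfile.mkdtemp()
pathlib.Path(tmp_dir, "eager_find_demo.py").write_text(
    "raise RuntimeError('module body ran')\n"
)
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

# Finding touches the filesystem immediately but does not run the module.
found = importlib.util.find_spec("eager_find_demo")

# A missing top-level module is reported right away as None.
missing = importlib.util.find_spec("eager_find_demo_does_not_exist")
```

So with LazyLoader the filesystem search is paid at import time, and only the execution of the module body is deferred.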
It would simplify so many things, I see it would help things for users of all my projects. And I like that it explicitly accounts for if TYPE_CHECKING and PEP 649.
I’d like to thank Pablo and the other PEP authors for writing and shepherding through this proposal. As mentioned in the PEP, the scientific Python ecosystem is a prominent example of lazy loading for libraries, as seen with our lazy_loader library which is fairly widely used (hat-tip to @effigies and others who helped maintain it).
Our goal was to enable lazy loading without downstream users being any the wiser. Mostly, that worked fine; we supported the standard import syntax and type checking (by parsing imports from .pyi files), and exposed lazy modules by using module-level __getattr__ in combination with LazyLoader (a strategy outlined in a 2018 blog post by Brett). However, I was always nervous about the fragility of the approach.
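For readers who haven't seen the module-level __getattr__ trick, here is a rough sketch of the idea. It is not lazy_loader's actual implementation: it uses a plain importlib.import_module call instead of LazyLoader, and demo_pkg, json_mod and _LAZY_SUBMODULES are names invented for the example.

```python
import importlib
import types

# Stand-in for a package's __init__ module (hypothetical name).
pkg = types.ModuleType("demo_pkg")

# Attribute name exposed on the package -> real module to import on demand.
_LAZY_SUBMODULES = {"json_mod": "json"}


def _module_getattr(name):
    """Module-level __getattr__ (PEP 562): called only when normal lookup fails."""
    if name in _LAZY_SUBMODULES:
        module = importlib.import_module(_LAZY_SUBMODULES[name])
        setattr(pkg, name, module)  # cache so later lookups skip this hook
        return module
    raise AttributeError(f"module {pkg.__name__!r} has no attribute {name!r}")


pkg.__getattr__ = _module_getattr

present_before = "json_mod" in vars(pkg)
encoded = pkg.json_mod.dumps([1])  # triggers the import on first access
present_after = "json_mod" in vars(pkg)
```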
For example, under very specific circumstances doctests fail with lazy loading enabled. And at some point we ran into a race condition, which had to be addressed with a lock.
Communicating with Brett in 2021 (thank you for your helpful advice, Brett!), the conclusion at the time was that this was unlikely to ever become a standard language feature; and yet, here we are. Now we can figure out how to retire lazy_loader and rely on a standard mechanism, while still having our libraries load fast. I am very appreciative of everyone who made this possible!
First, I’d like to thank everyone who worked on this PEP.
EDIT 2: moved this suggestion into its own thread.
I would like to chime in with the idea of using a different keyword for type-only imports, as they will never be evaluated (reified?). This should make it clearer for readers to understand that it is an import that only exists for type checking reasons.
For example:
from typing import type Dict
I know there is the type keyword already, so maybe another word could be used in place.
I’m unsure if commenting here is the best option to suggest this, or if this should be in its own PEP, in which case I would probably not follow through with it, as I don’t see any performance changes (maybe very small since the type imports would never be evaluated, but that’s probably irrelevant).
Again, thanks to everyone involved in this PEP! Looking forward to this being introduced into Python!
EDIT: I had previously sent an email to Pablo, since I was unsure about where to send suggestions regarding PEPs, and he replied with some very good points to have this as another PEP instead of using this one. I will for sure read PEP 1 and 12 to understand how I can get started with this. Amazing work on this one!
Shame about the decision to exclude lazy imports from try/except blocks. I understand the other scope-based scenarios, but I have found try/except on import errors useful for handling code dependency migration in some cases, such as import locations changing in Apache Airflow code when updating versions and still supporting old versions for a period of time in user code.
Maybe this could still be done in a similarly simple manner with the new lazy syntax, but I haven’t thought of how yet. Otherwise, a very exciting proposal.
I love lazy imports, but I don’t like the lazy keyword. Here is why: Everyone will use it, because it either results in better performance or will have no effect. So, every Python code will from now on use that more verbose syntax, unless they really need an immediate import, which is less often needed.
Therefore, better would have been:
A transition period, from __future__ import lazy_import.
After that, import X and from X import Y becomes lazy by default.
For the cases where an immediate import is needed, the more verbose syntax: import now X and from X import now Y.
I’d hope anyone new to Python wouldn’t be dealing with lazy imports! JavaScript will be familiar to some, and it has a simple approach: import used as a statement, like current Python, is synchronous, and import used as a function is asynchronous (aka lazy).
import eager
import("lazy")
The point of the await was a place to explicitly get errors from a failed import. The PEP currently only shows how to get an exception from accessing a member which you have to know is present. It can get very confusing telling the difference between submodules and names for more complicated packages, especially across versions over time.
I’m guessing simply making all imports lazy (no new soft keyword) leads to breakage, or at least unexpected behavior? I only skimmed the rejected ideas, but didn’t see it as something you tried. I do see you reasoned about it.
An alternative with no new keyword would be a __future__ import. Not as granular as the current proposal, but still controlled within the module.
try:
    from foo import bar
except ImportError:
    from baz import bar
If this becomes implicitly lazy, then we only have two options as far as I know, neither very nice.
One is that we ignore the error handling because the lazy import didn’t error inside of the try block, but that clearly is breaking.
The other is that when the import is forced to evaluate, we execute from within the try block, which makes imports act like “traps” which can force any use of an imported name to jump into some distant suite. This option might not break any realistic code today, but those semantics sound pretty yucky to me (not to mention hard to implement).
+1 on the feature and PEP, badly needed, the only thing I miss is having a way to say that in a given module, all imports should be lazy without requiring the lazy keyword.
– maybe __lazy_modules__ = ['*'] could be supported?
For that to be meaningful, the lazy keyword would need to work the same way LazyLoader does (eagerly ensuring the module spec can be found, deferring the actual execution of the module code). If the lazy imports worked that way, the eagerly looked up __spec__ could be made available on the lazy placeholder objects so it didn’t need to be looked up again when the import was reified.
Using those semantics would also potentially open the door to allowing enabling lazy imports globally to affect imports inside try/except statements and context managers, since “import attempt as existence check” would still work. (It’s only “potentially”, as the handling of exceptions raised as a side effect of running the module’s code would still no longer work as written).
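The caveat in the parenthetical can be reproduced today with LazyLoader, which already has the "find eagerly, execute later" semantics. This is just an illustration with a made-up module (fails_on_exec_demo) whose body raises ImportError, the way a module with a missing optional dependency might.

```python
import importlib
import importlib.util
import pathlib
import sys
import tempfile

# Hypothetical module that can be found but fails while executing.
tmp_dir = tempfile.mkdtemp()
pathlib.Path(tmp_dir, "fails_on_exec_demo.py").write_text(
    "raise ImportError('optional dependency unavailable')\n"
)
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

caught_in_try = False
try:
    # The spec lookup succeeds here, so the "existence check" passes...
    spec = importlib.util.find_spec("fails_on_exec_demo")
    spec.loader = importlib.util.LazyLoader(spec.loader)
    mod = importlib.util.module_from_spec(spec)
    sys.modules[spec.name] = mod
    spec.loader.exec_module(mod)  # defers the module body
except ImportError:
    caught_in_try = True  # never reached: nothing has executed yet

raised_on_access = False
try:
    mod.anything  # first attribute access finally runs the module body
except ImportError:
    raised_on_access = True
```

The except clause written around the import never fires; the error surfaces at whatever line first touches the module.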
While I don’t really mind if the PEP sticks with its current semantics, I do believe it would benefit from a “Why not eagerly look up the import specifications for lazily loaded modules?” section under “Rejected Ideas”.
Thanks for making the PEP. This looks much improved compared to the previous attempt at a lazy import system with PEP 690.
I ended up rolling my own lazy importer for use with my own CLI tools and it would be nice to be able to replace this. I’ve measured this making short-lived CLI tools run in half the time for some tasks so having this as a built-in feature is definitely worthwhile.
A few questions:
Was it ever considered to use a context manager instead of new syntax?
The previous PEP proposed adding an eager_imports manager to importlib as an opt-out so I was wondering if a lazy_imports manager was ever considered as the mechanism for an opt-in instead of new syntax.
You briefly mention considering using a subclass of dict to enable lazy imports, but with the mention of calling __getitem__ directly I’m a little confused as to how it was considered.
My expectation would have been that a __missing__ method could be used to check if an absent name was a lazy import, which would be supported by __getitem__, is there a reason this is not the case?
I understand why the try/except syntax can’t be supported directly by syntax, but is it possible to have some other way of supporting the concept by providing a way to embed the try/except logic inside the proxy object that would execute on reification?
Alternatively, for the case of try/except as existence check, could there be a helper “module_exists” function that can be used to convert from a try/except into an if/else that can be used with the lazy import form?
It’s possible to globally opt-out with -X lazy_imports="disabled", presumably this will cause a module import to fail if a lazy import has been used to hide a circular import?
Might it be necessary to add a filter to opt-in for specific modules in the same way modules can opt-out of being lazily imported under -X lazy_imports="enabled"?
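On the “module_exists” question, something along these lines can already be written on top of importlib.util.find_spec. To be clear, module_exists is a hypothetical helper name taken from the question above, not something the PEP or the standard library provides, and this is only a sketch of how it might look.

```python
import importlib.util


def module_exists(name: str) -> bool:
    """Return True if `name` can be found, without executing that module.

    Caveat: for dotted names, find_spec imports the parent packages.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A missing parent package raises ModuleNotFoundError; a module in
        # sys.modules with __spec__ set to None raises ValueError.
        return False


# The try/except fallback becomes an if/else on names:
chosen = "json" if module_exists("json") else "some_fallback_json"
```

Note that this only covers "does the module exist"; it says nothing about whether a particular attribute in a from-import is present.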
Specifically, cheap module existence checking. I think the fact that it couldn’t work for attributes in FromImports would make it too subtle, especially for newer users.
But I agree that it is worth recording why the authors rejected those semantics. Their reason might be different (e.g., the pure performance question).
I’ve added a mention of this to the rejected ideas section. Adding eager would be confusing unless we were actively working to change the default behaviour. We aren’t, and certainly as part of this PEP we won’t be.
I’ve removed the assertions about performance.
Yes, it should. The default doesn’t change. This PEP does not anticipate the default will change. PEP 690 was the closest thing to a proposal that might lead to changing the default, and it was already rejected.
Making all imports lazy is a big change. There’s a reason it’s a knob the PEP provides but discourages. It’s a huge breaking change. There are too many modules that just won’t work right, or will work right only in specific circumstances (like something already imported one of its dependencies.) It may be doable in ten or fifteen years, after we’ve shown that lazy modules work great, and we’ve developed new patterns of module behaviour to replace the existing ones that rely on eager imports. Me, I’m not holding my breath.
The closest proposal to that would be PEP 690. It was rejected in part because it made all imports lazy. (It’s mentioned in this PEP.)
That wouldn’t fix the issues with the existing context-manager based solutions. A hybrid where the syntax this PEP used is based on with would open a whole new world, since with has never been semantically important to the compiled code, which is quite a big lift. I’ve added it to rejected ideas, but if you want to propose a competing PEP with that syntax, feel free.
The names are there, we just need to do something extra if they’re of a specific type. Or we could store them somewhere else, but then we would need to do multiple lookups and keep the dicts in sync. And either way, it still requires a dict subclass, which is problematic for all the reasons mentioned in the PEP.
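A toy illustration of why __missing__ doesn't help when the name is already in the dict as a placeholder. LazyProxy and ReifyingDict are invented for this sketch and have nothing to do with the PEP's actual implementation.

```python
class LazyProxy:
    """Hypothetical placeholder stored under a lazily imported name."""

    def __init__(self, factory):
        self.factory = factory


class ReifyingDict(dict):
    missing_calls = 0

    def __missing__(self, key):
        # Only consulted when the key is absent, which a stored proxy is not.
        type(self).missing_calls += 1
        raise KeyError(key)

    def __getitem__(self, key):
        value = super().__getitem__(key)
        if isinstance(value, LazyProxy):  # the "something extra"
            value = value.factory()
            self[key] = value
        return value


ns = ReifyingDict(answer=LazyProxy(lambda: 42))
via_get = ns.get("answer")   # dict.get does not go through __getitem__
via_getitem = ns["answer"]   # subscript access reifies the proxy
```

Any access path that skips the overridden __getitem__, like dict.get here, still sees the raw placeholder.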
No, the filter intentionally is only called for potentially lazy imports. It’s meant as a safety measure, not an inversion of responsibility. And yes, it is intentional that disabling lazy imports globally would expose import cycles that only work with lazy imports. The advice for import cycles remains pretty much the same, even with lazy imports: refactor the code so you don’t have an unresolvable cycle.
The “refactor the code” argument doesn’t work if lazy imports are used where typing-only imports would otherwise result in import cycles. One of the major non-performance related benefits of the PEP is that lazy imports can be used in place of if TYPE_CHECKING for this. Any time that happens though it will be broken by anyone using -X lazy_imports="disabled" which applies globally to every module in every library in the process.
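For reference, this is the kind of if TYPE_CHECKING guard that a lazy import could stand in for. It's a minimal sketch using decimal as a stand-in for a module that would otherwise create a cycle; the function name double is made up.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by type checkers only; never executed at runtime, so it cannot
    # take part in an import cycle.
    from decimal import Decimal


def double(x: "Decimal") -> "Decimal":
    # Annotations are written as strings because the name Decimal does not
    # exist at runtime.
    return x + x


result = double(2)
annotations = double.__annotations__
```

If the guard were replaced by a lazy import, the module would only stay cycle-free for as long as the import actually remains lazy, which is the concern with a process-wide disable switch.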
Other than GIL optionality, I’ve never been more excited for a programming language proposal! What a joyous Saturday this is!
This will directly benefit me in a few ways:
I maintain several popular CLIs, both open-source ones and at work, that import almost everything lazily in order to maintain the approximate responsiveness users expect from command line tools written in compiled languages. I have to block merges based on timing regressions (in part due to insufficient static analysis) and maintain a cultural awareness of these hacks. As an easy illustration of what happens when one does not do this, check the time it takes to see the version of any of the CLIs (all written in Python) for the 3 major cloud providers: hyperfine -m 10 --warmup 1 "(aws|az|gcloud) --version"
At work, for years I used to maintain the repository containing most of the integrations that get shipped with the Datadog Agent. The structure is such that every integration is part of a namespaced package (although not a requirement) with a central package depended upon by all of them which contains the required base classes and other utilities. We support hundreds of integrations and the vast majority must be explicitly enabled by customers rather than enabled by default. This requires the base package to make heavy use of manual lazy imports and the more recently introduced lazy-loader library from the scientific community as mentioned above. If we don’t do this, then memory balloons by having to import everything for features that aren’t even enabled. Fun fact: importing the official Kubernetes client has a fixed memory overhead of ~40 MB because they offer most functionality at the root by importing everything immediately rather than lazily.
Personally, I use serverless functions for a few things. In order for that (quite popular) scenario to have fast response times and lower costs one must either separate logic into separate units/APIs that have static requirements or use lazy imports conditionally when calls trigger those code paths.
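The "lazy imports conditionally" workaround in that last case usually looks something like the sketch below: imports are pushed into the branch that needs them. The handle function and the event shape are invented for the example, and the stdlib modules stand in for genuinely heavy dependencies.

```python
def handle(event: dict) -> str:
    """Hypothetical serverless handler with function-level imports."""
    if event.get("format") == "csv":
        # Only requests that take this path pay for these imports.
        import csv
        import io

        buffer = io.StringIO()
        csv.writer(buffer).writerow(event["row"])
        return buffer.getvalue()

    import json

    return json.dumps(event["row"])


json_body = handle({"format": "json", "row": [1, "a"]})
csv_body = handle({"format": "csv", "row": [1, "a"]})
```

It works, but the imports end up scattered through function bodies, which is part of what a standard lazy import mechanism would tidy up.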
This is a fantastic point and something I would strongly recommend considering. If it doesn’t work this way then users would still require some hacks to avoid the overhead of checking installed packages.