No, because the default value is shown in the --help output:
I would like to see the work that has been done.
Specific repository forks I can check out and run with the forked interpreter.
Well, function parameter defaults are always evaluated before the function is called (at definition time). That means this wouldn’t work for any kind of lazy import system.
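A quick demonstration of the definition-time evaluation that makes this a non-starter for lazy imports:

```python
import time

# The default is evaluated once, when the function is *defined*,
# not each time it is called.
def stamp(t=time.time()):
    return t

first = stamp()
time.sleep(0.05)
second = stamp()

# Both calls see the same captured value.
assert first == second
```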
That was exactly my point.
Lazy imports are easy to understand… but then adapting an existing codebase to actually benefit from lazy imports often turns out to be quite an involved task.
Other ecosystems took other approaches:
- tree shaking and “compile-time” module splitting in js/webpack/rollup/esbuild
- compiled languages w/ `-ffunction-sections -fdata-sections`
- rust, where feature flags remove entire subgraphs
- android/.net/ios have their own hacks for fast app startup / delayed loading
I think there’s quite a bit to learn from these.
Or work to make Python “faster” overall, e.g. spread module parsing and evaluation across hardware cores, the existing import syntax providing clear and deterministic granularity to do so.
Well, I thought about splitting tasks across cores to make Python faster before, but even as it stands, doing everything we could, Python is still interpreted, and therefore ‘slow’. There will always be limits (and new possibilities too).
If someone really wants to make, say, a fast CLI, they would know when to use Python (or rather: when not to use it).
There are many ways to make code fast, and there are a billion possibilities for optimizations of Python.
This lazy import syntax is just the tip of an iceberg, and the basis for future work that might make imports faster. I don’t think it’s possible to address everything in one PEP, but I’d love to see your ideas in a PEP once this one has been finished.
Why is it not sufficient to look at the listed projects that have rolled their own implementation of lazy imports? You could peruse their history and even review the discussions of those changes as they happened. You are asking for a lot of work to be done to satisfy your curiosity.
The following questions come to mind regarding the filter function.
- Is it guaranteed to be called in a thread-safe manner? Being a process-global function (since it’s currently specified to be set/get by a `sys` function, but more on that below), are there any special considerations that are needed, either from the interpreter or from the author of the filter function? The section on thread-safety doesn’t address this directly, but I assume at least the intent is that yes, it is thread safe. That section could add an explicit guarantee.
What does “calling in a thread-safe manner” mean? Do you just mean “it’s guaranteed not to be called from multiple threads”, or “concurrently from multiple threads”, or “concurrently from multiple threads for the same module”? It’s not, for any of these. There’s no locking. It’s just a function that’s called as part of the import mechanism, just like many other functions. An implementation that does thread-unsafe things has to perform the relevant synchronisation itself. I’m not sure it’s ever going to matter, really. I mean, the simple case is a dict or set lookup, which is not a problem. If you need to do a lookup in a container that doesn’t do thread-safe lookups, or if you need to do complex shared state manipulations in the filter function, you’ll have to lock accordingly. None of this is specific to this proposal, and I think trying to spell it out every time sets a bad precedent and a bad example.
“Why you choose lazy as the keyword name” isn’t grammatically correct. Perhaps it’s missing a “did”?
… but you’re okay with the answer?!
(FWIW, it’s a meme Pablo likes.)
I get that `module.__dict__` will reify any lazy imports, but I’m somewhat concerned about the readability of seeing bare `module.__dict__` in code. Would it not be better to add a `.reify()` method to module objects to make that explicit? It could no-op if it’s already reified.
The mechanism isn’t being advertised as the way to reify all values in a module, it’s just something that’s done to keep existing code behaving as expected. Perhaps we should have an explicit way to reify everything in a module, it hasn’t really come up yet.
`__lazy_modules__` → `__lazy_imports__` to more closely mirror the `lazy import` syntax and `-X` flag.
I think __lazy_imports__ is too close to __lazy_import__, and also doesn’t convey that it lists modules.
Is it reasonable for a user of a library to expect that the library functions under `-X lazy_imports="disabled"`?
(Replying to this one as a stand-in about the suggestion to allow the lazy imports filter to bypass the global disable.)
I think no, the library not working should be fine. The global disable isn’t meant to be a general purpose tool with common use. It’s there for the convenience of people who want to test their own modules with lazy imports disabled. I also think it’s fine for that disable to not be an option for libraries that don’t want it to be an option, and I don’t think it’s a good idea to have a complex hierarchy of enables/disables like a game of top trumps. We may end up needing this but the exact shape is unclear until we have real-world experience with this. I’ve added it to rejected ideas for now (with a slightly expanded rationale) in PEP 810: Update rejected ideas. by Yhg1s · Pull Request #4634 · python/peps · GitHub.
It does highlight a semantic detail though: do lazy imports capture the value of `__import__` at the time the statement is executed (presumably on the proxy object), or do they delay looking up `__import__` until reification occurs?
There are a couple comments about this issue. It is the latter; our intent is that __import__ resolution happens when reification occurs. We don’t capture the state of the import machinery at the lazy import statement. The PEP says
Reification imports the module in the same way as it would have been if it had been imported eagerly, barring intervening changes to the import system (e.g. to `sys.path`, `sys.meta_path`, `sys.path_hooks` or `__import__`).
Do you think it would help to clarify this in the PEP?
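To see why the timing matters: even eager imports consult the current `builtins.__import__` at the moment the statement executes, and the intent described above extends that to the moment of reification. A small sketch on today’s Python (no PEP 810 needed) showing that the hook is looked up at execution time:

```python
import builtins

calls = []
original_import = builtins.__import__

def tracing_import(name, *args, **kwargs):
    # Record every module name that goes through the import hook.
    calls.append(name)
    return original_import(name, *args, **kwargs)

builtins.__import__ = tracing_import
try:
    import json  # resolved through the *current* __import__ hook
finally:
    builtins.__import__ = original_import

assert "json" in calls  # the hook was consulted at execution time
```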
It’s a little confusing because it’s saying “the import will be the same…except for all the ways it can be different at that point”. I’m not sure what remains after the cases listed in the parenthetical statement.
It might be clearer to switch it around to “lazy modules are imported using the import system at the time of reification” and if there are any exceptions to that statement list what they are.
It might be clearer to switch it around to “lazy modules are imported using the import system at the time of reification” and if there are any exceptions to that statement list what they are.
I agree this is the best way to explain it.
There’s enough mutable state in the import system that I think the only two viable lazy loading options are "look up the module eagerly without executing it (the LazyLoader approach, which the PEP has good reason not to use), and “don’t eagerly look up anything, not even the value of the __import__ hook” (which is the PEP’s approach, albeit not expressed entirely clearly yet).
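For contrast, the `LazyLoader` approach mentioned above can be sketched roughly as in the `importlib` docs recipe: the spec lookup happens eagerly (a missing module fails at the `lazy_import()` call), while module execution is deferred until first attribute access.

```python
import importlib.util
import sys

def lazy_import(name):
    # Eager lookup: raises ModuleNotFoundError *now* if the module
    # can't be found, unlike the PEP's fully lazy approach.
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # execution is deferred, not performed
    return module

mod = lazy_import("difflib")  # found, but its body hasn't run yet
# First attribute access triggers the actual module execution.
matches = mod.get_close_matches("appel", ["ape", "apple", "peach", "puppy"])
assert matches == ["apple", "ape"]
```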
What does “calling in a thread-safe manner” mean? Do you just mean “it’s guaranteed not to be called from multiple threads”, or “concurrently from multiple threads”, or “concurrently from multiple threads for the same module”? It’s not, for any of these. There’s no locking. It’s just a function that’s called as part of the import mechanism, just like many other functions.
Is it though? The import system doesn’t get invoked until reification, but the filter function has to get called at the point of the lazy import statement (i.e. IMPORT_NAME, IMPORT_FROM), to determine if the import is lazy or eager, right?
… but you’re okay with the answer?!
(FWIW, it’s a meme Pablo likes.)
Ah, that’s …! But yes!
The filter function is not protected by any locking, but it won’t crash if it is called concurrently or anything like that. If two threads execute potentially-lazy imports of the same module concurrently, the filter will be called concurrently from both threads. After the filter returns, either lazy objects are created (no synchronization needed) or we enter the standard import machinery (which has its own locking). If stronger guarantees are needed because the specific implementation handles some shared resource, then it must handle its own thread-safety. For read-only operations (like checking against a frozenset), no special care is needed.
That’s totally reasonable semantics. I think the PEP should add some language to clarify that.
I don’t mind, but I think @thomas has a point that this is kind of implied, and that if we start having to write it everywhere (not only in this PEP but other places) it may be a slippery slope. I will let you two discuss offline.
Some time ago, @ctismer worked on implementing the ideas from PEP 690 for the PySide module (details here), which ended up showing a very nice startup-time improvement of about 10-20% for PySide applications. In frameworks where tons of things are initialized at import time, I believe this would be a great addition to the language.
Having an explicit keyword to make this an opt-in is IMHO the best approach for this new functionality, because I do know a few things can get a bit messy if this were enforced everywhere.
IDEs and debuggers should be prepared to display lazy proxies before first use and the real objects thereafter.
The PEP doesn’t document the proxy objects at all. So how are tools like debuggers to know they are working with a proxy object? Will the type be exposed in types? Will there be any other exposed data that tools like debuggers can show to the user, or will it just be in the repr? I personally only care about the former and the latter is more of a question than request.
Will the type be exposed in `types`?
Yes! Either in types or importlib. Any preference?
Shame about the decision to exclude lazy imports from try/except blocks.
How would you have a lazy import that was triggered away from the import statement go back and have the exception raised in the try block? It’s very “spooky at a distance” semantics that is already tricky enough with the lazy imports (at least the PEP has the exception trigger where the code causes the import). Toss in exceptions suddenly coming from some other line of code and not where the code is executing and it seems like asking for trouble. Plus you would have to save the whole stack at the time of lazy import for that to work and it just becomes a mess and way too complicated to tackle.
As well, for your import failure case, you would want it to find the module first and then if it exists go on to be lazy. That’s how LazyLoader does it already, so you can get that today if you want those semantics. Otherwise you could use importlib.util.find_spec() to see if the module can be found and if it isn’t then do something else.
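The `find_spec()` pattern mentioned here might look like this (an illustrative helper, not part of the PEP):

```python
import importlib
import importlib.util

def optional_import(name):
    """Import a module if it can be found, else return None (eager check)."""
    if importlib.util.find_spec(name) is None:
        return None
    return importlib.import_module(name)

json_mod = optional_import("json")             # stdlib: found and imported
missing = optional_import("no_such_module_x")  # not found: None, no exception
```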
I realize this all isn’t as convenient as magically having lazy import work in a try statement, but I’m afraid this is an instance where we can’t have everything be as simple as with eager imports.
The vast majority of imports in Python are used to import functions or types, and not to trigger side-effects.
Have you done an analysis to back that statement up? For instance, I see most imports import modules, so it really depends on what codebases you’re exposed to. As for side-effects, technically every import is a side-effect since it’s an exec() call on the source (for the default, pure Python case).
But ignoring all of that, making eager the opt-in is a bit hostile towards beginners, for whom reasoning about lazy imports is harder than reasoning about eager imports. In that spirit, I don’t think asking us to type 5 more characters when the perf matters is a big deal.
Yes! Either in `types` or `importlib`. Any preference?
Nope, just somewhere. I don’t think any import-specific types are in importlib that are this far-reaching, so probably types makes the most sense?