First of all, given the tone of your response, I apologize if my post came across as adversarial or as noise in the discussion, particularly for anyone who is heavily involved in the discussion. My interest here is to help make sure we do what’s best for the language and the community, by adding my insight on a topic on which I’m an expert. My focus is mostly on communicating my point of view to the PEP authors.
Sorry I wasn’t more clear. My expectation is that eventually all imports will be lazy, whether by default or explicitly set globally, with the exception of a handful of modules. At that point the new keyword will be irrelevant and thus just noise. (I guess my point about __lazy_modules__ would be invalid.)
FWIW, the “lazy” keyword feels like we’re adding syntax solely for the benefit of the toolchain (Python compiler) without much benefit to humans. Many programming languages do that, but the Python language has been mostly motivated by the benefit to humans and has little/no syntax solely for the benefit of the toolchain. All that said, I’m sure I’ve missed some benefit of the new keyword for human readers, in which case I would withdraw my position.
My concern about “is this import actually lazy?” is valid, regardless of anything else, so let’s not lose that point in the midst of any contrasting point.
We have to be diligent about not adding unnecessary/excessive complexity for users (and internally to an extent) to get some desirable feature, honoring the spirit of minimal new complexity as part of a healthy balance that doesn’t cross the hard-to-define threshold of “too much complexity”. Hence I pointed out new complexity that I did not see explicitly discussed (and directly justified) in the PEP.
(I do not have a position at the moment on whether or not this particular extra complexity is justified, but I do suggest we mitigate or eliminate it if possible.)
Like I said originally:
I get the downsides of lazy module attrs and expect the whole concept is potentially independent of (and more broadly applicable than) lazy imports. I do think lazy evaluation is desirable, not just for module attrs, but it doesn’t have to be part of lazy imports.
The point of lazy module attrs would be to avoid executing the part of the module you don’t need. This matters for modules that have lots of classes, functions, etc. Thus I expect the alternative to lazy module attrs will be to split large modules up. We’ll end up with some extra import statements and bound names, but the imports would likely be lazy so it wouldn’t matter. I hope users do that regardless.
Ultimately, when someone imports something, they expect it to be bound to a name and they don’t care about the actual module until they go to access one of its attrs. (Modules with side effects are exceptional; I’ll add a note about that in a moment.) Lazy imports, including as defined in the PEP, don’t violate this expectation. The same would be true for lazy module attrs, though that point is irrelevant to this PEP.
Notably, the (uncommon) modules that have import-time side effects violate the typical expectations regarding imports. As I mentioned originally, I think we should explore the needs of users that introduce such import side effects in conjunction with efforts toward lazy imports.
You could be right, but I’d say the hypothetical complexity shouldn’t get in the way of figuring out if there is a way forward in the face of the potential actual complexity.
I find this surprising. Is this something that has been benchmarked well? Also, CPU time isn’t the only benefit of lazy imports–memory savings are potentially a significant benefit.
I agree that that tradeoff is meaningful and the benefit is desirable. However, I’m suggesting that we be sure we are confident about the tradeoff and that we’ve made every effort to mitigate the negatives. It isn’t clear that’s the case yet.
My apologies. I’m mostly only in a position right now to consider what is in the PEP. I hope all that insight and analysis will be carefully captured in PEP 810 as soon as possible. That information is super valuable and essential for decision-making.
Ah, I’m sorry then. I didn’t notice more than a mention of the topic, which I called out originally, and didn’t see the deeper analysis. I’ll go review the PEP again. The key to me is that we have the confidence that we understand the problem space in order to make good decisions. (It wasn’t clear from the PEP when I read it that we do.)
A key part of the PEP process is for the proposal to capture the discussion. The process is not meant to exclude people who aren’t able to follow all the discussion, particularly for discussions that span hundreds of comments. Please be respectful of different people being in different situations. The decision-making isn’t just for the people that have the time to fully participate in discussions. I’m doing the best I can to provide what I thought was helpful feedback. I certainly don’t want to waste anyone’s time.
IMHO, it’s better to have someone express a concern even if it’s already been covered than for them to keep to themselves for fear of someone complaining about noise and then miss something important.
FYI, I’m a long-time core developer and help maintain the import system specifically. I’m intimately familiar with how it all works. I haven’t been involved in many PEP discussions lately as I haven’t had time and didn’t want to add noise. However, I felt like it was important that I chime in on this one given my expertise.
Another exceptional case I’ll point out relative to lazy imports is with importers. Are the expectations of importer authors/users honored by the proposed lazy imports? I didn’t notice anything in the PEP about that.
Agreed. That’s why we should be clear about the different ways and reasons why people execute code in their modules. It isn’t as simple as “don’t do that” or even disallowing it.
That’s definitely something that the PEP must be clear about. I think it does well enough, between the dedicated keyword indicating “something’s different” and the exception chaining providing clear tracebacks. I think the “spooky action at a distance”, relative to the actual import statement, is okay.
There’s the question of falling back to a different module (which gets a bit clunkier) that the PEP could be more clear about. I’m guessing it will boil down, for now, to suggesting moving the fallback try-except to a separate module which can itself be lazy imported. It would be nice if there were something in the PEP to actually reduce friction there.
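For what it’s worth, here is a minimal sketch of that “separate module” workaround, runnable on current Python. The module name `_json_compat` and the preferred implementation `fastjson_hypothetical` are made up for illustration. The assumption is that, under PEP 810, consumers would lazily import the compat module, so the try/except below would only run when it is first used.

```python
# Contents of a hypothetical _json_compat.py. The eager try/except lives here,
# so the fallback decision is made whenever this module finally executes.
try:
    import fastjson_hypothetical as json_impl  # made-up name, not a real package
except ImportError:
    import json as json_impl  # standard-library fallback


def dumps(obj):
    """Serialize with whichever implementation was importable."""
    return json_impl.dumps(obj)
```

It works, but it is an extra file per fallback, which is the friction being described.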
This just doesn’t match reality. It’s not “a handful” of modules with side effects. It’s everywhere. Real Python code is full of this stuff: logging setup, plugin registration, global state initialization, database connections, framework configuration. When the companies that are participating in the discussion tried to roll out something like PEP 690’s global approach, they hit a wall. Even with total control over their codebases (Meta has a monorepo!), the effort was massive. They needed complex filtering systems just to keep things working.
PEP 690 got rejected for good reason.
I don’t know where you are getting this from. The comment you are answering has very good reasons that you have dismissed as “it just helps the compiler”. Explicit beats implicit. When you see lazy import foo, you know immediately that something’s different. You know side effects are deferred. You know ImportErrors might show up later. Without that keyword, you’re left guessing based on some config file buried somewhere else. It’s also about responsibility. The person writing lazy import owns that decision and its consequences. In a world where everything might be lazy based on a global flag, nobody knows what’s going on without checking external state.
I think it is fair to say that this has been benchmarked extensively and there is more than enough evidence. It’s also not difficult to realise why this could be the case. For example, the issue shows up on NFS-backed filesystems and distributed storage, where each stat() call has network latency. In production environments, you can see 50-200ms per stat() call depending on network conditions. When you have dozens of imports and each one does multiple filesystem checks traversing sys.path, you burn through seconds just finding modules before executing any Python code. In some measurements, spec finding accounts for 60-70% of total import time.
Memory savings are absolutely significant too. But the I/O cost is often the single biggest bottleneck. The folks at Bloomberg, Google, Meta, and HRT probably have similar stories. There were some links shared in the PEP and the discussion about that.
From what I can see, the community did explore alternatives, including the scientific Python approach and the LazyLoader approach (eager spec lookup, lazy execution). That’s in the rejected ideas section, though maybe it needs more detail on why they chose full laziness. The fact that both the author of LazyLoader and the authors of the scientific Python solutions are backing the PEP should also prove the merits of the proposal.
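For readers who haven’t used it, the LazyLoader approach (eager spec lookup, lazy execution) looks roughly like the sketch below, which follows the recipe in the importlib documentation. The function name `lazy_import` is just a label for this example, and it is only meant for modules that have not been imported yet.

```python
import importlib.util
import sys


def lazy_import(name):
    """Find the module eagerly, but defer executing it until first attribute access."""
    spec = importlib.util.find_spec(name)  # eager: the finders run right now
    if spec is None:
        raise ModuleNotFoundError(f"No module named {name!r}", name=name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # sets up laziness; the module body has not run yet
    return module
```

Note that a missing module still fails immediately here, because the spec lookup is not deferred. That is the existence-checking benefit and the filesystem cost in one place.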
The negatives of not doing cheap existence checking are real. But the mitigation seems straightforward: importlib.util.find_spec() works for that use case, and it’s explicit about what you’re doing. If you need to check whether a module exists without importing it, lazy imports probably aren’t the right tool anyway. The semantics get confusing (what does lazy in try/except even mean?).
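A small sketch of that explicit existence check, using only importlib.util.find_spec(). The helper name `module_available` is hypothetical. One caveat worth knowing: for a dotted name, find_spec() imports the parent package in order to search it.

```python
import importlib.util


def module_available(name):
    """Return True if `name` can be found, without executing the module itself."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # A dotted name whose parent package is missing raises ModuleNotFoundError;
        # a module in sys.modules whose __spec__ is None raises ValueError.
        return False
```

Usage would be something like `if module_available("numpy"): ...`, with the real import done wherever the module is actually needed.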
The alternative (eager spec, lazy execution) gives you existence checking but loses most of the performance win on network filesystems and other similar situations. From the discussion and the production deployments mentioned, it looks like they’re trading a use case that has a clear workaround for performance gains that would otherwise be impossible. The users that have tried similar solutions clearly care about startup time, and they chose full laziness after trying both approaches.
The PEP doesn’t talk about the import protocol internals much because it builds on the outermost part of that protocol (the __import__ hook).
So from the import system’s PoV, it’s as if there was an inline import right before every access to the lazily imported names.
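To make that mental model concrete, here is a sketch on current Python that uses a plain inline import as a stand-in for the first access to a lazily imported name. The throwaway module `inline_demo_mod` is created just for the demonstration and is not part of anything real.

```python
import importlib
import os
import sys
import tempfile

# Write a throwaway module with an import-time side effect (hypothetical name).
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "inline_demo_mod.py"), "w") as f:
    f.write("print('inline_demo_mod executed')\nVALUE = 42\n")
sys.path.insert(0, module_dir)
importlib.invalidate_caches()


def use_value():
    # Stand-in for the first access to a lazily imported name: the import
    # machinery runs here, not at the top of the file.
    import inline_demo_mod
    return inline_demo_mod.VALUE


loaded_before = "inline_demo_mod" in sys.modules  # nothing imported yet
value = use_value()                               # module body runs now
loaded_after = "inline_demo_mod" in sys.modules
```

The side effect (the print) only happens when `use_value()` is called, which is the behaviour being described.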
The key reason I see for leaving them out is because they’re pointless. Unlike module level imports, which are often for things the module might use (making laziness potentially useful), inline imports are only added for names the function is actually going to use, so laziness adds overhead for no benefit.
So I would say, and I believe the PEP authors have said things like this elsewhere in the thread, that we should keep this PEP focused on the minimum viable set of things that are required for this to work. I think an ergonomic pattern for this does need to exist (and “wrap it in another import” is not that), but I also think that there are ways to do this that do not complicate the existing PEP. Example:
I believe we can add a lazy_fallback in an issue. Similarly if we find that a lot of people are chafing from a lack of the ability to handle exceptions centrally for these lazy things, it would probably not be terribly difficult to add (in a later PEP or an issue) the ability to add a “callback” that wraps the import, e.g.:
This seems too complex to design in this particular PEP, and also unnecessary to do until we find that no one has good ergonomic solutions to this, but also it could be added in a fully backwards-compatible way pretty easily.
Yeah I actually think the lazy keyword is pretty important here.
Like when you’re reading code, seeing lazy import foo makes it very clear something different is happening. Without it you gotta go hunt for some __lazy_modules__ list at the top or check what flags are set. That’s annoying and error-prone. Imagine you’ve got 100 imports at the top of your file and somewhere above there’s __lazy_modules__ = ["foo", "bar", "baz", ...]. Now you’re scrolling through trying to figure out which ones are lazy and which aren’t. That’s terrible for future maintainability. In 6 months when someone else reads that code they gotta constantly reference that list. With the keyword it’s immediately obvious which imports have the special behavior.
The __lazy_modules__ thing is fine as a compatibility mechanism to help people migrate from older Python versions or bridge between their custom solutions. That’s useful. But that’s it: it shouldn’t be the main way people use this feature long term.
Yes, a “fallback” clause to the import statements had been brought up several times earlier in this thread and I think it’ll be awesome if there’s a declarative way to address the most common use case of importing a drop-in replacement when the preferred one fails.
Something like:
lazy import newfoo or foo as foo
lazy from newfoo or foo import bar
lazy from newfoo import newbar or from foo import bar as bar
which imports a drop-in replacement at reification time.
Since this “fallback” clause can benefit eager imports too, it probably belongs to a separate PEP.
I’m not sure how a utility function can work since by calling importlib.lazy_fallback(my_module, backup_module) you’ve reified both my_module and backup_module already.
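One way a helper could sidestep that problem is to take module names as strings instead of module objects, so nothing is touched until the helper is called. This is only a sketch: `import_first_available` is a made-up name, not the `importlib.lazy_fallback` floated above, and the call itself is eager, so it would have to sit at the point of first use.

```python
import importlib


def import_first_available(*names):
    """Import and return the first module in `names` that can be imported."""
    last_error = None
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            last_error = exc  # remember why this candidate failed, try the next
    raise ImportError(f"none of {names!r} could be imported") from last_error
```

For example, `foo = import_first_available("newfoo", "foo")` tries `newfoo` first and falls back to `foo`.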
The core feature is good and solves the problems people are having. Adding fallback mechanisms and exception handling callbacks and all that stuff just complicates things before we even know if people actually need it in practice. Ship the minimal version first, see how people use it, then add the ergonomic helpers if they turn out to be necessary. We have done this all the time in the past.
The lazy_fallback idea or exception callbacks could definitely be added later in a backwards-compatible way. No need to design all that now and risk getting it wrong or making the PEP harder to accept. Get the foundation right first.
I agree that solutions to that friction point can be built on top of this proposal and can be addressed separately afterward. That seems reasonable. I’m still wrapping my brain around other corner cases, before I draw deeper conclusions.
My mistake, I thought the proxy object didn’t reify until you actually tried to use a property of it. I suppose you are stuck with the helper function looking like this:
This doesn’t match my expectations. I think there are cases in which you really want eager imports for their performance characteristics and error semantics.
Suppose you are running a server of some kind. It takes time to start, but after that it serves thousands of requests. Ideally, it would not contain any lazy imports, since that represents a cost which must be paid during a request, rather than on startup, and could result in a failure. If I were particularly invested in ensuring that the first request served does not incur an extra cost, I could walk all lazy modules and force reification.
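As a rough sketch of the server case on current Python: if the startup code knows which modules the request handlers need, it can import them all up front and fail fast. The helper name `warm_up` is hypothetical, and how this would interact with PEP 810’s reification is an assumption on my part (presumably a later first access would find the module already in sys.modules).

```python
import importlib


def warm_up(module_names):
    """Import every named module at startup so no request pays the import cost."""
    loaded = {}
    for name in module_names:
        # An ImportError here aborts startup instead of failing a request later.
        loaded[name] = importlib.import_module(name)
    return loaded
```

A server would call something like `warm_up(["json", "argparse"])` before it starts accepting requests.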
I also maintain CLIs where lazy imports are an easy win. So I want it both ways.
Do you feel that the PEP has to justify its position that the feature should be opt in with examples of cases where that’s better? I think the referenced rejection of PEP 690 is more than sufficient for this purpose. Does the PEP need to explain it in more detail?
(I do not think it needs to do so. Referring to a rejected proposal is quite strong as rationale for taking a different approach.)
I agree that an enormous number of use cases will be helped by or at worst not hurt by all imports becoming lazy. But I still want granular control as a user. From my perspective, you’ve incorrectly extrapolated from the majority case into the 100% case.
We (the authors of PEP 810) wanted to take a moment to say thank you all for the level of engagement in this discussion. Seeing so many of you care this deeply about making this feature better or to raise concerns about it is very important for us, and we are deeply humbled by the participation.
We need to ask for your understanding on something though. The volume of feedback has been absolutely massive. Between this thread, private emails, and other channels, we’re getting hundreds of comments and suggestions. I’m personally spending over 3 hours every single day just trying to keep up, make sure we’re capturing the important stuff, keep the PEP up to date, and build the web demo. We’re constantly updating the PEP based on what we’re hearing, but please know that if we don’t respond to your specific comment, it doesn’t mean we didn’t read it or that it wasn’t valuable. We’re reading everything. We just physically can’t reply to every single point.
So please understand if we try to focus the discussion towards convergence. If we try to solve every possible use case and address every extension idea right now, this discussion will never converge. The PEP will become unwieldy, harder to review, and honestly, harder to implement well. We already have an embarrassingly long list of “rejected ideas”, “deferred ideas” and “alternative ideas”. So we’re taking the approach of nailing the core feature set first and then allowing people to build on top of it later.
With that in mind, we’re closing the discussion on a few categories of suggestions:
New syntax in any form. We consider that the syntax is settled. We’ve thought about it extensively, we’ve considered the alternatives, and we’re confident in what we have.
New semantics on top of the core proposal. Things like lazy attributes, alternative semantics for lazy imports inside try/except blocks, special fallback clauses, and similar extensions. These are interesting ideas, but they belong in future work, not this PEP.
Lazy imports in try/except blocks. We’ve decided to forbid them. The semantics would be too complex and surprising (exception handlers executing at a distance from where they’re defined), and there are reasonable workarounds for the common use cases. Additionally, we don’t want to deviate from the global activation where we need to make it eager for compatibility.
Anything else that can be built on top later. If it’s not essential to making the basic lazy import mechanism work, we’re leaving it for follow-up work.
We know some of you have suggestions you’re really excited about. We get it, and we genuinely appreciate the creativity and effort behind them. But if we try to be everything to everyone right now, we’ll end up with nothing. Let’s ship something solid that solves the real problems people are facing, and then we can iterate and extend based on actual usage patterns.
Please understand we can’t put every single permutation of ideas into the “rejected” section. If we tried to document why we rejected every variation and combination that’s been proposed, the PEP would be unreadable. We’re focusing on the major alternatives we seriously considered.
That said, we do need your input on some genuinely open questions where we haven’t made a final decision yet:
Should there be a way to access module __dict__ without reification from outside the module? Something like __raw_dict__ has been suggested for introspection use cases. Any other options?
Should we include something like a free function to force reification on objects (or the entire module) instead?
Thanks again for being part of this. Your energy and input make a difference!
The enabled parameter can be None (respect lazy keyword only), True (force all imports to be potentially lazy), or False (force all imports to be eager).
The global lazy imports flag can be controlled through:
The -X lazy_imports=<mode> command-line option
The PYTHON_LAZY_IMPORTS=<mode> environment variable
The importlib.set_lazy_imports(mode) function (primarily for testing)
Where <mode> can be:
“normal” (or unset): Only explicitly marked lazy imports are lazy
“all”: All module-level imports (except in try or with blocks and import *) become potentially lazy
“none”: No imports are lazy, even those explicitly marked with lazy keyword
I’m guessing the former is correct; so it should be “normal”/“all”/“none” for command-line and environment variable, but None/True/False in the Python function?
If so, it could in some situations be confusing to have -X lazy_imports=none mean something completely different from set_lazy_imports(None). Perhaps “always”/“never” would be better than “all”/“none”, to avoid that clash?
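To spell out the clash, here is the mapping as I’m guessing it, written as a plain dict. This is only my reading of the two quoted passages, not something taken from the implementation, and the name `MODE_TO_ENABLED` is made up.

```python
# Hypothetical mapping, following the guess above: string modes for the
# command line and environment variable, None/True/False for the function.
MODE_TO_ENABLED = {
    "normal": None,   # only imports marked `lazy` are lazy
    "all": True,      # every eligible import may become lazy
    "none": False,    # nothing is lazy, even imports marked `lazy`
}

# The clash: the string "none" and the object None select different behaviours.
string_none = MODE_TO_ENABLED["none"]  # False, i.e. force everything eager
object_none_mode = next(
    mode for mode, enabled in MODE_TO_ENABLED.items() if enabled is None
)  # "normal", i.e. respect the keyword
```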
When accessing my_module.__dict__, is the order in which lazy modules are reified guaranteed to be the same as the order of lazy imports in the code?
Making CLI tools invoked with --help run quickly is explicitly noted as an example in the Motivation section of the PEP. Having reification order preserved when accessing __dict__ would make it easier to do that even in the presence of imports with side effects, like in the following example:
import argparse
import sys
lazy import my_first_module # import this first
lazy import my_second_module # import this second
lazy import my_other_module # import this after all side effects have happened
def main(args):
    sys.modules[__name__].__dict__ # <- Is order preserved here?
    # do some work here ...
if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    # ... add arguments that don’t require my_*_module
    args = parser.parse_args()
    main(args)
From initial testing in the interactive demo, order seems to be preserved; this also matches my mental model based on a naïve implementation.[1] So if there are cases where the order is not conserved—or if the order should be considered an implementation detail for now, so users don’t rely on it—I think it would be good to explicitly state that in the Specification.
Regarding placement of lazy:
One of my favourite things about Python is that many language constructs read like plain English; they just feel so natural.[2] The construct lazy from ... import ... really breaks that expectation. Even after sleeping on it for a few nights and seeing it in use many times (in the PEP and this whole discussion thread), I’m not getting used to it.
I completely understand that you don’t want to propose from ... lazy import ... as the baseline option—this discussion thread is already active enough, even without adding those backwards compatibility concerns! But I’m not clear what the best way to proceed is for now: Should we wait for a preliminary opinion of the Steering Council? Open a separate thread to discuss a transition plan? Start drafting a separate PEP describing the necessary deprecations before opening a discussion on that?[3]
And last but certainly not least: Many thanks to everyone involved for your work on designing this feature, implementing it, writing up a very detailed and well-explained PEP and now collating tons of feedback and being so responsive! It’s an exciting feature that will make my life easier and help me develop more responsive software packages for my own users, so I’m really looking forward to being able to use this!
Any imports (lazy or eager) are added to __dict__ in the order they appear in the code. Since dicts conserve order, iterating over the __dict__ would then return lazy modules in the same order for reification. ↩︎
Incidentally, that’s why I prefer lazy over defer as a keyword. Talking about a “lazy import” feels natural, while a “deferred import” feels a bit more like technical jargon. ↩︎
And perhaps that section in the PEP 810 text could be moved from “Rejected Ideas” to “Alternate Implementation Ideas” in the meantime? ↩︎
Thanks for your comment, and I really do understand where you’re coming from. However, we’ve made our decision on the syntax after extensive consideration, and we need to ask you to respect that. If you strongly disagree, you’re absolutely free to pursue alternatives, but I’m going to respectfully ask that you don’t open parallel threads. This is already difficult enough to coordinate, and fragmenting the discussion would make it much harder for everyone, including the Steering Council. If you feel the syntax issue is critical, I’d suggest submitting an issue in the SC bug tracker so they can make a preliminary ruling.
PEP 810 as written is a simple, well-thought-out proposal. I want to tell the interpreter a name is a module without having it freak out when it parses the source.
import sys
print('json' in sys.modules)
_ = json.dumps("hello")
causes
False
Traceback (most recent call last):
  File "/git/lazy/examplen.py", line 3, in <module>
    _ = json.dumps("hello")
        ^^^^
NameError: name 'json' is not defined. Did you forget to import 'json'?
With lazy import, it doesn’t.
The bulk of the suggestions / comments / requests strike me as impractical … I want to know the names in the module without the import overhead!
From PEP 484:
It should also be emphasized that Python will remain a dynamically typed language, and the authors have no desire to ever make type hints mandatory, even by convention.
Sometimes ya don’t know if an object.attribute will resolve until runtime. That’s a feature, not a bug. Sometimes you don’t know the type beyond “Any.” There is nothing wrong with that.
Answering the questions:
Shrug. I don’t see any need for it. I almost always put my imports at the top of the source file.
No. You have to import the module to know what’s in the module.
Yes, the order follows naturally from dict iteration semantics. When you access my_module.__dict__, the lazy modules are reified in the order they appear in the dict as it’s being iterated over. This isn’t special semantics we’re imposing on top of the feature: it’s just following the standard order in which the dict is created and accessed. Python dicts are ordered (since 3.7), so the lazy imports will reify in the order they were defined in the source code when you iterate over __dict__.
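The insertion-order half of that answer can be checked on current Python with ordinary imports. The sketch below does not exercise lazy imports at all; it only shows that a module namespace records imported names in source order. The module name `order_demo` is made up.

```python
import types

source = "import json\nimport argparse\nimport sys\n"
module = types.ModuleType("order_demo")  # hypothetical module name
exec(source, module.__dict__)

# Skip dunder entries (__name__, __builtins__, ...) and keep the imported names.
imported_names = [name for name in module.__dict__ if not name.startswith("__")]
print(imported_names)  # ['json', 'argparse', 'sys']
```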
My apologies. Those details weren’t clear to me when I read the PEP. It would be helpful to have a clear enumeration of the specific use cases for import side effects, along with consideration of possible reasonable alternatives to each that don’t incur those side effects at import time. That information would be supremely valuable in deciding what to do about lazy imports. I genuinely understand if the PEP authors feel like a deep analysis on this point is not a good use of their time. I’ve been there. I’m only concerned that we might not have as much information as we should in order to make a good decision about this proposal.
My earlier point was based on the idea that, from my perspective, import side effects should be something we move away from. I should have been more clear. In my experience, import side effects lead to higher code maintenance costs, especially making some functionality less discoverable. Are there use cases where the nicest approach, maintainability-wise, actually involves import side effects? (I’m asking honestly.) I’ll note that the PEP does effectively say, “if you have import side effects (e.g. registration) and you’re worried about someone lazy importing your module then you need to stop having import side effects.” I agree with that. However, I’d like to know we have good alternatives to offer them, in addition to the excellent few recommendations the PEP offers. I’d be surprised if all the use cases for import side effects are covered by those 3 suggestions.
Regardless, none of this necessarily needs to block this PEP. However, modules with import side effects are very fundamental to the topic of lazy imports, so extra clarity about them in the PEP (and deep consideration of how to avoid them) seems warranted. Let’s be careful to minimize the extra burden we would effectively add to maintainers of such projects.
Relatedly, if a module has import side effects and another module does a lazy import of it then aren’t we back in the same problematic situation? Is there some mechanism to declare modules that currently have import side effects? Otherwise you’d always have to know, when you’re importing a module, if it has import side effects or else things might break in hard-to-debug ways. That would be a pain.
I guess I’ve misunderstood about this then. My understanding was that the “lazy” keyword actually means “maybe lazy and maybe not”. There are global mechanisms for opting in or out (as well as the per-module __lazy_modules__). That would mean you can only guess if a “lazy” import is actually lazy (or that an eager import actually isn’t) until you hunt down anywhere that might have engaged the global mechanism. Have I misunderstood?
Is it just the stat calls that significantly slow down finding a module? If so, the impact (and thus benchmarks) correlates to the size of your sys.path and sys.path_importer_cache, which can depend on a number of factors. Presumably any benchmarks reflect the wide variety of Python users.
(I’m not saying that the stat calls are a problem for only some small set of users. Rather I’m saying it isn’t clear to me that we’re doing more than extrapolating based on educated guesses relative to a potentially non-representative sample of the actual Python community. That sort of best effort is sometimes all we can do, but I’m suggesting that we should make sure we aren’t being complacent about the rigor that PEPs–and the community–deserve.)
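For anyone who wants to see how this looks on their own setup, here is a rough measurement sketch. It times only the path-based finder step (no module execution) via importlib.machinery.PathFinder and reports the size of sys.path and sys.path_importer_cache. The helper name `time_spec_lookup` is made up, and numbers from a local disk say little about network filesystems.

```python
import importlib
import sys
import time
from importlib.machinery import PathFinder


def time_spec_lookup(name, repeat=5):
    """Return (spec, best_seconds) for locating `name` on sys.path without importing it."""
    best = float("inf")
    spec = None
    for _ in range(repeat):
        importlib.invalidate_caches()  # make the finders re-check their directories
        start = time.perf_counter()
        spec = PathFinder.find_spec(name)
        best = min(best, time.perf_counter() - start)
    return spec, best


spec, seconds = time_spec_lookup("json")
print(f"sys.path entries: {len(sys.path)}")
print(f"cached path finders: {len(sys.path_importer_cache)}")
print(f"best lookup time for 'json': {seconds * 1e6:.1f} us")
```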
Regardless, isn’t there something we could do to mitigate the impact of stat calls when finding modules? For example, could we add the explicit option of checking less frequently to see if sys.path_importer_cache entries have been modified on disk (or the option to not check at all)?
If something like that meant we could do the finder portion of import first then that would resolve a number of rough corners for this PEP.
I agree that people have tried different solutions. Certainly the effort may have covered the possibilities sufficiently, but maybe not. However, the PEP doesn’t talk much about why imports are slow, and only gives a vague sentence in the rejected ideas section asserting that finding a module is slow due to filesystem access. It doesn’t discuss why nor what could be done to mitigate that cost. So it isn’t clear to me, in the proposal, that we deserve much confidence about having taken seriously the idea of only making loading lazy.