Is what you’re proposing more or less complex?
Yeah, in that case it's more like having `importlib.set_lazy_imports()` at the top of a `main()` function :)…
Related thought:
Maybe modules should be able to opt-in/out of lazy imports. Once they do that, then maybe the core Python import system can always do lazy imports of those modules after opt-in. By default all modules opt-out to maintain compatibility.
Almost. `-L` catches the case of imports happening before `main()` gets called (e.g. `.pth` files).
I think they can with the given APIs, but I’m also not sure they should. The PEP authors have made a convincing case (to me anyway) that library authors are in a poor position to be reasoning about their library or dependent library laziness.
Is what you’re proposing more or less complex?
It might! Imagine, for the sake of argument, that it is: that relaxing the “transparency” requirement could allow a simpler implementation.
IMO it does open up some design space that wasn’t explored much.
@encukou, thank you for your comment, it's indeed a valuable thing to add to the PEP.
I agree there is some amount of complexity added to dictionaries. However, I've been through two implementations of it (on top of 3.8 and on 3.12, and between the two I saw a lot of changes in the way dictionaries worked!), and my experience was that it gets easier every time. We only need to control how lazy objects can enter and leave a dictionary, and that is done, more or less, in a simple way that doesn't add too much burden, in my opinion. Only the input and output of key functions need to be controlled.
Yes, we did consider, and even tried implementing the type of approach you are suggesting. We found that it makes the incompatibility barrier too high and too many lazy objects escape where they aren’t supposed to (especially due to the presence of C extensions).
The “transparency” in the implementation is crucial. Even the slightest changes in behaviour can cause bigger-than-expected incompatibilities; after a lot of iterations, this is what we believe to be the right amount of transparency to allow things to work and to be easy to reason about.
Decision on PEP 690 - Lazy Imports
The Python Steering Council has decided to reject PEP 690 on Lazy Imports.
We agree with the widely accepted sentiment that faster Python startup time is desirable. Large command-line tools in particular suffer, as startup time directly affects the human user experience. Lazy imports, as proposed, are one of many potential mechanisms that can help with that.
But a problem we deem significant when adding lazy imports as a language feature is that it creates a split in the community over how imports work. A need arises to test code both ways, in both traditional and lazy import setups. It creates a divergence between projects that expect and rely upon import-time code execution and those that forbid it. It also introduces the possibility of unexpected import-related exceptions occurring in code at the time of first use virtually anywhere. Such exceptions could bubble up from transitive dependency first use in unanticipated places.
A world in which Python only supported imports behaving in a lazy manner would likely be great. But we cannot rewrite history and make that happen, and we do not envision the Python language transitioning to a world where lazy imports are the default, let alone the only, import behavior. Thus introducing this concept would add complexity to our ecosystem.
We also discussed specific implementation details of this PEP that many of us did not really like, but ultimately decided that even with mere internal implementation changes, it didn't change our broader feeling on the matter: the complexity of fragmentation was bad and we were going to say “no” regardless.
Thank you everybody for the very thorough proposal and the multiple discussions (#15474, #19661) that helped inform this decision.
signed,
Your Python Steering Council, soon to be vintage 2022 edition
…and yet, for posterity, I would request a summary of the implementation details the SC did not like. Thanks.
Off the top of my head implementation things that came up were…
- Needing to change the core `PyDict` implementation itself solely for a module namespace purpose. The `dk_lazy_imports` flag and related new `PyDict_NextWithError` et al. C APIs. Quite creative, but it felt like a potentially short-term hack rather than something we'd be proud of in 5-10 years.
- The Python APIs being in `importlib` instead of `sys` came up as feeling unusual. (probably more of a bikeshed)
I’ll let others fill in more on their own if that doesn’t capture it.
Part of me is sad, because this was by far the best version of lazy import I’ve seen. But reading the SC’s explanation I have to agree with their decision.
I wonder if there is a hook that could be added so that this could be implemented as a 3rd-party extension? That would probably be tricky because it would require changing dict objects, which are very fundamental (they are in fact Python's most important data structure).
Thanks Greg and SC for considering the PEP, and for the clear rejection reasons.
I’m curious, if anyone is willing to follow up, whether the following changes to the proposed feature and implementation would sufficiently address these concerns to be worth consideration as a new PEP (understanding here that I’m not asking for any official SC statement, just perspectives from individual SC members or anyone else):
- lazy imports only by explicit keyword marker, mitigating both the “fragmentation” and “unexpected error sources” issues, since the feature would always be explicitly and locally opted into by the author of the import statement
- implementation confined to a dict subclass that is used only for module dictionaries, and only if the module contains lazy imports
Thanks,
Carl
Not speaking for the SC: to me a keyword marker is a no-no, since it would be a syntax error for older Python versions. If we're going to mark up lazy imports specially, it could be a function, e.g. `foo = lazy_import("foo")`.
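For what it's worth, the stdlib already ships the pieces for such a function: `importlib.util.LazyLoader` defers executing a module's body until first attribute access, along the lines of the lazy-import recipe in the importlib documentation. A minimal version:

```python
import importlib.util
import sys

def lazy_import(name):
    """Return a module whose body runs only on first attribute access."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # with LazyLoader, this defers execution
    return module

json = lazy_import("json")       # module body not executed yet
print(json.dumps({"ok": True}))  # first attribute access triggers the real import
```

This gives only the explicit, per-import laziness the PEP rejected as “shallow”: the lazily imported module's own imports still run eagerly once it loads.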
Thinking out loud, maybe after a “bootstrap” like that, the environment in which foo’s code is executed could be altered so that (a) its globals are a special globals dict, and (b) all imports are implicitly lazy?
I appreciate the SC’s reasoned consideration, but I’m also disappointed in the PEP’s rejection. It was the best option so far for solving a common use case, and one that puts pressure on ecosystems to move away from Python.
My gut feeling is this should be a less intrusive implementation. Though I wonder what, if anything, breaks when namespaces start being PyDict subclasses. (Modern optimizations that might skip dict name lookups?)
If it works with minimal issues, it might even be easier to maintain as an add-on patch to CPython than modifying PyDict itself?
This discussion makes me wonder if my “use module as globals namespace” idea still has merit. The basic idea is that LOAD_GLOBAL does `getattr()` on the module object, rather than `__getitem__` on the module dict object. That gives you a place to put this lazy loading hook. I originally started down that path when I was trying to implement lazy module loading.
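A rough illustration of the hook this would enable, using a `types.ModuleType` subclass whose `__getattr__` (in the spirit of PEP 562's module-level `__getattr__`) imports on first access. The `LazyAttrModule` name is invented here, and today LOAD_GLOBAL does not actually consult the module object this way:

```python
import importlib
import types

class LazyAttrModule(types.ModuleType):
    """Illustrative module whose missing attributes trigger an import."""
    def __getattr__(self, name):
        module = importlib.import_module(name)
        setattr(self, name, module)  # cache so later lookups are plain attribute hits
        return module

ns = LazyAttrModule("app_globals")
# If LOAD_GLOBAL resolved globals via getattr() on a module object,
# this hook would fire on the first use of an unbound global:
print(ns.math.pi)  # → 3.141592653589793
```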
I don’t see this as such a big stumbling block. Yes, the transition wouldn’t be great (mostly for libraries, which have to stay compatible with older python versions pre-3.12 or whenever it lands; though apps could just move to 3.12 directly), but that’s a price worth paying IMO if the alternative is not having lazy imports at all.
Part of me wonders if an alternative here is to encourage imports within the function of use… I mean, it's sort of against the general idea of putting imports at the top, etc.… though: it's lazy… and I've seen it done today in some code bases with a particular module that takes a long time to import.
I’m not really saying this as a good idea, but it was rather thought-provoking to me.
Edit to add: After all, aren't lazy imports basically syntactic sugar around only doing the actual import once needed, instead of on module import?
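The classic deferral pattern this would sugar over looks like this (the function name is a stand-in, and `json` stands in for a genuinely slow-to-import module):

```python
def render_chart(data):
    # Import deferred until the function is actually called, so programs
    # that never chart anything never pay the import cost.
    import json  # stand-in for an expensive dependency
    return json.dumps(data)

print(render_chart({"a": 1}))  # import happens here, on first call
```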
Wasn’t this one of the original reasons for the PEP in the first place? I think the original (first one) thread OP mentioned this to some extent.
IMO, a lazy import feature would be nice to have in Python, but in a way which is both easier to understand and safer to manage.
Some ideas in this direction:
- make lazy imports explicit (e.g. use `lazy import abc` instead of `import abc`) and use a new keyword to make people aware
- only work at the module import level, forget about making imported variables lazy (i.e. `lazy import abc` is fine, `from abc lazy import xyz` is not)
- work with lazy import objects instead of going deep into module dict or object semantics
The above is enough to gain better startup time. Adoption will take a bit longer, since applications will have to start using `lazy import`, but it's worth waiting, since it gives the community more time to adapt and safely start using the new feature.
The implications of such lazy import are sometimes difficult to understand, since imports often do have side-effects (e.g. registering plugins, monkey-patching other code, initializing external libraries/hardware, etc.), so turning to lazy evaluation can result in subtle issues (out-of-order execution, exceptions popping up in places where they are not expected, disabling feature detection via try-import-except, etc.).
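One concrete case of the try-import-except hazard: the feature-detection idiom below only works if the ImportError is raised eagerly (`fast_json` here is a hypothetical accelerated backend, not a real package):

```python
# Feature detection that relies on ImportError being raised eagerly.
try:
    import fast_json as json_impl  # hypothetical accelerated backend
except ImportError:
    import json as json_impl       # stdlib fallback

# With eager imports, the fallback is chosen right here. Under implicit
# lazy imports, the ImportError would only surface at the first *use* of
# json_impl, after the except clause can no longer catch it.
print(json_impl.dumps({"n": 1}))
```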
PEP 690 goes way beyond the above in many areas (already discussed at length, so I won’t rehash them here), so rejecting it is for the better – sorry, Carl. It was still a good discussion, so thank you for all the work you put into this.
(Replying as an individual developer – you don’t need to convince me, in the SC I try to read community consensus rather than assert my own views.)
I would support that.
Overall, I think we should make it easier for libraries to use lazy imports themselves, à la SciPy or Mercurial.
The current proposal is made for “applications” with a tightly controlled set of dependencies. Those are relatively rare in open-source code, and closed-source ones don't have a good way to report bugs that only appear in a specific setup back to the libraries they're using. And the libraries can't test things themselves very well.
The PEP explicitly rejects explicit syntax as fragile “shallow” laziness, but doesn’t really give examples or evidence, just “we tried, it didn’t work”, so it’s difficult to say how well it generalizes outside their use cases. (Apologies for the very rough paraphrasing – it was said much more nicely of course.)
I don’t think lazy imports can be solved without either buy-in from the ecosystem at large (i.e. putting pressure on libraries), or “applications” spending effort to keep `-L` working. I worry that the people who’d turn on `-L` also can patch their CPython and their dependencies – at least for a few releases.
Porting to explicit lazy imports, library by library, would take time and effort, but might eventually give better results ecosystem-wide.
There might be a parallel with async – gevent’s patching worked immediately for an application, but was more fragile; `async` took/takes effort but is ultimately a better solution.
With explicit lazy imports, we could get away with rougher side effects, avoiding too much magic. Dicts could focus on being containers. Code that needs too much introspection or dynamic features simply wouldn’t opt in.
Why not both? Syntax as the preferred future way, and a function for older versions and dynamic imports. (Granted, dynamic imports wouldn’t be too useful outside testing the feature itself.)
A `lazy_import` function could totally replace the module dict of the imported module, as well as the `__import__` function within that module, to do it recursively. This should be doable as pure-Python code as a demo, and then translated to a native extension for speed. The only difference would be that the first import has to be explicit, but since the target is applications anyway, that should be fine.
That said, I like @nas’s “module object as globals” idea too. Presumably a module object that supported `__getitem__` would be substitutable for a globals dict, so it might even be a smooth transition, and would certainly allow for more flexibility here.