PEP 690: Lazy Imports Again

barry · November 14, 2022, 7:25pm

Is what you’re proposing more or less complex?

csm10495 · November 14, 2022, 7:28pm

Yeah, in that case it’s more like having importlib.set_lazy_imports() at the top of a main() function :)…

Related thought:
Maybe modules should be able to opt-in/out of lazy imports. Once they do that, then maybe the core Python import system can always do lazy imports of those modules after opt-in. By default all modules opt-out to maintain compatibility.

barry · November 14, 2022, 7:31pm

Almost. -L catches the case of imports happening before main() gets called (e.g. .pth files).

I think they can with the given APIs, but I’m also not sure they should. The PEP authors have made a convincing case (to me anyway) that library authors are in a poor position to be reasoning about their library or dependent library laziness.

encukou · November 14, 2022, 8:11pm

Is what you’re proposing more or less complex?

It might! Imagine, for the sake of argument, that it is: that relaxing the “transparency” requirement could allow a simpler implementation.
IMO it does open up some design space that wasn’t explored much.

Kronuz · November 15, 2022, 11:01pm

@encukou, thank you for your comment, it’s indeed a valuable thing to add to the PEP.

I agree there is some amount of complexity added to dictionaries. However, I’ve been through two implementations of it (on top of 3.8 and on 3.12, and between the two I saw a lot of changes in the way dictionaries worked!), and my experience was that it gets easier every time. We only need to control how lazy objects can enter and leave a dictionary, and that is done, more or less, in a simple way that doesn’t add too much burden, in my opinion. Only the input and output of key functions need to be controlled.

Petr Viktorin:

To make things clearer, consider semantics like the following. (I don’t see anything similar in Rejected ideas, hopefully it wasn’t floated earlier):

import foo creates a global variable __lazy:foo (specially named, but otherwise normal), and sets it to a lazy object. (Another possibility is using a global dict: __lazy__['foo'].)

LOAD_GLOBAL for potentially lazy objects (which are known at compile time) becomes LOAD_LAZY_GLOBAL, which:

tries loading foo, and if it doesn’t succeed:

loads __lazy:foo from globals (not builtins), resolves it and stores the result as foo

deletes __lazy:foo

replaces itself with LOAD_GLOBAL, if the specializing machinery allows that

module __getattr__ tries resolving lazy objects the same way

is_lazy_import, eager_imports, set_lazy_imports would work as in the PEP

importlib.resolve_lazy_imports(mod_or_dict) or a globals(resolve_lazy_imports=True)
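
The lookup steps quoted above can be sketched in pure Python. This is only an illustration: a plain dict stands in for module globals, the "lazy object" is reduced to a module name, and the helper names lazy_import and load_lazy_global are hypothetical (the real proposal would live in the bytecode interpreter, not in functions like these).

```python
import importlib

LAZY_PREFIX = "__lazy:"  # placeholder naming taken from the quoted proposal


def lazy_import(namespace, name):
    """Emulate `import name`: record a placeholder instead of importing."""
    # Assumption: the "lazy object" is just the module name as a string.
    namespace[LAZY_PREFIX + name] = name


def load_lazy_global(namespace, name):
    """Emulate the proposed LOAD_LAZY_GLOBAL lookup for `name`."""
    try:
        return namespace[name]  # try the ordinary global first
    except KeyError:
        pass
    # Fetch and delete the placeholder; raises KeyError if there is none.
    target = namespace.pop(LAZY_PREFIX + name)
    module = importlib.import_module(target)  # resolve the lazy object
    namespace[name] = module  # store the result under the real name
    return module


ns = {}
lazy_import(ns, "json")
assert "json" not in ns and "__lazy:json" in ns

mod = load_lazy_global(ns, "json")
assert ns["json"] is mod and "__lazy:json" not in ns
```

Anything that reads the dict directly (for example a C extension) would see the __lazy:json placeholder key rather than a module, which is the kind of leak the transparency discussion is about.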

Yes, we did consider, and even tried implementing the type of approach you are suggesting. We found that it makes the incompatibility barrier too high and too many lazy objects escape where they aren’t supposed to (especially due to the presence of C extensions).

The “transparency” in the implementation is crucial. Even the slightest changes in behaviour can cause bigger than expected incompatibilities. After a lot of iterations, this is what we believe to be the right amount of transparency to allow things to work and to be easy to reason about.

gpshead · December 2, 2022, 7:06pm

Decision on PEP 690 - Lazy Imports

The Python Steering Council has decided to reject PEP 690 on Lazy Imports.

We agree with the widely accepted sentiment that faster Python startup time is desirable. Large command line tools in particular suffer as that is a human user experience. Lazy imports, as proposed, are one of many potential mechanisms that can help with that.

But a problem we deem significant when adding lazy imports as a language feature is that it becomes a split in the community over how imports work. A need to test code both ways in both traditional and lazy import setups arises. It creates a divergence between projects who expect and rely upon import time code execution and those who forbid it. It also introduces the possibility of unexpected import related exceptions occurring in code at the time of first use virtually anywhere. Such exceptions could bubble up from transitive dependency first use in unanticipated places.

A world in which Python only supported imports behaving in a lazy manner would likely be great. But we cannot rewrite history and make that happen. As we do not envision the Python language transitioning to a world where lazy imports are the default, let alone the only, import behavior, introducing this concept would add complexity to our ecosystem.

We also discussed specific implementation details of this PEP that many of us did not really like, but ultimately decided that even with mere internal implementation changes it didn’t change our broader feeling on the matter: The complexity of fragmentation was bad and we were going to say “no” regardless.

Thank you everybody for the very thorough proposal and multiple discussions (#15474, #19661) that helped inform this decision.

signed,
Your Python Steering Council, soon to be vintage 2022 edition

barry · December 2, 2022, 7:14pm

…and yet, for posterity, I would request a summary of the implementation details the SC did not like. Thanks.

gpshead · December 2, 2022, 7:58pm

Off the top of my head, the implementation things that came up were…

Needing to change the core PyDict implementation itself solely for a module namespace purpose. The dk_lazy_imports flag and related new PyDict_NextWithError et al. C APIs. Quite creative, but it felt like a potentially short term hack rather than something we’d be proud of in 5-10 years.
The Python APIs being in importlib instead of sys came up as feeling unusual. (probably more of a bikeshed)

I’ll let others fill in more on their own if that doesn’t capture it.

guido · December 2, 2022, 10:46pm

Part of me is sad, because this was by far the best version of lazy import I’ve seen. But reading the SC’s explanation I have to agree with their decision.

I wonder if there would be a hook that could be added so that this could be implemented as a 3rd party extension? That would probably be tricky because it would require changing dict objects, which are very fundamental (they are in fact Python’s most important data structure).

carljm · December 2, 2022, 11:38pm

Thanks Greg and SC for considering the PEP, and for the clear rejection reasons.

I’m curious, if anyone is willing to follow up, whether the following changes to the proposed feature and implementation would sufficiently address these concerns to be worth consideration as a new PEP (understanding here that I’m not asking for any official SC statement, just perspectives from individual SC members or anyone else):

lazy imports only by explicit keyword marker, mitigating both the “fragmentation” and “unexpected error sources” issues since the feature would be always explicitly and locally opt-in by the author of the import statement
implementation confined to a dict subclass that is used only for module dictionaries, and only if the module contains lazy imports

Thanks,

Carl

guido · December 3, 2022, 12:29am

Not speaking for the SC, to me a keyword marker is a no-no, since it would be a syntax error for older Python versions. If we’re going to mark up lazy imports specially, it could be a function, e.g. foo = lazy_import("foo").

Thinking out loud, maybe after a “bootstrap” like that, the environment in which foo’s code is executed could be altered so that (a) its globals are a special globals dict, and (b) all imports are implicitly lazy?
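
For reference, a function spelled like lazy_import("foo") can already be approximated with the standard library's importlib.util.LazyLoader. The sketch below follows the recipe in the importlib documentation, with one added assumption: a module that is already in sys.modules is returned as-is rather than replaced. It does not implement the "special globals dict" idea and does not make nested imports lazy.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module whose code runs on first attribute access."""
    # Assumption: never replace a module that has already been imported.
    if name in sys.modules:
        return sys.modules[name]
    spec = importlib.util.find_spec(name)
    if spec is None:
        # A missing module is still reported immediately, not lazily.
        raise ModuleNotFoundError(f"No module named {name!r}")
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # sets up laziness; module code has not run yet
    return module


colorsys = lazy_import("colorsys")
# The module body executes here, on first attribute access.
print(colorsys.rgb_to_hsv(1, 0, 0))
```

Because this is an ordinary function call, it parses on older Python versions, which is the compatibility point made above.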

barry · December 3, 2022, 12:53am

I appreciate the SC’s reasoned consideration, but I’m also disappointed in the PEP’s rejection. It was the best option so far for solving a common use case, and one that puts pressure on ecosystems to move away from Python.

gpshead · December 4, 2022, 12:00am

My gut feeling is this should be a less intrusive implementation. Though I wonder what, if anything, breaks when namespaces start being PyDict subclasses. (Modern optimizations that might skip dict name lookups?)

If it works with minimal issues, it might even be easier to maintain as an add-on patch to CPython than modifying PyDict itself?

nas · December 4, 2022, 2:12am

This discussion makes me wonder if my “use module as globals namespace” idea still has merit. The basic idea is that LOAD_GLOBAL does getattr() on the module object, rather than __getitem__ on the module dict object. That gives you a place to put this lazy loading hook. I originally started down that path when I was trying to implement lazy module loading.
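
As a rough illustration of where such a hook could live, here is a sketch of a module type that resolves registered names on first attribute access. The class name LazyNamespaceModule and its lazy_names argument are hypothetical. It only covers attribute access from outside the module; making LOAD_GLOBAL go through getattr() would need an interpreter change that this sketch does not attempt.

```python
import importlib
import types


class LazyNamespaceModule(types.ModuleType):
    """Hypothetical module type that imports registered names on demand."""

    def __init__(self, name, lazy_names):
        super().__init__(name)
        # Maps attribute name -> name of the module to import for it.
        self._lazy_names = dict(lazy_names)

    def __getattr__(self, attr):
        # Called only when the normal attribute lookup has failed.
        lazy_names = self.__dict__.get("_lazy_names", {})
        if attr in lazy_names:
            value = importlib.import_module(lazy_names.pop(attr))
            setattr(self, attr, value)  # cache so later lookups skip this hook
            return value
        raise AttributeError(
            f"module {self.__name__!r} has no attribute {attr!r}"
        )


demo = LazyNamespaceModule("demo", {"json": "json"})
assert "json" not in vars(demo)  # nothing resolved yet
demo.json.dumps([1])             # first access triggers the import
assert "json" in vars(demo)      # now stored as a normal attribute
```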

h-vetinari · December 4, 2022, 9:20am

I don’t see this as such a big stumbling block. Yes, the transition wouldn’t be great (mostly for libraries, which have to stay compatible with older python versions pre-3.12 or whenever it lands; though apps could just move to 3.12 directly), but that’s a price worth paying IMO if the alternative is not having lazy imports at all.

csm10495 · December 5, 2022, 12:44am

Part of me wonders if an alternative here is to encourage imports within the function of use… I mean it’s sort of against the general idea of putting imports at the top, etc… though: it’s lazy… and I’ve seen it done today in some code bases with a particular module that takes a long time to import.

I’m not really saying this as a good idea, but it was rather thought-provoking to me.

Edit to add: After all: aren’t lazy imports basically syntactic-sugar around only doing the actual import once needed instead of on module import?
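
For concreteness, the pattern looks like this. The function name dump_report is made up for the example, and json merely stands in for a module that is slow to import.

```python
def dump_report(data):
    # The import runs on the first call, not when this module is imported.
    # Later calls find the module already in sys.modules, so they are cheap.
    import json

    return json.dumps(data, sort_keys=True)


print(dump_report({"b": 1, "a": 2}))
```

The cost is that the dependency is no longer visible at the top of the file, and an ImportError shows up at call time instead of at startup.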

Melendowski · December 5, 2022, 12:48am

Wasn’t this one of the original reasons for the PEP in the first place? I think the original (first one) thread OP mentioned this to some extent.

malemburg · December 5, 2022, 11:46am

IMO, a lazy import feature would be nice to have in Python, but in a way which is both easier to understand and safer to manage.

Some ideas in this direction:

make lazy imports explicit (e.g. use lazy import abc instead of import abc) and use a new keyword to make people aware
only work at the module import level, forget about making imported variables lazy (i.e. lazy import abc is fine; from abc lazy import xyz is not)
work with lazy import objects instead of going deep into module dict or object semantics

The above is enough to gain better startup time. Adoption will take a bit longer, since applications will have to start using lazy import, but it’s worth waiting, since it gives the community more time to adapt and safely start using the new feature.

The implications of such lazy import are sometimes difficult to understand, since imports often do have side-effects (e.g. registering plugins, monkey-patching other code, initializing external libraries/hardware, etc.), so turning to lazy evaluation can result in subtle issues (out-of-order execution, exceptions popping up in places where they are not expected, disabling feature detection via try-import-except, etc.).
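
Two of these issues can be shown with a small stand-in for a lazy import object. This is a sketch under stated assumptions: LazyModule is a hypothetical proxy class, not the PEP 690 mechanism, and no_such_accelerator_module is a deliberately nonexistent module name.

```python
import importlib


class LazyModule:
    """Hypothetical stand-in that imports its target on first attribute access."""

    def __init__(self, name):
        self._name = name
        self._module = None

    def __getattr__(self, attr):
        # Only reached for attributes that were not set in __init__.
        if self._module is None:
            self._module = importlib.import_module(self._name)
        return getattr(self._module, attr)


# Feature detection via try-import-except no longer detects anything:
try:
    fast_json = LazyModule("no_such_accelerator_module")  # nothing imported yet
    have_accelerator = True
except ImportError:
    have_accelerator = False  # never reached: creating the proxy cannot fail

assert have_accelerator

# The error appears later, wherever the name is first used:
try:
    fast_json.dumps
except ImportError as exc:
    print("failed at first use:", exc)
```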

PEP 690 goes way beyond the above in many areas (already discussed at length, so I won’t rehash them here), so rejecting it is for the better – sorry, Carl. It was still a good discussion, so thank you for all the work you put into this.

encukou · December 5, 2022, 12:26pm

(Replying as an individual developer – you don’t need to convince me, in the SC I try to read community consensus rather than assert my own views.)

I would support that.
Overall, I think we should make it easier for libraries to use lazy imports themselves, à la SciPy or Mercurial.
The current proposal is made for “applications” with a tightly controlled set of dependencies. Those are relatively rare in open-source code, and closed-source ones don’t have a good way to report bugs that only appear in a specific setup back to the libraries they’re using. And the libraries can’t test things themselves very well.
The PEP explicitly rejects explicit syntax as fragile “shallow” laziness, but doesn’t really give examples or evidence, just “we tried, it didn’t work”, so it’s difficult to say how well it generalizes outside their use cases. (Apologies for the very rough paraphrasing – it was said much more nicely of course.)

I don’t think lazy imports can be solved without either buy-in from the ecosystem at large (i.e. putting pressure on libraries), or “applications” spending effort to keep -L working. I worry that the people who’d turn on -L also can patch their CPython and their dependencies – at least for a few releases.

Porting to explicit lazy imports, library by library, would take time and effort, but might eventually give better results ecosystem-wide.
There might be a parallel with async – gevent’s patching worked immediately for an application, but was more fragile; async took/takes effort but is ultimately a better solution.

With explicit lazy imports, we could get away with rougher side effects, avoiding too much magic. Dicts could focus on being containers. Code that needs too much introspection or dynamic features simply wouldn’t opt in.

Why not both? Syntax as the preferred future way, and a function for older versions and dynamic imports. (Granted, dynamic imports wouldn’t be too useful outside testing the feature itself.)

steve.dower · December 5, 2022, 8:46pm

A lazy_import function could totally replace the module dict of the imported module, as well as the __import__ function within that module to do it recursively. This should be doable as pure-Python code as a demo, and then translated to a native extension for speed. The only difference would be that the first import has to be explicit, but since the target is applications anyway, that should be fine.

That said, I like @nas’s “module object as globals” idea too. Presumably a module object that supported __getitem__ would be substitutable for a globals dict, so it might even be a smooth transition, and would certainly allow for more flexibility here.