PEP 690: Lazy Imports Again

Would it be possible to make it easier to re-engineer this “otherwise valid code” to support lazy imports? It would be really good to collect a list of examples so that we can see whether there are better solutions.

Looking at the above comments, Marc-Andre mentioned registering plugins, monkey-patching, and initializing hardware. All of these could be made explicit behind a function call rather than happening implicitly on library import, and in some cases that might honestly be better design. In particular, registering plugins and monkey-patching depend on a particular import order, which is highly unintuitive given that many Python programs use tools like isort to sort their imports.
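To make that concrete, here is a minimal single-file sketch of what “explicit behind a function call” could look like. Every name in it is invented for illustration and not taken from any real framework.

# Minimal sketch: plugin registration behind an explicit call instead of an
# import side effect. All names here are invented for illustration.

# What the "framework" provides:
PLUGIN_REGISTRY = {}

def register_plugin(name, cls):
    PLUGIN_REGISTRY[name] = cls

# What a plugin module would contain:
class FancyPlugin:
    """Example plugin; its contents don't matter for the sketch."""

def register():
    # Instead of calling register_plugin() at module scope (an import-time
    # side effect that depends on import order and breaks under lazy imports),
    # expose registration as a function the application calls explicitly.
    register_plugin("fancy", FancyPlugin)

# What the application does:
register()              # explicit and order-independent
print(PLUGIN_REGISTRY)  # {'fancy': <class '__main__.FancyPlugin'>}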

I think it’s a fair point that lazy imports break code and that this may not be worth the optimization benefits. But how easy is that code to fix, and can we make it easier to fix? And is the code that lazy imports break mostly poorly designed anyway?


Quite honestly, if I got to redesign import, we would only allow importing at the module level. :grin:

And I don’t think I’d argue with you :slight_smile: It would result in a lot fewer people confounded by “where to patch” issues in tests…


Naive thought:

Could a small set of flags be added near the top of compiled .pyc files? Say the compiler were able to determine side effects, like what would be modified at module scope (or in another module’s scope), checking recursively through the modules it uses. If nothing fishy is set or patched, flag in the .pyc that the module can be lazily loaded next time. Then do something similar for frozen modules, etc.

It wouldn’t be as performant as always being lazy, but it could potentially work for all modules and give a performance boost to existing imports.

This way, modules would self-describe at compile time whether they can safely be lazily loaded, with no extra work for maintainers or users.

(We could have ways to force lazy or eager too for testing but generally it wouldn’t be the way folks would do things since regular imports would just work as expected.)
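To get a feel for what such a check would have to look at, here is a rough sketch over source code, with a deliberately crude rule that I am making up for illustration (a bare call statement or an assignment to an attribute at module scope counts as “fishy”). A real pass inside the compiler would work on the compiled code rather than the source and would need to be far more precise.

# Rough sketch of a conservative "does this module look safe to lazy-load?"
# check. Only top-level statements are inspected; the rules here are invented
# and much cruder than anything a real compiler pass would use.
import ast

SAFE_TOPLEVEL = (
    ast.Import, ast.ImportFrom, ast.FunctionDef, ast.AsyncFunctionDef,
    ast.ClassDef, ast.Assign, ast.AnnAssign, ast.Expr, ast.Pass,
)

def looks_lazy_safe(source: str) -> bool:
    tree = ast.parse(source)
    for node in tree.body:
        if not isinstance(node, SAFE_TOPLEVEL):
            return False
        # A bare call at module scope (e.g. register_everything()) is fishy.
        if isinstance(node, ast.Expr) and isinstance(node.value, ast.Call):
            return False
        # Assigning to an attribute at module scope (monkey-patching) is fishy.
        if isinstance(node, ast.Assign) and any(
            isinstance(target, ast.Attribute) for target in node.targets
        ):
            return False
    return True

print(looks_lazy_safe("import os\nX = 1\ndef f(): pass\n"))   # True
print(looks_lazy_safe("import sys\nsys.ps1 = '% '\n"))        # False
print(looks_lazy_safe("setup()\n"))                           # False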

Not different at all! (edit: assuming we are talking about PEP-690-style global lazy import mode)
I wish there were a way to support various platforms without adding a new dimension to the CI matrix.

Explicit lazy loading means all your users will import the library the same way. It’ll be a different way, probably a more complex way, but it replaces the old way. All your users (and CI runs) are now testing the new code paths.

As a user, I have no problem with from module import spam always being eager. My mental model of from module import... is that it is (almost) like

import module

spam = module.spam

so that seems pretty eager to me.

If you were to ask me what it means, conceptually, for from module import spam to be lazy, I would struggle to give a coherent explanation. How does the interpreter even know that spam exists, without eagerly loading the module?

If all names had to be declared before they could be used, then

  1. the interpreter could just look at the declaration; and

  2. it wouldn’t be Python :slight_smile:

I would be okay with a relatively simple model for lazy importing:

  1. Only modules and packages may be lazy: import module and import package.module may be lazy.

  2. from ... import ... is always eager.

  3. Non-module objects in the module cache are never lazy:


import sys

sys.modules['eggs'] = 1234

# later
import eggs  # always eager

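For what it’s worth, that model is roughly what the stdlib can already express today: importlib.util.LazyLoader defers execution of a module until its first attribute access, and it only operates on whole modules, which lines up with rules 1 and 2. This is essentially the recipe from the importlib documentation:

# Module-level lazy importing with the stdlib. Note it can only defer whole
# modules; there is no lazy equivalent of "from module import name".
import importlib.util
import sys

def lazy_import(name):
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)   # does not run the module body yet
    return module

json = lazy_import("json")        # nothing from json has executed yet
print(json.dumps({"spam": 1}))    # first attribute access triggers the real load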

How do you reason that?

The library author does not and cannot know the caller’s requirements. It is the caller, not the library, who knows whether eagerly loading the library at startup is too expensive. It is the caller, not the library, that can choose to delay loading the library by moving the import into a function.


All true, but it’s the library that knows whether it can be imported lazily or not based on how the code in the module is written. I don’t think anyone has discussed requiring lazy importing if a module declares it supports such a situation.


As an email user on Discourse, I’ve noticed that quoting seems to get stripped. Apologies if that happens again.

Steven D’Aprano (https://discuss.python.org/u/steven.daprano), December 7:

Glenn: Pushing the lazy choice to the library maintainers sounds correct.

How do you reason that?

The library is either tiny or large. If tiny, there is little difference between eager and lazy loads regarding performance.

If large, then whatever needs to be done to defer the cost past startup seems worth it. But it is the library author that has to code support for lazy loading. What I hear in the discussion is that lazy loading changes the semantics somewhat, and the library may not work if loaded lazily, apparently due to several technical issues, some of which can seemingly be worked around externally and some of which cannot (or the cost of doing so is too high in complexity or performance). Hence, attempts to lazily load some libraries have reportedly failed, and library maintainers could end up inundated with feature requests to support lazy loading.

If it were the library maintainers that made the decision to do the work to support lazy loading (automatically, instead of by application request), then they could promote that as a competitive advantage over libraries that do not support lazy loading.

It would also mean that no changes would be necessary in applications to benefit from the performance gains, other than using the upgraded version of the library that supports lazy loading, which is simple. There has been much discussion, for the various lazy loading proposals, about needing to update all the applications with the new syntax, and about not all applications benefiting.

Steven: The library author does not and cannot know the caller’s requirements. It is the caller, not the library, who knows whether eagerly loading the library at startup is too expensive. It is the caller, not the library, that can choose to delay loading the library by moving the import into a function.

The library knows how big it is, how expensive it is to load, and whether or not lazy loading would be a benefit.

Size isn’t the only factor determining import time. It’s also the amount of work done at module global scope that contributes to import time (sometimes that even overwhelms actual import time). Searching for the module probably also contributes. It would be interesting to know what factors contribute in what ratios for any particular import.
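One way to get a rough answer today: CPython’s -X importtime option prints a per-module breakdown of “self” versus cumulative import time to stderr, which at least separates executing a module’s body from importing its dependencies. A small example:

# Print CPython's per-module import-time breakdown for "import json".
# -X importtime writes its table to stderr; each data line shows a module's
# own ("self") time and its cumulative time, in microseconds.
import subprocess
import sys

result = subprocess.run(
    [sys.executable, "-X", "importtime", "-c", "import json"],
    capture_output=True,
    text=True,
)
print(result.stderr)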

This sense of a library being able to explicitly opt out of lazy importing could be interesting. If there were a low-cost way of marking a module as lazy-import-unfriendly, then it would always be eagerly imported regardless of the status of the lazy import flag. The default would be lazy-import-friendly. That doesn’t solve the unexpected-exceptions-at-first-use problem, but maybe it could point a way forward for that particular issue.

Maybe it’s the opposite: a library author can know that the library can’t be lazily imported, but it might be difficult to analyze whether it can be.

Is there a simple way that library authors can actually test this approach out? Bokeh is a cross-runtime library that registers all models for serialization in a metaclass on import, so that everything is “ready to go” and so that new models originating in the other runtime can be received correctly at all. I don’t really want to inflict obligatory but easy-to-forget boilerplate “init” function calls on every user of Bokeh in every instance of usage. Will lazy imports work for us? I am inclined to think not, but maybe I am wrong. Really, I have no idea, but I’d be curious to run the experiment.
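For readers unfamiliar with the pattern, here is a stripped-down, invented illustration of the general shape (this is not Bokeh’s actual code or API): registration happens in a metaclass, so it runs when the class bodies are executed, i.e. when the defining module is imported.

# Toy illustration of import-time registration via a metaclass; all names are
# invented, and the real machinery in a library like Bokeh is far more involved.
MODEL_REGISTRY = {}

class RegisteredModelMeta(type):
    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        if bases:                        # skip the abstract base class itself
            MODEL_REGISTRY[name] = cls   # runs when the class statement runs,
        return cls                       # i.e. when the module is imported

class Model(metaclass=RegisteredModelMeta):
    pass

class Circle(Model):       # in a real library this would live in a submodule
    pass

print(MODEL_REGISTRY)      # {'Circle': <class '__main__.Circle'>}
# If the module defining Circle were imported lazily, the registry would stay
# empty until something touched that module -- possibly too late if data
# referencing "Circle" arrives from the other runtime first.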

As an aside, I can’t say I love some of the comments in this thread that seem to imply that using a language behavior that has been established since forever, by choice, is now suddenly tantamount to “bad code” just because a new idea has friction with it, or that maintainers who have concerns that a switch to lazy imports would make their users (or themselves) suffer “don’t want to do the work”.


Is indirect (deep / recursive / forced / implicit / global) lazy import still being considered? Or only direct (shallow / non-recursive / opt-in / explicit / local) lazy import?

I.e. lazy import x in my code will only affect my code, not the indirect imports inside x?

This sounds reasonable then.

lazy import x   # _not_ affecting indirect imports inside x

def f():
    x.y()

Or is there library code that would break when lazy imported as above (no indirect lazy import) that is not already broken today like this:

def f():
    import x
    x.y()

Barry Warsaw (https://discuss.python.org/u/barry), CPython core developer, December 9:

Glenn: The library is either tiny or large. If tiny, there is little difference between eager and lazy loads regarding performance.

Size isn’t the only factor determining import time. It’s also the amount of work done at module global scope that contributes to import time (sometimes that even overwhelms actual import time). Searching for the module probably also contributes. It would be interesting to know what factors contribute in what ratios for any particular import.

Very true. “Tiny” and “large” can apply to space or time, of course. But as you point out, analyzing all the factors is relevant.

You wouldn’t have to: just call your init function in your top-level __init__.py file.

I don’t think anything is still being considered: the Steering Council rejected this proposal, and multiple people have said that this was the best proposal for lazy imports yet. I don’t think we will ever have a better one (one that satisfies all stakeholders).

If your code does anything as a side effect of import which influences code outside of itself, then lazy imports would break you. I.e. if you import A in order to influence module B, then lazily importing both modules may cause B to be imported before A is.
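A tiny, self-contained illustration of that failure mode (the body of the patching module “A” is inlined as a function so the example runs as a single file):

# Module A's only purpose is to patch another module (json, here) as a side
# effect of being imported. Its module body is simulated by _module_a_body().
import json

def _module_a_body():
    original = json.dumps
    # Imagine this sitting at the top level of module A: importing A makes
    # json.dumps always sort keys.
    json.dumps = lambda obj, **kw: original(obj, sort_keys=True, **kw)

print(json.dumps({"b": 1, "a": 2}))   # {"b": 1, "a": 2} -- patch has not run

_module_a_body()   # an eager "import A" would have done this up front
print(json.dumps({"b": 1, "a": 2}))   # {"a": 2, "b": 1} -- now patched

# Under lazy imports, "import A" alone would not execute A's body, so code
# that uses json before touching A still sees the unpatched behaviour.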

With the syntactic proposal that some have suggested, that’s correct. But there’s no real proposal at the moment as the PEP this discussion is happening on was rejected. No one has written up a new PEP to propose something different, so really who knows what it would do?