Alternative path for explicit lazy imports

A lazy import proposal, PEP810, was recently submitted.
This thread showcases an alternative path and compares the two.



1. How to teach this (and the way it actually works)

The workings are simple:

  1. Any executed imports (lazy or not) live in sys.modules
  2. Any import (lazy or not) first checks sys.modules before doing any work
  3. Thus, the user / developer can always observe and adapt to the current state of affairs
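A minimal sketch of points 1 and 2, using only the standard library. The module name demo_placeholder is made up for illustration and is not part of the proposal; the point is only that an import statement returns whatever object already sits in sys.modules:

```python
import sys
import types

# Hand-made module object registered under a hypothetical name.
fake = types.ModuleType("demo_placeholder")
fake.answer = 42
sys.modules["demo_placeholder"] = fake

# The import statement consults sys.modules first, so no finder or
# loader runs here and no file named demo_placeholder.py is needed.
import demo_placeholder

print(demo_placeholder is fake)
```

Anything placed in sys.modules, lazy or not, is what later imports of that name will see.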


2. Showcase

Note: import statements can easily be made to reify earlier lazy imports instead. Propagating them is only one option; it can be done either way, or both, which leaves a lot of flexibility to customise the behaviour. Propagation is simply what the initial concept ended up with.

from importlib.util import *
turtle = lazy_import('turtle')
print(type(turtle))     # lazymodule

# Any further non-eager imports simply pick up the `lazymodule`
# from `sys.modules`

import turtle as turtle2
print(type(turtle2))     # lazymodule

# `from module import attribute`
# reifies any pending lazymodules

from turtle import up
print(type(turtle))      # module
print(type(turtle2))     # module

This provides a mechanism to set lazy imports at the main application,
and all subsequent dependencies will simply pick them up from sys.modules:

# main.py
from importlib.util import *
tkinter = lazy_import('tkinter')
import submodule        # @ submodule: lazymodule
submodule.foo()         # I am not using tkinter

# tkinter is still lazy
print(type(tkinter))    # lazymodule

submodule.bar()         # I have used tkinter
# Not anymore
print(type(tkinter))    # module

# submodule.py
import tkinter

def foo():
    print('I am not using tkinter')

def bar():
    tkinter.ACTIVE
    print('I have used tkinter')

print('@ submodule', type(tkinter))

And if a package wants to enforce eager import, then it can be done explicitly:

from importlib.util import *
tkinter = lazy_import('tkinter')
print(type(tkinter))            # lazymodule

tkinter = eager_import('tkinter')
print(type(tkinter))            # module

Parent packages (if not already imported) are recursively made lazy as well:

from importlib.util import *
tkinter_ttk = lazy_import('tkinter.ttk')
print(type(tkinter_ttk))     # lazymodule

import tkinter
print(type(tkinter))         # lazymodule

Error handling:

In sync with non-lazy imports:

import sys
from importlib.util import *
try:
    lazy_import('no_turtle')
except ModuleNotFoundError:
    print("'no_turtle' package not found, install by $ pip install no_turtle")
    sys.exit(1)
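One way such a definition-time check could work, as a sketch rather than the PR's actual code: importlib.util.find_spec locates a top-level module without executing it and returns None when nothing is found. The helper name require_importable and the missing package name are hypothetical.

```python
import importlib.util


def require_importable(name):
    """Return the spec for a top-level module, or raise without running it."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ModuleNotFoundError(f"No module named {name!r}", name=name)
    return spec


try:
    require_importable("no_turtle_hypothetical_pkg")
except ModuleNotFoundError as exc:
    print("caught at definition time:", exc)
```

Note that this only proves the module can be located; errors raised while the module body runs would still surface later.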

Filtering:

More advanced filtering can be implemented,
but for now it simply excludes the highest-level package:

from importlib.util import *
exclude_lazy('tkinter')
tkinter_ttk = lazy_import('tkinter.ttk')
print(type(tkinter_ttk))     # module

import tkinter
print(type(tkinter))         # module

Syntax:

For now, these are plain functions, but they can be wrapped in dedicated syntax.
This is a benefit: the initial implementation is independent of syntax, which makes for a simpler and more modular design.


Bonus:

import time
from importlib.util import *
tkinter = background_import('tkinter')
print(type(tkinter))    # lazymodule
time.sleep(1)
print(type(tkinter))    # module
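
For illustration only, a background import can be approximated today with a worker thread. This is not the background_import from the PR: the hypothetical helper start_background_import below returns a Future instead of a lazy module, and colorsys merely stands in for a slow module.

```python
import importlib
import sys
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=1)


def start_background_import(name):
    """Begin importing `name` on a worker thread and return a Future."""
    return executor.submit(importlib.import_module, name)


future = start_background_import("colorsys")
# ... the main thread is free to do other work here ...
module = future.result()          # blocks only if the import has not finished
print(module is sys.modules["colorsys"])
executor.shutdown(wait=True)
```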


3. Implementation

101 lines of pure Python code: gh-140722: Improve lazy functionality of `importlib.util` by dg-pb · Pull Request #140723 · python/cpython · GitHub



4. Pros and Cons versus the path of PEP810

Cons:

  1. Does not support from module import attribute. However, PEP810 does not support from module import * either, so this is only a slight downgrade relative to functionality that is already incomplete, as opposed to something that provides a “complete syntactic substitution”. Also, nothing prevents extending this to support it, and it might be a very natural extension once more general “deferred evaluation” happens, which, given time, should ideally provide the best path for the more general problem.
  2. Slower - might not be suitable to just import everything lazy without consideration.
  3. Open to extend this list.

Pros:

  1. Simpler
  2. More explicit
  3. More interactive
  4. Provides more control
  5. Manages errors at definition
  6. Does not venture into black magic
  7. Solves async import problem off the shelf
  8. Makes use of all the good work done in importlib
  9. Does not require dedicated syntax for functionality to work
  10. Is orthogonal to any future endeavours in “deferred evaluation”

Do you mean PEP 810? PEP 801 is not available.


I think this is the key difference and the reason why your implementation can be so much simpler. Simple is good, so if this is sufficient to meet people’s needs, it’s a better design, IMHO.

The question is how common from x import y and from x import * are. Some analysis of public package code could measure how common they really are (look at the top X projects on PyPI, for example). To me, not supporting import * is not much of a problem. It’s kind of bad Python style anyhow.
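
A rough sketch of how such a measurement could be done with the standard ast module. The function name count_import_styles and the sample source are made up for illustration; a real survey would run this over every .py file of the packages being studied.

```python
import ast
from collections import Counter


def count_import_styles(source):
    """Count the three import styles appearing in a piece of Python source."""
    counts = Counter()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            counts["import x"] += 1
        elif isinstance(node, ast.ImportFrom):
            if any(alias.name == "*" for alias in node.names):
                counts["from x import *"] += 1
            else:
                counts["from x import y"] += 1
    return counts


sample = (
    "import os\n"
    "import sys as _sys\n"
    "from pathlib import Path\n"
    "from math import *\n"
)
counts = count_import_styles(sample)
print(dict(counts))
```

This counts statements rather than imported names, so from x import a, b is counted once.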

Not supporting “from x import y” is a pretty serious downside. That pattern is very common. So if I have a library that I want to convert to lazy imports, I’d have to change all of those “from” imports to “import” style. That’s a more invasive code change.

In recent years, my preferred style for imports has changed. Where I used to do:

from foo.bar import baz

baz()

I now prefer to do:

import foo.bar as _bar

_bar.baz()

Expecting people to change their import style in order to make use of lazy imports is maybe too much. OTOH, this restriction does simplify the implementation a lot so maybe a good idea.


Same here.

“Namespaces are one honking great idea – let’s do more of those!”

Big dependencies especially.

From what I have observed, module attribute access in Python is fast enough that the downside is hardly felt, while my global namespaces are cleaner and carry a lower risk of name clashes.
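
For anyone who wants to check this on their own machine, a small timeit sketch comparing a qualified call with a directly imported name. math.sqrt is just a convenient stand-in; the numbers depend on the interpreter and hardware, so none are quoted here.

```python
import timeit

# Same call, reached through the module namespace versus a bare name.
qualified = timeit.timeit("math.sqrt(2.0)", setup="import math", number=200_000)
direct = timeit.timeit("sqrt(2.0)", setup="from math import sqrt", number=200_000)

print(f"math.sqrt(2.0): {qualified:.4f}s   sqrt(2.0): {direct:.4f}s")
```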

True, I agree that it is useful.

But it can be added later, if some mechanism similar to PEP810’s is deemed appropriate.
And if the community can wait on this a bit longer, once more general “deferred evaluation” has moved forward a bit, the mechanism for this will be of higher quality.

And if that doesn’t happen, then exactly the same mechanism as in PEP810 can eventually be bolted on, while all the steps up to the actual attribute reification would work as described above, i.e.:

import sys
from importlib.util import *

up = lazy_import('turtle', attr='up')

print(type(sys.modules['turtle']))    # lazymodule

try:
    ACTIVE = lazy_import('no_turtle', attr='up')
except ModuleNotFoundError:
    sys.exit(1)

So many of the benefits would be transferred.
And the things that require “black magic” (or hopefully some more ingenious way in the future) would not span the whole “lazy imports” machinery, but would be limited to the lazy from module import attr pattern.

[quote=“dg-pb, post:4, topic:104608, username:dg-pb”]
And if the community can wait on this a bit longer, once more general “deferred evaluation” has moved forward a bit, the mechanism for this will be of higher quality.
[/quote]

There is no such proposal made as a PEP, and you don’t provide any proof that any deferred evaluation mechanism is actively worked on, so this argument makes no sense to me.

If you’re making a counter-proposal to PEP 810, it needs to stand on its own, handwaving to an idea of a proposal is not good enough IMO.

I also struggle to see where exactly this proposal is much better than PEP 810 but I don’t understand the import system that well.

What does this do with non-module objects explicitly inserted into sys.modules by a module that was previously lazily imported? (see e.g. typing.re or os.path)

Am I reading this correctly that this changes the behavior of all import statements, anywhere, in any file if the package is lazily imported anywhere else? If so, this is a lot of magic at a distance and will without a doubt cause bugs where imports are not happening as expected.


Thanks for sharing a concrete proposal and suggested implementation! This has been really interesting to read through and you may have nerdsniped me a bit with the draft PR … :sweat_smile:

While this is intended as an alternative to PEP 810, I would rather avoid splitting that discussion up into multiple threads (the existing thread is already long enough to make the SC’s job unenviable); so I’ll try to avoid comparisons and focus on your idea here on its own merits.

The potential for a single lazy_import() call to change the meaning of later import statements elsewhere in the code (including in other modules and third-party dependencies) makes it extremely hard to reason about what code does.

Imagine I write a library and use import statements to import my dependencies. Now any user of that library could, in their code, lazy_import() one of those same dependencies before importing my library. Suddenly the execution order of code in my library would change in ways that are basically impossible for me to predict and test for.

As the library developer, how would I fix this? If I have any non-trivial dependencies, it is not realistic for me to understand the implementation details of each of them well enough to know which (if any) import has any side effects that may cause issues if lazily imported[1]. So … would I need to replace all my import statements with calls to eager_import(), just to be safe?[2]

(Or the other way around: Let’s say my code imports two third-party libraries, a and b, in that order; and a also imports b internally. Then, whether b is lazily or eagerly imported in my code depends on an implementation detail of a.)

While more selective than PEP 690’s global approach to lazy imports, this shares some of the effects and issues cited in that discussion. The reasons cited in the rejection message for that PEP apply here as well.

That only applies to ModuleNotFoundError, though—right? Something like

try:
    lazy_import('my_module')
except ImportError:
    # handle as appropriate

might not work as in the eager case, because ImportError can be raised during execution of the imported module. (I.e., long after the lazy import.)
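
To make the timing concrete, here is a sketch using the existing importlib.util.LazyLoader rather than the proposed lazy_import. The local helper lazy_load_for_demo and the throwaway module broken_demo_mod are assumptions made up for this example; the module body raises ImportError, yet the error only appears on first attribute access.

```python
import importlib
import importlib.util
import pathlib
import sys
import tempfile


def lazy_load_for_demo(name):
    """Lazy-load a module via the stock LazyLoader (not the PR's function)."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ModuleNotFoundError(f"No module named {name!r}", name=name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)      # defers execution of the module body
    return module


# Throwaway module whose body fails with ImportError when it finally runs.
tmp_dir = tempfile.mkdtemp()
pathlib.Path(tmp_dir, "broken_demo_mod.py").write_text(
    "raise ImportError('optional backend missing')\n"
)
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

events = []
mod = lazy_load_for_demo("broken_demo_mod")   # the module is found, so no error yet
events.append("lazy import succeeded")
try:
    mod.anything                              # first access executes the body
except ImportError as exc:
    events.append(f"ImportError on first access: {exc}")

print(events)
```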

That may be an acceptable trade-off, if sufficiently documented, but it takes away some of the simplicity and makes it harder to teach and reason about. Suddenly, learners have to understand internal steps of the import mechanism to understand which types of errors may be raised at what stage of the import process.

This is not quite a fair comparison to me. from x import * has a few legitimate use cases, but I would consider it bad style in most situations and try to avoid it. (I can think of perhaps a handful of cases where I actually use it in my projects[3].) In contrast, I use from x import y very commonly; often multiple times per module.

In some ways, yes; but as noted above, in other ways things have the potential to become quite messy and have far-reaching side effects.

As above—only ModuleNotFoundError, not necessarily other ImportErrors.

In fact, to some degree it breaks the existing dedicated syntax (because users can no longer be certain whether an import statement is eager or not). Introducing a new function[4] and recommending it to replace an existing statement seems pretty unprecedented to me.

As @ajoino said above (and, I believe, a few people said in the PEP 810 thread), if there are no specific ideas currently being worked on, this is much too handwave-y to be of much use.[5]

Also, there are already some kinds of deferred evaluation in Python (e.g., with lambdas); as far as I’m aware those don’t share a fundamental mechanism with any lazy import implementation (existing or proposed), so it’s not a priori clear to me that any future deferred evaluation proposals are especially likely to do so.
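
As a small illustration of the lambda-style deferral mentioned here (names are invented for the example): the wrapped call runs only when the lambda is invoked, and nothing in the import system is involved.

```python
calls = []


def expensive():
    """Stand-in for costly work; records that it actually ran."""
    calls.append("ran")
    return 42


deferred = lambda: expensive()    # nothing is evaluated yet
ran_before = list(calls)          # snapshot: still empty at this point

value = deferred()                # evaluation happens only on the explicit call
print(ran_before, calls, value)
```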


  1. in more complex cases, I’d even have to consider the relative import order of multiple dependencies, which could lead to a combinatorial explosion

  2. Or call exclude_lazy() on everything I import, which I think is equivalent?

  3. including a few legacy instances that I’d like to clean up at some point

  4. in importlib.util no less, not even a builtin function

  5. It’s a fully generic counterargument that works against basically any change to the core language. “A built-in syntax for f-strings? Nah, let’s stick with str.format() in case there are any future endeavours in string interpolation.” :wink:


I am working on it. It is a slow burn and a long term commitment.

Exactly for the reason that I am exploring everything possible so as not to venture into implicit variable substitution into namespaces.

If you are interested, drop me a PM, I will give you a list of threads I have started and participated in over last few years.

Thus you can understand my frustration seeing that it was decided that it is a good idea to do it for something that technically is just a small subset of a more general concept.

If there was no other way to get benefits here, I would be fine, but to me it seems that it can be avoided. With some drawbacks, but also non-trivial benefits as well.

Could you please clarify?

This is how importlib.util.LazyLoader works already.
It has been around for a while and there have been no issues.
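
For readers who have not used it, this is the lazy-import recipe from the importlib documentation, lightly commented. The helper is called lazy_import there too, but it is the stock recipe and not the function proposed in the PR; colorsys is used only as a small stdlib module to demonstrate with.

```python
import importlib.util
import sys
import types


def lazy_import(name):
    """Documented LazyLoader recipe: register the module now, execute it later."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)      # only arms the lazy module, does not run it
    return module


sys.modules.pop("colorsys", None)   # start the demo from a clean state
colorsys_mod = lazy_import("colorsys")
was_lazy = type(colorsys_mod) is not types.ModuleType

colorsys_mod.rgb_to_hsv(1.0, 0.0, 0.0)   # first attribute access runs the module body
print(was_lazy, type(colorsys_mod))
```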

It is as much magic as usual import mechanism.
It adds module to sys.modules.
Just the module is lazy.

That said, this can be changed.
There is no restriction on it.
It can be made so that on simple import statements pending lazymodules in sys.modules are resolved.

I just thought that, if there are no issues with it, it is a nice possibility - given initiatives of making everything lazy.

E.g. if I want to import numpy lazily, but there is a 3rd party library that imports numpy eagerly, then that is it - I need to choose: lazy import or that library. This proposal leaves some space to explore the possibility of not needing to choose.

So just to sum up - this proposal is flexible in this regard.

See my reply to @MegaIng - this proposal is flexible in this regard - it is trivial effort to make it one way or the other.

Yes, without evaluating module, this is as much as possible to achieve - some pre-checks. But some, I think, are better than none. And I think that the fact that it checks whether something exists in import path at all is a step in the right direction.

Agreed, from this perspective, it is much easier to know that it never raises any errors at all.

I agree that from module import attribute is much much much more common. That wasn’t my point here. My point was that if it managed to provide “complete syntactic substitute”, including all bits and pieces, then I could see how implicit mechanisms employed could be justified.

How many of them are for slow-load-time-modules where lazy importing is needed?

Numpy is always np, pandas is always pd. scipy.stats is always sps. Personally, I have standard abbreviations for ~30% of standard library and all of 3rd party libraries that I use.

We are not talking about all the imports, only those subject to being lazily imported. And the proportion of those is small, thus I believe some compromises can be made to keep things simpler.

Not necessarily. You have used this argument in a few places already - my bad for not emphasising it in the main post.

Again, I am and have been working on this. Picked up where David Mertz left it and it is 1 out of 3-5 directions that I am working on. And to avoid resorting to implicit namespace variable setting, some proper breakthroughs are needed, thus I can not promise anything.

And if I do not come up with anything better, I will likely not propose anything that uses implicit variable setting techniques - in my opinion this is slightly beyond what should be acceptable.

What does currently happen with lazy_import('typing.re')? What happens if it’s later resolved fully?

What are the usage numbers? How many people have tried to use it but encountered issues because of action at distance?

The alternative proposal has close to zero issues in this regard because (assuming the global option isn’t used) no import x statement changes behavior. This proposal may change this behavior - the burden is on you to prove that this isn’t an issue.

ModuleNotFoundError

re is an attribute, not a module.
import typing.re doesn’t succeed either.
Maybe you meant something different and this is just not a good example for it?

It is flexible in this regard. If there is consensus that it should not change behaviour, it is trivial to adjust.

However, if the behaviour change seems attractive, having things a bit closer to what PEP690 attempted, but a bit more explicitly, then there is space to explore what it would mean.

For the time being, we can assume that it would not change any existing behaviour - it is easy to make __import__ resolve any lazymodule objects when they are encountered.

Or could just have 2 functions:

  1. lazy_import - standard import resolves lazymodule eagerly
  2. deeplazy_import - standard import picks up deeplazymodule from sys.modules as is without evaluating.

And those who wish can use (2) if all is working as expected and they are happy with the risks.

lazy_loader has adopted it and it seems to be a pretty popular library.

It works similarly:

import sys
from lazy_loader import load
np = load('numpy')

type(np)                      # <class 'importlib.util._LazyModule'>
type(sys.modules['numpy'])    # <class 'importlib.util._LazyModule'>

import numpy
type(numpy)                   # <class 'module'>

It resolves on a standard import only because the import machinery accesses the __path__ / __spec__ attributes, and those accesses trigger resolution. For recursive lazy imports to be possible, these accesses cannot trigger resolution, so this one change was all it took for imports to simply pick up whatever is in sys.modules.

The mechanics are now as they should be - attributes that do not need module evaluation should be accessible without triggering it.

But if consequences of this are undesirable, it can be changed back via explicit handling or made available via different way - e.g. my post above.

Just to point out that it’s entirely possible to have lazy imports with current python features that support from: GitHub - Sachaa-Thanasius/defer-imports: Lazy imports with regular syntax in pure Python.

I think you’re vastly underestimating the benefits of lazy imports as a language feature, and if people thought the existing way of doing this was sufficient, they would propose one of the bits of prior art that are pretty well tested, and used all over the place already (see SPEC 1) for stdlib inclusion rather than go through all of the effort of adding a syntactic feature.


I don’t think I do, nor do I think that I don’t. To me personally, this is not the point and I have no strong preference. Personally, I am ok with functions, and if dedicated syntax is not necessary, then it keeps the syntax cleaner.

Another argument against dedicated syntax is that there is scope for customisation of functionality - when errors are raised, whether simple imports propagate lazy imports or not, etc… So being able to provide arguments to a function has benefits. See signature of lazy_loader.load and various options noted in this thread.

But if syntax is desirable, then it can be added on any mechanism - the difference is that LazyLoader works just as well without it.

In either case, I am more about underlying mechanism and whether complexity at the level of compiler can be avoided by doing some trade-offs, in this case from module import attribute pattern.

I think that methodology of PEP810 would be more appropriate for a generic lazy expression. If the decision is to use the hammer as such with all of its consequences, then why limit it to imports?

What I did here adds submodule functionality - it solved the issue indicated in lazy-loader/src/lazy_loader/__init__.py at main · scientific-python/lazy-loader · GitHub . Which made capabilities of this approach a bit closer to what PEP810 offers.

Thank you for this. Seems like there might be some good ideas to extract. However, at first glance it seems quite heavy - doing ast rewrites, replacing all Finders and similar. Also, same issue:

"In a nested import such as ``import a.b.c``, only ``c`` will be lazily imported. ``a`` and ``a.b`` will be eagerly imported."

And also:

"The modules being imported must be written in pure Python. Anything else will be imported eagerly."

I think there are some good explorations going on.
I am not aware of many of them while I am sure PEP810 authors are much more informed.

But if what I did here offers a small piece of a puzzle towards making use of importlib.util.LazyLoader, then it could be the case that exploration in this direction is not yet over.

I think that if you want to have a chance of delaying PEP 810 based on it potentially interfering with future deferred evaluation proposals, rather than trying to convince people that this proposal is sufficient, it could be better to start a discussion around a concrete proposal of deferred evaluation, show where PEP 810 breaks it, and demonstrate that such breakage is worse than not accepting the benefits of PEP 810 that people perceive.


I find this idea extremely difficult to engage with productively because it is phrased entirely as a comparison with a different idea. I read this largely as a critique of PEP 810.

I see the assertion, said a few different ways, that this is “simpler” than PEP 810. I find that extremely doubtful. More likely it has a different set of trade-offs in terms of where complexity lives and who pays those costs.
This proposal has much worse locality of behavior than PEP 810. You could describe this problem in other ways (“not granular enough”, “too implicit”), but the basic fact remains that this is a proposal which makes it possible for an attempt to control the behavior of module A to impact the behavior of module B.

I do not think this is a pro or a con. It’s just a difference.

A new keyword which names something complex may be better than a more complicated way of doing things with existing syntax.

PEP 810 is as well. It is not being proposed as a lazy proxy technique for general use. It is only for imports.

The following usage is, and would remain, a SyntaxError:

x = lazy y()

The new lazy soft keyword would still be available for use in novel contexts, if someone were to make a proposal in the future.


Funny, I seem to recall having this same debate with the same person when discussing PEP 671. A vague future proposal that might some day eventuate is being used as an argument against an actual viable proposal now.


For me the fact that reifying a lazy import is as simple as

import foo

was a strong point of PEP 810.

And you’re discarding that, but I think things you get in return are nearly as valuable.


From my perspective, not supporting from imports is a huge downgrade, not a slight downgrade.

From imports - or some equivalent - are necessary in order to replace the workarounds and duplications packages currently have to do to export submodules in order to provide a nice API that works with static analysis[1]. Currently libraries that do want to implement lazy imports have to duplicate everything for static analysis as explained in SPEC 1.

This can be a significant maintenance burden and PEP-810 provides a way forward to eventually reduce this burden. Your proposal ignores this issue.

The other thing that’s ignored is ease of adoption. If someone wants to add PEP-810 lazy imports to an existing project the change is small and easy to understand (a 1 line __lazy_modules__ = [...] near the top of a module). Your proposal would require rewriting every import that wished to be lazy directly, including rewriting any from imports to use modules/submodules which is much more intrusive.


Personally, looking at this approach for use directly, it doesn’t provide any advantages (that I care about) over my current solution, and it lacks the exporting and logging capabilities I already have.


  1. while keeping import time reasonable
