Alternative path for lazy imports v2 and PEP810

Due to lack of time, I was only able to assess PEP810 rather superficially.
My conclusion at the time was:

  1. It has no critical issues
  2. It does deliver what it promises with good performance
  3. It has sharp edges regarding convenience, and it is quite a drastic paradigm shift in general

This continues from Alternative path for explicit lazy imports - #41.
I now have an alpha version of an alternative, which is pure Python and purely external.
I have hit most of the milestones, and hopefully this can shed a bit more light on the matter:

  1. I have a working thing
  2. It does everything that PEP810 does, except the syntax and from ... import ...
  3. PEP810 is only 1 of 2 paradigms (both of them work nicely together)
  4. It is compatible with 3.8 - 3.14
  5. I have tested it on scientific apps, various plugins and scripts

So although it most likely still has undiscovered issues, it is largely functional.

1. Functionality

API:

  1. import_module(name, lazy=-1|0|1|2|3|4, mby=False)
  2. lazy_imports_on(deferred=False)

Aliases for import_module are:

import_eager = partial(import_module, lazy=-1)
import_maybe = partial(import_module, lazy=-1, mby=1)

import_lazy = partial(import_module, lazy=1)
import_background = partial(import_module, lazy=2)

import_deferred = partial(import_module, lazy=3)
import_background_deferred = partial(import_module, lazy=4)

Notes:

  1. (1) and (2) work in sync, sensibly.
  2. Both paradigms are implemented:
    1. LazyLoader-like (does checks at definition)
    2. PEP810-like (leaves raw proxy in place without checks)
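
For reference, the LazyLoader-like paradigm can be approximated with what already ships in the standard library. The sketch below follows the lazy-import recipe from the importlib documentation; `lazy_import` is an illustrative name, not the `import_module` API listed above (that code is unpublished), and the early return for already-imported modules is an added assumption.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module whose execution is postponed until first attribute access."""
    if name in sys.modules:
        return sys.modules[name]
    # The finder runs now, so a missing module is reported at definition time.
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ModuleNotFoundError(f"No module named {name!r}", name=name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    # With LazyLoader, exec_module only arms the module; the module body
    # runs on the first attribute access.
    loader.exec_module(module)
    return module


colorsys = lazy_import("colorsys")               # found, but not executed yet
hue, _, _ = colorsys.rgb_to_hsv(1.0, 0.0, 0.0)   # first access triggers the load
```

The point of interest is the `find_spec` call: the "check at definition" is what lets a missing module fail immediately rather than at first use.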

2. Implementation

<1000 lines of code
But ~60% of that replicates importlib._bootstrap.
It could also use some refactoring.
I would estimate roughly 300 lines of code to implement it in CPython.

3. Couple of examples

import sys
lazy_imports_on()

import tkinter.dialog as tkd
print(type(sys.modules['tkinter']))  # <class 'lazymodule'>
print(type(tkd))                     # <class 'lazymodule'>

import sys
lazy_imports_on(deferred=True)

import tkinter.dialog as tkd
print(type(sys.modules['tkinter']))  # <class 'deferredmodule'>
print(type(tkd))                     # <class 'deferredmodule'>

The PEP810-like approach (deferredmodule) still leaves proxies in sys.modules, but they are as raw as possible: they do no work in advance and raise no errors.
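
The "raw proxy" idea can be illustrated with a toy class. This is not the unpublished `deferredmodule` type; `DeferredModule` and `defer` are hypothetical names, and the sketch only shows a proxy that consults no finder and raises nothing until it is first used.

```python
import importlib
import sys
import types


class DeferredModule(types.ModuleType):
    """Hypothetical raw proxy: created without touching finders or the file system."""

    def __getattr__(self, attr):
        # Reached only for attributes the empty proxy does not have yet.
        name = self.__name__
        if sys.modules.get(name) is self:
            del sys.modules[name]              # let the real import machinery run
        real = importlib.import_module(name)   # errors surface here, at first use
        self.__dict__.update(real.__dict__)    # later lookups bypass __getattr__
        return getattr(real, attr)


def defer(name):
    """Register a proxy in sys.modules unless the module is already there."""
    return sys.modules.setdefault(name, DeferredModule(name))


missing = defer("no_such_module_for_deferred_demo")   # no error at definition
try:
    missing.anything
except ModuleNotFoundError as exc:
    first_use_error = exc                             # raised only on first use
```

Compared with a LazyLoader-style module, the trade is visible here: nothing is checked up front, so a typo in the module name goes unnoticed until the first attribute access.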


4. Performance

Performance on local machine is nearly identical (only 7% slower than PEP810):

           pass  code    LL    D  pep810
----------------------------------------
3.13     :  190   560   420  410
3.14     :   25   260   140  140     130

LL - LazyLoader-like
D - deferred (PEP810-like)

5. Considerations

So, by now, I like LazyLoader much more.

  1. It feels much more native and Pythonic.
  2. Minimal code change needed to adapt
  3. It is much more extensible
  4. It integrates to existing machinery rather smoothly
  5. It does leave lazy objects in sys.modules, so any subsequent standard import will just retrieve them. But I think that is a good thing: it provides extra performance opportunities, and it still feels like a part of imports.

The PEP810 approach is a paradigm shift and feels rather alien.
None of the applications “just worked” - it is quite a big extra dimension to keep in mind.
Its main advantage is that it avoids the file system operations that are costly for remote imports.
Thus, I find its approach complementary.
For now.
In the long term I will aim to solve this with specialised caching and keep only the LazyLoader-like approach.

6. My conclusions

By now, it is clear to me what my intuition was telling me initially:

  1. There is a lot more to explore in this direction. To me it feels like this direction in general is expansive - the more I dig, the more opportunities I find. E.g. the checks that are done at definition could be defined by the Loader and calibrated per type.
  2. And PEP810 is a premature commitment.

It:

  1. Reserves the syntax
  2. Locks in a new paradigm, which is rather alien - none of the code just works after switching the global setting on
  3. Has limited functionality
  4. Provides few opportunities to extend. Or at least not easily.

Alternatively, the organic growth would look like:

  1. Incrementally develop a functional variant without committing to from ... import ...
  2. Slowly explore variations and different needs
  3. Once it has matured enough (with involvement of the community), nail down the syntax
  4. Finally add from ... import .... Either:
    A. If a more general “deferred evaluation” is ready - use it
    B. Implement a more generic version of PEP810 reification: a LazyObject(func, *args, **kwds) that gets reified on first use and can be used for anything.

Note, before (4) sys.modules would inevitably be populated, but after (4) anything can be done.
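
To make option B more concrete, here is a minimal sketch of what a generic `LazyObject(func, *args, **kwds)` could look like. It is an assumption about the shape of the idea, not an existing API: the `reify` method name is invented here, and unlike PEP810 reification this proxy only forwards attribute access and never rebinds the name that refers to it.

```python
import functools


class LazyObject:
    """Hypothetical generic proxy: calls func(*args, **kwds) once, on first use."""

    def __init__(self, func, *args, **kwds):
        self._thunk = functools.partial(func, *args, **kwds)
        self._ready = False
        self._value = None

    def reify(self):
        if not self._ready:
            self._value = self._thunk()
            self._ready = True
        return self._value

    def __getattr__(self, name):
        # Only called for names not found on the proxy itself.
        return getattr(self.reify(), name)


calls = []


def expensive(text):
    calls.append(text)
    return text.upper()


lazy_text = LazyObject(expensive, "reified on demand")   # expensive() has not run
shout = lazy_text.lower()                                # first use triggers it
```

A lazy module would then be the special case `LazyObject(importlib.import_module, name)`, which is why the same mechanism could serve other uses as well.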


PEP810 has been accepted already.
I don’t have hopes for things to change.
But I have raised my concerns.

And I wanted to follow through.
The time provided did not allow for anything proper.
So this is delayed.

Hi, I haven’t followed your previous thread against PEP 810 too closely, so forgive me if I’m asking repeated questions.

This seems backwards. How does a function seem more “native” than a keyword? Would you consider using __build_class__ to be more native than the class keyword as well?

"lazy " is an additional five characters to any existing import, whereas “from importlib.util import lazy_imports_on; lazy_imports_on()” is 61. You’d need at least 13 imports in a file for this to start saving characters.

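The character count above can be checked mechanically. In this small sketch the one-liner is copied from the paragraph above; `importlib.util.lazy_imports_on` is the hypothetical spelling used there, not an existing function.

```python
import math

# One-off cost of the global switch versus the per-import cost of the keyword.
global_switch = "from importlib.util import lazy_imports_on; lazy_imports_on()"
keyword_cost = len("lazy ")

break_even = math.ceil(len(global_switch) / keyword_cost)
print(len(global_switch), keyword_cost, break_even)   # 61 5 13
```
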
What additional features are you looking for? I can’t think of anything else that could be added to lazy importing.

Why does this matter for CPython? I’d see this as a selling point for a library that developed their own lazy importing implementation, but there’s not much use in “integrating to existing machinery” in CPython, since the point of a PEP is to change that existing machinery.

What performance opportunities do you have in mind? I find it quite rare for built-in language constructs to be slower than hand-built code.

Well, I don’t think we had any plans of using lazy import for something other than lazy imports.

…which is why the global setting isn’t enabled by default. How would your solution solve this problem?

Yeah, again – what functionality are you looking to add?

Could you perhaps share a link to some repo with the implementation and tests you ran?

Also, I think it’s over. PEP 810 has been accepted, as you said, so why change anything about this? It could however easily be a useful tool for 3rd party code where 3.15 (PEP 810) is not yet supported (as it is not the minimum version).

The approach itself feels more integrated into Python’s native import machinery.

Syntax is an orthogonal matter.

By adapt, I mean the changes needed to existing codebases to turn the global setting on.
Not the syntax to turn it on.

The way of turning the global setting on is also orthogonal.

I am experimenting with:

  1. Custom per-loader checks
  2. Customisable global background imports
  3. A maybe flag - there is another thread about it (I think it is called optional there)

And I have only been looking at this for a couple of weeks.

In general, this approach builds on and integrates into existing import machinery, which allows much more flexibility. There are things that I have in mind currently and there are things that are not yet known.

If this were the transition to all imports being lazy, then it would make sense, but the PEP discourages “global lazy imports”. So it leaves a fracture: the existing import system, which is flexible and extensible, and a bolt-on lazy mechanism, which discourages global imports and doesn’t make use of anything from the former.

From my experimentations I just gathered an opinion that building on top has benefits.

You have a dependency, which does eager import.

That dependency does:

import numpy

With the LazyLoader-like approach (or any approach that sets an entry in sys.modules) you can set the module lazily in advance, and the dependency will just pick up the lazy one.

Without this, a single dependency that has to be imported early and imports eagerly and extensively can pretty much eradicate all the benefits.
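
The mechanism this relies on is that a plain import statement returns whatever is already registered in sys.modules. A minimal sketch, in which the empty placeholder stands in for a lazy module and `heavy_dep_demo` is a made-up name:

```python
import sys
import types

# "heavy_dep_demo" is a made-up name; no such package needs to exist.
placeholder = types.ModuleType("heavy_dep_demo")
sys.modules["heavy_dep_demo"] = placeholder


def third_party_code():
    # Stands in for a dependency that does an ordinary eager import.
    import heavy_dep_demo
    return heavy_dep_demo


picked_up = third_party_code()   # the import statement returned the placeholder
```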

Me neither.

It does minimal check and doesn’t break:

except ModuleNotFoundError:

which is most common.

From my experience, new needs arise with time.
And PEP810 is backing this into a very tight corner.

Code is in alpha state and unpublished. It builds on my PR in my previous thread naturally.

My intent here is high level considerations.

Benchmarks:

Python startup timings:
    ```bash
    _CXSRC='import sys; sys.path.append(...)'
    _CXON='from cx.aaa import _gm; _gm.lazy_imports_on(0)'
    _CODE='from cx.aaa import tmr; tmr.repeat'
    _CODE2='import numpy; import scipy.stats'

    timeit 20 python -c pass
    timeit 20 python -c "${_CODE}"
    timeit 20 python -c "${_CXON};${_CODE}"

    timeit 20 ./python.exe -c pass
    timeit 20 ./python.exe -c "${_CXSRC};${_CODE}"
    timeit 20 ./python.exe -c "${_CXSRC};${_CXON};${_CODE}"
    timeit 20 ./python.exe -X lazy_imports=on -c "${_CXSRC};${_CODE}"

    # -----------

    timeit 20 python -c pass
    timeit 20 python -c "${_CODE2}"
    timeit 20 python -c "${_CXON};${_CODE2}"

    ```
               pass  code    LL    D  pep810
    ----------------------------------------
    3.13     :  190   560   420  410
    3.13npsp :  190   990   210  210
    3.14     :   25   260   140  140     130
    ----------------------------------------
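
The `timeit 20 ...` helper used above is a shell function that is not shown in the post. For anyone who wants to reproduce this kind of measurement, here is a rough Python equivalent. It is an assumption about what such a helper measures (wall-clock time of a fresh interpreter running a snippet), and it reports milliseconds, which may or may not be the unit used in the table.

```python
import statistics
import subprocess
import sys
import time


def startup_ms(code, runs=20, python=sys.executable, extra_args=()):
    """Median wall-clock time in milliseconds of `python [extra_args] -c code`."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([python, *extra_args, "-c", code], check=True)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


baseline = startup_ms("pass", runs=5)                  # interpreter start-up only
eager = startup_ms("import json, decimal", runs=5)     # start-up plus eager imports
```

Interpreter options such as the `-X lazy_imports=on` flag from the commands above can be passed through `extra_args` on a build that supports them.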

Tests are mostly on my own projects:

  1. One a bit heavy with scientific libraries
  2. On the Sublime Text editor and all of the plugins (edit: Sublime Text is not my project. Just plugins :))
  3. Various productivity apps that pretty much import all of the stdlib plus some utility packages

Because it is never too late until it is too late.
Current “too late” is defined by bureaucracy, not actuality.
And I am not part of that game, I just do what I do.
I did some stuff on the topic and sharing it.

This is subjective, but I strongly disagree. In my eyes, an arbitrary function that has side effects on the import system feels about as far away from “native” as you can get.

Hm? PEP 810 provides several ways to enable global lazy imports without code changes. But, this shouldn’t be a goal anyway; the rejection of PEP 690 clearly explains why we can’t have global lazy imports.

Could you elaborate on what the first two do? For optional imports, I don’t see how that’s related to lazy importing, other than the syntax being inspired by PEP 810.

Please provide concrete examples of where PEP 810 could cause damage, not abstract ideas that you’ve been experimenting with. I could claim that I’ve been experimenting with adding a new inquisition import syntax that just so happens to conflict with PEP 810, but the SC isn’t going to reconsider their decision until I’ve shown that my feature is beneficial and outweighs the demand for lazy imports.

I think you have the idea that the existence of lazy import will make all other solutions obsolete, but that’s not true. Let’s say that you’re right and PEP 810 creates some sort of fragmentation – there would still be nothing forcing people to use it. If PEP 810 turns out to be a disaster, then people simply won’t use it, and nothing will have changed for the ecosystem. Then we can consider new options, such as what you’re proposing here.

You can still do this, can’t you? Nothing is forcing you to use lazy import in your code if you don’t want to. If you want to roll your own solution, go for it!

Then why bring it up?

Ah, this doesn’t seem like a very lazy import. To retain ModuleNotFoundError behavior, I’d imagine you still need to make filesystem calls, which aren’t exactly fast.

Sure, but I don’t think we need to delay PEP 810 on this basis. Realistically, you can argue this for any change, but the language would never evolve if we rejected every PEP with the reasoning “this might prevent the Python ecosystem from adopting a hypothetical feature in 20 years”.

If PEP 810 really does cause issues someday, we can deprecate the syntax and add something new. But for now, let’s enjoy the new feature.

I’d love it if you could share the code! Since this works on older Python versions, that seems like an immediate and big win (all my CLIs use LazyLoader nowadays, and deal with the jank).

As for PEP 810, I don’t love the new syntax either, but I also understand the ease and simplicity of it. And you’re also right it’s uncustomizable beyond “lazy” (background importing sounds really fun), but for most common cases I’d bet whatever it does is “good enough”.

As someone who has skin in neither game, I actually think Python is big enough for both. I don’t see these as competing proposals, I think they are complementary.

Want ease and simplicity and don’t care too much about the details? Use the syntactic sugar!

Want better control and options over lazy loading? (And want to be backwards compatible) Use the library/function!

To me, it’s like class vs type(…) or @decorator vs foo = decorator(foo) or (probably some example from typing, that shit evolves very quickly and I can’t keep up).

I’d happily look into using a library if it was on PyPI, regardless of PEP810 (and having a library people can play with strengthens your case as well)

I have evidence that global lazy imports can work well.

The LazyLoader approach worked smoothly on a number of applications with minimal adjustments.

While PEP810’s global option is much more fragile and prone to errors.

All that PEP810 does is move the import from within the function to the global scope. Not only does the design itself go against Python’s philosophy, but it also obstructs the path to a solution which might just integrate well.

File system calls are not an issue for local imports, as can be clearly seen in the benchmarks.

If it were implemented in CPython, it would be possible to implement specialised caching for each remote location and have customisation at the ImportFinder level.


After experimenting with both, I have strongly converged towards LazyLoader for global lazy imports.

Exactly my point: there is a system which might just work nicely for global imports, while the syntax is being reserved for something that does nothing but bring the import from inside the function to the global scope, while leaving errors to be raised inside the function.

PEP810 discourages global imports.
And existing codebases need quite a lot of changes for error handling.
LazyLoader approach reduces needed changes by at least 90% (rough estimate based on my experience with both variants).

It is just more elegant and easier path to have unified import system.

optional import foo
optional lazy import bar

These would compose naturally because they use the same underlying mechanism, not two separate systems.

If 2 paradigms are out of sync, then unification is only at the level of syntax - it is a fracture which might not be necessary.

“Custom per-loader checks”, I mean per-finder, to address remote import performance when doing checks at import definition. But it is difficult to do without integrating into CPython.

If an import is only lazy, there is no initial cost, but there will still be a delay at the time of reification. Background imports attempt to address this.
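
As an illustration of the general idea only (the actual background mode is unpublished), a background import can be sketched with a worker thread. `import_in_background` is a hypothetical helper; the sketch relies on the import system's per-module locking to keep a concurrent import of the same module from the main thread safe.

```python
import importlib
import sys
import threading


def import_in_background(name):
    """Start importing `name` on a daemon thread and return the thread."""
    worker = threading.Thread(
        target=importlib.import_module, args=(name,), daemon=True
    )
    worker.start()
    return worker


worker = import_in_background("colorsys")
# ... the main thread keeps doing other start-up work here ...
worker.join()      # only to keep this demo deterministic; real code would not block
import colorsys    # already in sys.modules by now, so this is a cheap lookup
```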

But this is beside the point, which is that there is space for further development. And if there is one unified import system then it is possible to evolve. PEP810 makes it difficult.

As I said, this is not about pinpointing the error that breaks something, but about considering the full picture with all trade-offs. If there is something you want me to clarify so you can better understand my POV, let me know.

Yes, but the syntax is reserved by then.
Of course, can use a different one then.

defer import module

But then there are 2 syntaxes for lazy imports, deprecation processes, and many individuals unsatisfied with the whole mess. In short, nasty business, which itself becomes one of the counterarguments against implementing alternatives.

So yes, this is exactly what I am doing, implemented it and very happy with it so far.
I am just reporting my insights.
It can just as well be 3rd party package.
And once macros roll out, can even have syntax for it.
The main issue is that integration of it into CPython would make it possible to bring it 1 level up, such as “specialised per-ImportFinder” caches for remote imports. And also, user experience would be smoother if it was just one native import system that just cut it well.

You did. The implications of “reserving the syntax” have nothing to do with what you said.

The question is not whether it qualifies to be lazy by some definition, but whether it has more benefits.

To be precise, LazyLoader approach is “eager minimal checks, lazy load”.

Yes, integration into CPython would be useful to address expensive remote system calls. There is no issue for local imports.

It isn’t 20 years. This could be managed for 3.15.
As I said, if there is no progress in innovations on the “deferred evaluation” side, then a reification system similar to PEP810’s can be bolted on while preserving the features of the LazyLoader approach. Also, a nice touch would be to make it generic, so that it can be used for different applications.

Yeah, likelihood of this happening…
A lot of companies with dependency hell will adopt it.
They have resources to do regime changes and will not care as long as it solves the issue.
The resistance to remove it once it is there will be unmanageable.


What I am saying is:

We might be choosing the wrong paradigm for lazy imports, and this choice will have long-term consequences.

Thank you for the reply. I agree with many points to a certain degree.

If this gets to the level where it can make a difference, I will share it.
Otherwise, it will ship in due time as part of larger library.

These are 2 different questions:

  1. What is optimal path for CPython and what is the cost of not getting it right?
  2. Given certain path that CPython has taken (in this case PEP810), what are implications for alternatives?

Here I am mainly concerned about (1).

Regarding (2), time will tell if all the nuances can be addressed in some way without integration to CPython.


What is almost certain is that:

  1. Even if all the nuances can be addressed without incorporation into CPython, it will be much less elegant and robust with a lot of code replication. Integration into CPython would look like couple of lines of code in several functions, while independent project means copying all of those functions and keeping them in sync.
  2. Even if it works extremely well and it proves to be beneficial enough for incorporation, the fact that PEP810 is merged will be an obstacle in many ways.

Where? I have evidence that they can’t: PEP 690 (Lazy Imports) on peps.python.org.

Please be aware that the speed of filesystem calls is very dependent on the machine. It will vary greatly based on the storage device and the filesystem in use.

Existing codebases don’t need to do anything. Once again, they are not forced to use PEP 810.

I’m looking at the trade-offs, and I absolutely think that the benefit of PEP 810 outweighs them at the moment. I’d understand your point-of-view more if you focused more on why your solution works better rather than attacking PEP 810 with vague arguments.

We have plenty of time before the beta freeze, so please come up with something that concretely has benefits over lazy import. So far, I’ve seen “it makes slow OS calls so it can raise ModuleNotFoundError”. What else?

Ok, sure, but again, this is a hypothetical “what if?” that can be applied to almost anything. If we decide to rescind PEP 810’s acceptance and roll your solution, one could still argue that “maybe this is the wrong paradigm”.


I’m finding myself getting frustrated, so I’m going to remove myself from this discussion. I don’t feel that any progress is being made, and it seems to me that you have some vendetta against PEP 810. Please try to engage this further as a “I have a better idea” discussion instead of a “this other idea is bad” discussion.

That PEP’s rejection shows one specific approach had issues, not that all global lazy imports are impossible. My working implementation demonstrates a different approach that avoids many of the problems.

I’ve shared performance benchmarks, compatibility evidence across several Python versions, and specific architectural advantages.
The POC exists and works.
The global option simply substitutes the mechanism from gh-140722: Improve lazy functionality of `importlib.util` (python/cpython pull request #140723 by dg-pb) for every simple import.
Integration follows different routine, but the effect is exactly the same as manual substitution would be.

All information I have provided is clear and unambiguous, unless you have the reason to think that I am lying.

That is fair, thus specialised caching might be needed even for local imports.

Language features have ecosystem-wide impact even when optional. They create precedent, reserve syntax, and influence future development directions.

New information improves estimate quality (on average), not reduces it.

Of course, exactly, this is why I did this, to gather as many perspectives on the matter as possible. I looked at PEP810, implemented LazyLoader approach and also similar approach to PEP810 and spent time experimenting with different applications.


My initial unease was rather strong.
And it is now further supported by having implemented both approaches.
The more I’ve built, the more convinced I’ve become that LazyLoader offers a better foundation for lazy imports and there is a path for smooth integration.

This thread is to share my findings.
And to respond if there is any interest.

It seems to me pretty clear that there’s little interest if you insist on framing this as something to replace PEP 810.

If you want to make it available as an alternative, then from what you’re saying it doesn’t require any core changes, and can be published as a 3rd party package on PyPI. Interest can then be determined by how many users that package gets. There is existing evidence that solutions like importlib.util.LazyLoader aren’t sufficient to satisfy people, so you can assume that unless you get more users than those existing solutions, your approach is also not enough. You can also interact with your users, who by definition will be interested in your approach.

If all you want to do is keep debating the question here, rather than making what you have suitable for production use, then I, for one, am certainly not interested, and I don’t know what you hope to gain from such a discussion (PEP 810 won’t be withdrawn, no-one will start using your alternative, and people will simply get tired of endless discussion with no conclusions, and drift away).

I’m not going to go over the points I made in the previous post again, but I will point one specific thing out.

I consider this a negative. I think it’s much clearer that if something is in sys.modules then it has been fully imported. Although there are tools that may put deferred/lazy modules in there, they are currently the exception rather than the norm.

There is a useful pattern that checks sys.modules to see if a module has been imported in order to check the type of something without importing the module itself. The logic being that if the module hasn’t been imported, then the object can’t be something that has to have been created by that module.

It looks roughly like this:

import sys

def is_dataclass(obj):
    if dataclasses := sys.modules.get("dataclasses"):
        return dataclasses.is_dataclass(obj)
    return False

Here’s an example from annotationlib and one from dataclasses doing a similar thing.

By putting the lazy object in sys.modules these checks will now retrieve the lazy module object and trigger the import they were written to avoid.

Well yes, of course. This thread is about possibility of it becoming a norm.

I can not see an issue.
The import was called.
The module is in sys.modules.
It detects it and calls appropriate function.

This is in line with what would happen without lazy imports.
And there are some assurances that module will most likely succeed importing as Finder has done its work.

While if the import was called, but the module is not in sys.modules, the above code does not function in any remotely predictable manner compared to how it worked with standard imports.

So before and after it does the same thing.
But as the import is lazy, it is delayed until it is needed.
So far all is consistent and well behaved.
If one wants to take extra advantage, then extra explicit check can be added:

def is_dataclass(obj):
    # Parenthesise the walrus so `dataclasses` is bound to the module, not to the result of `and`.
    if (dataclasses := sys.modules.get("dataclasses")) and not getattr(dataclasses, '__lazy__', False):
        return dataclasses.is_dataclass(obj)
    return False

I think whether it is a good idea depends on implications and results and not on whether it is currently a norm. I still have reservations, it can indeed feel a bit unnatural, but I haven’t encountered any issues yet - just turned global on and everything works as predictably as it can be expected.

And it is debatable what is further from the “norm” - setting in sys.modules or not. Having called the import and not seeing any effects of it, IMO, is at least as “non-norm” as this.

Thank you.
This is exactly the kind of discussion that I think is beneficial: concrete tradeoffs between different approaches.

Yes, my assertion is that it should not become the norm, and this is an example of an existing pattern that is broken by it.

The function works perfectly consistently if a PEP-810 lazy import has been set up but not reified.

If lazy import dataclasses has been used somewhere but dataclasses is not in sys.modules, then the object being checked can’t be a dataclass: if it were, the lazy import would have been reified and dataclasses would be in sys.modules.


Agree or not, I’m not going to make any further points in this thread. Ignoring from imports makes this less useful to me than what I already have.

But nothing is broken.

I’d argue PEP 810 creates a larger semantic break - an import is called but leaves no trace in the system’s primary import state. At least with lazy modules in sys.modules, the operation’s effects are visible, just delayed.

I would argue that it works more consistently. And extra step to take benefit of lazy import in this specific instance is reasonable, being rightfully explicit.

This is about the backend, not API.

As I have noted, syntax is orthogonal and same trick from PEP810 can be applied for this as well to have from.

The intent to rule out an object being a dataclass based on the module that defines dataclasses not being loaded is broken by reifying the unloaded module. Rather than quickly determining that the object is not a dataclass, the code causes an otherwise unneeded module to be fully loaded. It is not broken in the sense that an error occurs; it is broken in the sense that the wrong thing happens.

It does not “break” the code.
One more step is needed to achieve what is intended.
And given that design is understood this is expected behaviour.

Note that when the code was written, there was no intent to predict how lazy imports would apply to this case. Thus, no intent (or code) is being broken.

But this is regarding that specific example. The issue might exist for patterns that work the other way round, i.e. that assume certain functionality is present given that something exists in sys.modules.

However, this is an anti-pattern, as None is a valid entry in sys.modules (see section 5, “The import system”, of the Python 3.14.0 documentation).

Thus, any cases that assume that the module is loaded given there is an entry in sys.modules are technically bugs.
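
The documented behaviour of a None entry is easy to demonstrate. In this short sketch `blocked_demo_module` is a made-up name; the entry exists in sys.modules, yet nothing is loaded and importing the name fails.

```python
import sys

sys.modules["blocked_demo_module"] = None   # made-up name, used only for the demo

entry_exists = "blocked_demo_module" in sys.modules   # an entry is present
try:
    import blocked_demo_module
except ModuleNotFoundError:
    import_failed = True    # a None entry makes the import fail
else:
    import_failed = False
finally:
    del sys.modules["blocked_demo_module"]
```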

They all need to do additional checks. And given that a lazy import system would also allow lazy module storage, this would be one additional check required.

Yes, in the scope of standard library, one can expect that certain modules will never be None, however, these are very rare (and I would argue deserve extra handling for the sake of consistency):

cd .../cpython/main
g 'if.*(?<!not )in.*sys\.modules' | grep -v import | grep -v test | grep -v rst | wc -l

7

It is already a pattern that is not to be used naively.
Existence of lazy modules in sys.modules would not introduce anything new and would just add one more nuance.

To sum up:

  1. If the module is not in sys.modules, one can safely assume that it is not loaded
  2. If the module is in sys.modules, there is no assurance that it is loaded - extra checks need to be done.

And lazymodule does not break the above, but adds extra case to the (2).


Thank you for this, this is a good point to note.
However, in the grand scheme of the various implications of lazy imports, this turns out to be trivial.


Even in the present state of things, utility module_is_loaded(name) would be of benefit.
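
A minimal sketch of what such a utility could look like. The signature is an assumption: `lazy_types` is a hypothetical parameter through which a lazy-import implementation would supply its proxy classes, and `FakeLazyModule` is only a stand-in for one.

```python
import sys
import types


def module_is_loaded(name, lazy_types=()):
    """True only if sys.modules holds a real, non-lazy module under `name`."""
    module = sys.modules.get(name)
    if module is None:                 # absent, or an explicit None entry
        return False
    # Inspect type(module) rather than its attributes, so that the check
    # itself cannot trigger a lazy proxy.
    return not issubclass(type(module), lazy_types)


class FakeLazyModule(types.ModuleType):
    """Stand-in for a lazy module type; purely illustrative."""


sys.modules["lazy_demo_module"] = FakeLazyModule("lazy_demo_module")
sys.modules["none_demo_module"] = None
```

With such a helper, the `is_dataclass` pattern would ask `module_is_loaded("dataclasses", lazy_types=...)` before touching the module, covering the missing, None, and lazy cases in one place.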

If the entry is None the module has not been imported and the check still works as intended because None is falsy. Anything trying to actually import the module should fail, but that’s not what the check is trying to do.
