PEP 810: Explicit lazy imports

No, because the default value is shown in the --help output:

I would like to see the work that has been done.

Specific repository forks I can check out and run with the forked interpreter.

1 Like

Well, function parameter defaults are always evaluated before the function is called (at definition time). That means this wouldn’t work for any kind of lazy import system.
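A minimal sketch of that definition-time evaluation (plain current Python, nothing PEP-specific):

    import time

    def f(stamp=time.time()):   # the default expression runs once, when `def` executes
        return stamp

    first = f()
    time.sleep(1)
    second = f()
    assert first == second      # both calls see the value captured at definition time

So a default-argument trick can’t defer anything: the default expression runs when the function is defined, not when the function is first called.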

That was exactly my point.

Lazy imports are easy to understand… but then adapting an existing codebase to actually benefit from lazy imports often turns out to be quite an involved task.

Other ecosystems took other approaches:

  • tree shaking and “compile-time” module splitting in JS (webpack/rollup/esbuild)
  • compiled languages with -ffunction-sections -fdata-sections
  • Rust, where feature flags remove entire subgraphs
  • Android/.NET/iOS have their own hacks for fast app startup / delayed loading

I think there’s quite a bit to learn from these.

Or work to make Python “faster” overall, e.g. by spreading module parsing and evaluation across hardware cores, with the existing import syntax providing clear and deterministic granularity for doing so.

Well, I have thought about splitting tasks across cores to make Python faster before, but even if we did everything we could, Python would still be interpreted, and therefore ‘slow’. There will always be limits (and new possibilities too).

If someone really wants to make, say, a fast CLI, they would know when to use Python (or rather: when not to use it).

There are many ways to make code fast, and there are a billion possibilities for optimizations of Python.

This lazy import syntax is just the tip of an iceberg, and the basis for future work that might make imports faster. I don’t think it’s possible to address everything in one PEP, but I’d love to see your ideas in a PEP once this one has been finished.

1 Like

Why is it not sufficient to look at the listed projects that have rolled their own implementation of lazy imports? You could peruse their history and even review the discussions of those changes as they happened. You are asking for a lot of work to be done to satisfy your curiosity.

7 Likes

What does “calling in a thread-safe manner” mean? Do you just mean “it’s guaranteed not to be called from multiple threads”, or “concurrently from multiple threads”, or “concurrently from multiple threads for the same module”? It’s not, for any of these. There’s no locking. It’s just a function that’s called as part of the import mechanism, just like many other functions. An implementation that does thread-unsafe things has to perform the relevant synchronisation itself. I’m not sure it’s ever going to matter, really. I mean, the simple case is a dict or set lookup, which is not a problem. If you need to do a lookup in a container that doesn’t do thread-safe lookups, or if you need to do complex shared state manipulations in the filter function, you’ll have to lock accordingly. None of this is specific to this proposal, and I think trying to spell it out every time sets a bad precedent and a bad example.
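To make the “synchronise it yourself” point concrete, here is a sketch of a filter that mutates shared state and therefore brings its own lock. The callback signature and the way it gets registered are assumptions for illustration only; see the PEP for the actual API.

    import threading

    _seen = set()                      # mutable shared state touched by the filter
    _seen_lock = threading.Lock()

    def lazy_filter(importer, name, fromlist):
        # Hypothetical signature: importing module, imported module name, fromlist.
        # The filter may be called from several threads at once, so the mutation
        # of _seen is guarded by our own lock; the import machinery won't do it.
        with _seen_lock:
            first_time = name not in _seen
            _seen.add(name)
        if first_time:
            print(f"first potentially-lazy import of {name}")
        return True                    # let the import be lazy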

… but you’re okay with the answer?! :joy: (FWIW, it’s a meme Pablo likes.)

The mechanism isn’t being advertised as the way to reify all values in a module; it’s just something that’s done to keep existing code behaving as expected. Perhaps we should have an explicit way to reify everything in a module, but it hasn’t really come up yet.

I think __lazy_imports__ is too close to __lazy_import__, and also doesn’t convey that it lists modules.

(Replying to this one as a stand-in for the suggestion to allow the lazy imports filter to bypass the global disable.)

I think no, the library not working should be fine. The global disable isn’t meant to be a general-purpose tool with common use. It’s there for the convenience of people who want to test their own modules with lazy imports disabled. I also think it’s fine for that disable not to be an option for libraries that don’t want it to be an option, and I don’t think it’s a good idea to have a complex hierarchy of enables/disables like a game of top trumps. We may end up needing something like this, but the exact shape is unclear until we have real-world experience. I’ve added it to rejected ideas for now (with a slightly expanded rationale) in PEP 810: Update rejected ideas. by Yhg1s · Pull Request #4634 · python/peps · GitHub.

3 Likes

There are a couple of comments about this issue. It is the latter; our intent is that __import__ resolution happens when reification occurs. We don’t capture the state of the import machinery at the lazy import statement. The PEP says:

Reification imports the module in the same way as it would have been if it had been imported eagerly, barring intervening changes to the import system (e.g. to sys.path, sys.meta_path, sys.path_hooks or __import__).

Do you think it would help to clarify this in the PEP?
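Concretely, the intended behavior is along these lines (a sketch using the PEP’s proposed syntax; the path and module name are made up):

    import sys

    lazy import my_plugin                # nothing is resolved or imported yet

    sys.path.insert(0, "/opt/plugins")   # the import system changes afterwards

    my_plugin.setup()                    # first use: reification runs the normal
                                         # import machinery *now*, so the updated
                                         # sys.path (and any new hooks) are used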

7 Likes

It’s a little confusing because it’s saying “the import will be the same…except for all the ways it can be different at that point”. I’m not sure what remains after the cases listed in the parenthetical statement.

It might be clearer to switch it around to “lazy modules are imported using the import system at the time of reification” and if there are any exceptions to that statement list what they are.

9 Likes

I agree this is the best way to explain it.

There’s enough mutable state in the import system that I think the only two viable lazy loading options are “look up the module eagerly without executing it” (the LazyLoader approach, which the PEP has good reason not to use), and “don’t eagerly look up anything, not even the value of the __import__ hook” (which is the PEP’s approach, albeit not expressed entirely clearly yet).

3 Likes

Is it though? The import system doesn’t get invoked until reification, but the filter function has to get called at the point of the lazy import statement (i.e. IMPORT_NAME, IMPORT_FROM), to determine if the import is lazy or eager, right?

Ah, that’s :banana: :banana:!

But yes!

3 Likes

The filter function is not protected by any locking, but it won’t crash if it is called concurrently or something like that. If two threads execute potentially-lazy imports of the same module concurrently, the filter will be called concurrently from both threads. After the filter returns, either lazy objects are created (no synchronization needed) or we enter the standard import machinery (which has its own locking). If stronger guarantees are needed because the specific implementation handles some shared resource, then it must handle its own thread-safety. For read-only operations (like checking against a frozenset), no special care is needed.
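For the common read-only case, something like this sketch needs no extra locking at all (again, the callback signature is an assumption for illustration; check the PEP for the exact API):

    LAZY_ALLOWED = frozenset({"numpy", "pandas", "matplotlib"})   # example names

    def lazy_filter(importer, name, fromlist):
        # A pure membership test against an immutable set: safe to call from any
        # number of threads concurrently, no additional synchronisation needed.
        return name in LAZY_ALLOWED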

3 Likes

That’s totally reasonable semantics. I think the PEP should add some language to clarify that.

2 Likes

I don’t mind, but I think @thomas has a point that this is kind of implied, and that if we start to have to write it everywhere (not only in this PEP but in other places) it may be a slippery slope. I will let you two discuss offline :wink:

2 Likes

Some time ago, @ctismer worked on implementing the ideas from PEP 690 for the PySide module (details here), which ended up showing a very nice improvement in the startup time of PySide applications of about 10-20%. In frameworks where tons of things are initialized at import time, I believe this would be a great addition to the language.

Having an explicit keyword to make this an opt-in is IMHO the best approach for this new functionality, because I know a few things can get a bit messy if this were enforced everywhere.

5 Likes

IDEs and debuggers should be prepared to display lazy proxies before first use and the real objects thereafter.

The PEP doesn’t document the proxy objects at all. So how are tools like debuggers to know they are working with a proxy object? Will the type be exposed in types? Will there be any other exposed data that tools like debuggers can show to the user, or will it just be in the repr? I personally only care about the former; the latter is more of a question than a request.

1 Like

Yes! Either in types or importlib. Any preference?

How would you have a lazy import that was triggered away from the import statement go back and have the exception raised in the try block? It’s very “spooky action at a distance” semantics that is already tricky enough with lazy imports (at least the PEP has the exception trigger where the code causes the import). Toss in exceptions suddenly coming from some other line of code and not where the code is executing, and it seems like asking for trouble. Plus you would have to save the whole stack at the time of the lazy import for that to work, and it just becomes a mess and way too complicated to tackle.

As well, for your import failure case, you would want it to find the module first and then, if it exists, go on to be lazy. That’s how LazyLoader does it already, so you can get that today if you want those semantics. Otherwise you could use importlib.util.find_spec() to see if the module can be found, and if it can’t, do something else.
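For reference, the LazyLoader route is essentially the recipe from the importlib documentation: the module is found eagerly (so a missing module fails right away), but its execution is deferred until first attribute access. The find_spec() fallback below is just a sketch with a made-up module name.

    import importlib.util
    import sys

    def lazy_import(name):
        """Find `name` eagerly, but defer executing it until first attribute access."""
        spec = importlib.util.find_spec(name)
        if spec is None:
            raise ImportError(f"module {name!r} not found")   # fails here, not at first use
        spec.loader = importlib.util.LazyLoader(spec.loader)
        module = importlib.util.module_from_spec(spec)
        sys.modules[name] = module
        spec.loader.exec_module(module)
        return module

    # Or just probe for the module up front and pick a fallback:
    if importlib.util.find_spec("some_optional_dep") is None:   # hypothetical name
        some_optional_dep = None        # handle the missing dependency however you like
    else:
        some_optional_dep = lazy_import("some_optional_dep")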

I realize this all isn’t as convenient as magically having lazy import work in a try statement, but I’m afraid this is an instance where we can’t have everything be as simple as with eager imports.

Have you done an analysis to back that statement up? For instance, I see most imports import modules, so it really depends on what codebases you’re exposed to. As for side-effects, technically every import is a side-effect since it’s an exec() call on the source (for the default, pure Python case).

But ignoring all of that, making eager the opt-in is a bit hostile towards beginners, for whom reasoning about lazy imports is harder than reasoning about eager imports. In that spirit, I don’t think asking us to type 5 more characters when the perf matters is a big deal.

5 Likes

Nope, just somewhere. I don’t think importlib has any import-specific types that are this far-reaching, so probably types makes the most sense?

3 Likes