Extended syntax to perform error handling for lazy imports

While PEP-810 introduced a neat facility for lazy imports, it explicitly chose to omit any error handling, so code that needs to handle import errors must settle for either eager imports in try-except blocks (and bear the cost of increased load times) or the workaround of lazily importing wrapper modules that handle import errors (and bear the cost of maintaining small boilerplate modules that clutter up the repo).

This proposal aims to address the common use cases of handling import errors. Such handling is needed more for lazy imports than for eager imports, because it is even less realistic to expect every first use of a lazy object to be wrapped in a try-except.

While there is already an ongoing discussion with a similar goal in Optional imports for optional dependencies, it focuses on only one of the common use cases and not on the several other important ones, so I believe a separate discussion is warranted for a more comprehensive solution.

As far as I can see, there are 4 common ways that an import error is handled:

  1. Import a simple drop-in replacement module:
try:
    import regex as re
except ImportError:
    import re
  2. Raise a custom exception:
try:
    import yaml
except ImportError:
    raise RuntimeError("Please install pyyaml.")
  3. Retry a failed import after resolving the issue:
try:
    import xlrd
except ImportError:
    os.system('pip install xlrd')
    import xlrd
  4. Import a mostly compatible replacement module that requires different initialization steps to produce the same set of names and flags (load_yaml and HAS_RUAMEL_YAML in the following example):
try:
    from ruamel.yaml import YAML
    load_yaml = YAML().load
    HAS_RUAMEL_YAML = True
except ImportError:
    from yaml import load as load_yaml
    HAS_RUAMEL_YAML = False

The Proposal

Introduce extended syntax for both eager and lazy imports that specifies the behavior upon an ImportError.

Addressing the use cases listed above, the extended forms below apply to both eager and lazy imports, in both import and from-import forms:

  1. Use [lazy] import <module 1> or <module 2> as <name> to specify a drop-in replacement for a failed import, e.g.:
lazy import regex or re as re
  2. Use [lazy] import <module> or raise <exception> to raise a custom exception on a failed import, e.g.:
lazy import yaml or raise RuntimeError("Please install pyyaml.")
  3. Use try [lazy] import <module> or: <suite> to specify a suite to perform on a failed import:
try lazy import xlrd or:
    os.system('pip install xlrd')
    import xlrd
  4. Use try [lazy] import <module 1> for <names>: <suite 1> to specify a suite to perform on a successful import in order to produce the names specified in the for clause, which, importantly for lazy imports, will become lazy objects that trigger reification on access. An optional clause of or import <module 2>: <suite 2> specifies an alternative import and the initialization steps to run after a previous import failure in order to produce the same names. An optional clause of else: <suite 3> specifies what to do when all the previous import clauses fail. An ImportError should be raised if an import clause succeeds but does not produce all of the names specified in the for clause. For example:
try lazy from ruamel.yaml import YAML for load_yaml, HAS_RUAMEL_YAML:
    load_yaml = YAML().load
    HAS_RUAMEL_YAML = True
or from yaml import load as load_yaml:
    HAS_RUAMEL_YAML = False
else:
    raise RuntimeError('No YAML loader installed.')

assert load_yaml('key: value') == {'key': 'value'}

This is just a rough idea so far, and I’d like to hear some preliminary feedback from the community before I proceed to formalize the syntax and semantics.

Using the lazy keyword for something which isn’t fully lazy (it must, at a minimum, find the module’s spec to determine if the module to import is present) seems like it will cause a lot of confusion.

Also, what is the sudden rush here? The status quo has been fine[1] for many years now. And lazy imports won’t be available until Python 3.15. Can we not wait until users have had at least one release with lazy import syntax available before proposing yet more changes to the import statement syntax?


[1] for everything except lazy imports

How about:

# main.py
lazy: # or defer
    try:
        import regex as re
    except ImportError:
        import re

To be syntactic sugar for:

# re_wrapper.py
try:
    import regex as re
except ImportError:
    import re

# main.py
lazy from re_wrapper import re

The main difference is that there’s no file system operation to access re_wrapper.

It seems like all of this can be done using a MetaPathFinder instead of dedicated syntax, with the further benefit of applying application-wide and covering multiple use cases (e.g. fallback package, custom exception, custom function call).
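For illustration, here is a minimal sketch of what such a finder might look like. The names FallbackFinder, _AliasLoader, fast_json_hypothetical and yaml_hypothetical are made up for this example; only the importlib calls are real. The finder is appended to the end of sys.meta_path, so it is consulted only after the regular finders have failed to locate a module.

```python
import importlib
import importlib.abc
import importlib.util
import sys


class _AliasLoader(importlib.abc.Loader):
    """Hypothetical loader that makes one module name resolve to another module."""

    def __init__(self, target):
        self.target = target

    def exec_module(self, module):
        # Swap the placeholder module for the real fallback in sys.modules;
        # the import system returns whatever sys.modules holds after loading.
        sys.modules[module.__name__] = importlib.import_module(self.target)


class FallbackFinder(importlib.abc.MetaPathFinder):
    """Hypothetical finder: fallback modules and custom errors for missing modules."""

    def __init__(self, fallbacks=None, messages=None):
        self.fallbacks = dict(fallbacks or {})
        self.messages = dict(messages or {})

    def find_spec(self, fullname, path=None, target=None):
        if fullname in self.fallbacks:
            loader = _AliasLoader(self.fallbacks[fullname])
            return importlib.util.spec_from_loader(fullname, loader)
        if fullname in self.messages:
            raise ModuleNotFoundError(self.messages[fullname], name=fullname)
        return None  # not ours; let the import fail normally


# Appended last, so it only sees names that no other finder could resolve.
sys.meta_path.append(FallbackFinder(
    fallbacks={"fast_json_hypothetical": "json"},
    messages={"yaml_hypothetical": "Please install pyyaml."},
))

import fast_json_hypothetical as json_mod  # resolves to the stdlib json module
```

A design note: the configuration lives in one place and applies to every importer in the process, which is both the appeal and the risk of this approach compared with per-statement syntax.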

As these mostly seem to handle cases where the module is not installed, these can be handled by checking that specifically. Currently the best tool I’m aware of for doing so would be importlib.util.find_spec. Your examples would roughly translate to this under lazy imports:

from importlib.util import find_spec

# 1
if find_spec("regex"):
    lazy import regex as re
else:
    lazy import re

# 2
if not find_spec("yaml"):
    raise RuntimeError("Please install pyyaml.")

lazy import yaml

# 3
if not find_spec("xlrd"):
    os.system('pip install xlrd')

lazy import xlrd

# 4
if find_spec("ruamel.yaml"):
    lazy from ruamel.yaml import YAML
    load_yaml = YAML().load
    HAS_RUAMEL_YAML = True
else:
    lazy from yaml import load as load_yaml
    HAS_RUAMEL_YAML = False

To clarify, there isn’t anything in this proposal that’s half-lazy.

All that this proposal does is specify what to do upon a reification. It’s code that’s executed only when a lazy object is reified.

This is because before PEP-810, try-except works well enough to handle all kinds of failed imports, but now with lazy imports, there is no good way to specify how a failed reification should be handled.

This proposal is not about a half-lazy import that checks the spec first. It’s an extension of PEP-810 to describe what happens when reification fails.

This effectively does an eager import as the lazy import is immediately reified.

This is the simple way to do it. However, a new utility function is needed, as find_spec doesn’t quite cut it:

find_spec('scipy.stats')

will eagerly import the scipy package.
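A small sketch that shows the effect without needing scipy installed. It builds a throwaway package in a temporary directory (the name demo_pkg_hypothetical is made up) and checks sys.modules after calling find_spec on a submodule:

```python
import importlib
import importlib.util
import pathlib
import sys
import tempfile

PKG = "demo_pkg_hypothetical"  # made-up package name for this sketch

with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = pathlib.Path(tmp) / PKG
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("LOADED = True\n")
    (pkg_dir / "sub.py").write_text("LOADED = True\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # Looking up a dotted name imports the parent package as a side effect.
        spec = importlib.util.find_spec(f"{PKG}.sub")
        parent_loaded = PKG in sys.modules
        sub_loaded = f"{PKG}.sub" in sys.modules
    finally:
        sys.path.remove(tmp)
```

After this runs, parent_loaded is true while sub_loaded is false: the submodule itself is only located, but its parent package has been executed.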

Ah, true for this case. You’d need to either define a function to wrap it, or more cleanly move this import to another module and lazily import that.


Fair, however I’m not sure the additional complex syntax is an improvement. For your final RUAMEL_YAML example for instance, this has presumably made the entire block lazy in order to not make the same error I had, which is not obvious.


This is true, but I doubt you’d check for the presence of scipy.stats as that should be implied by the presence of scipy. I think checking submodules should only be necessary for namespace packages (such as ruamel.yaml).

I think you’re being too quick to reject wrapper modules, which are the solution I would reach for.

When you combine lazy imports with other language features (the new syntax or existing techniques) you have to adapt your code. It’s not just error handling. Consider this example:

# main.py
lazy import requests

class MySession(requests.Session): ...

Uh-oh! It’s not lazy anymore!

But the easy solution is…

# main.py
lazy from ._session import MySession

# _session.py
import requests

class MySession(requests.Session): ...

Factoring out functionality into modules is already a common task for projects, so existing project layouts already support adding – potentially many of – these shims. To me, it’s not even a workaround: it’s a pragmatic and simple solution.

Speaking as someone who expects to do rather a lot of integration work with the new syntax on existing projects, I’m not eager to see more features and syntactic forms added to this space. I haven’t even gotten to try it out on real codebases yet, nor have most other users. I think working off of practical experiences, e.g. “I had to add this workaround dozens of times”, is a much more productive way to drive a conversation about further enhancing lazy imports.

Then let’s wait until lazy imports have seen some real-life usage, and address the problems that we discover from experience are causing issues for users. At the moment, all we have is theory and speculation, and that’s not a good enough basis to introduce a language feature.

People were complaining that PEP 810 was rushed. A new PEP proposing fixes/enhancements for a language feature that hasn’t even been released yet seems even more rushed, IMO.

Thanks. The idea of embedding a wrapper module is great, as it offers the main benefits of a wrapper module without having to maintain a separate file just for a few lines of boilerplate.

The problem with your specific syntax of a generic lazy block though is that the compiler would have no reliable way to tell which names in the block are to be made lazy objects, since any statement, not just an import statement, is valid there.

The benefit of a dedicated syntax is that it would gain support from static code analyzers, which for example can more easily identify an alternative module and offer appropriate type checks based on the alternative.

But yes, you’re right that this can be done with a custom meta path finder.

Together with the idea of an embedded module inspired by @Nineteendo’s suggestion, I’ve written a proof of concept, EmbeddedModule, a marker class with the following usage:

class any_re(EmbeddedModule): # creates any_re as an "embedded module"
    try:
        import regex as re
    except ImportError:
        import re

lazy from any_re import re

The implementation uses a custom __build_class__ function to store the class’ code object in an EmbeddedModule instance without actually executing it.

Upon import, a custom meta path finder would look up the module name being requested in the importer’s global namespace, and if it is an EmbeddedModule instance, it would then invoke a loader that executes the stored code.

Here’s the implementation:

import sys
import builtins
from importlib.abc import Loader
from importlib.util import spec_from_loader

class EmbeddedModule:
    def __init__(self, code):
        self.code = code

class EmbeddedModuleLoader(Loader):
    def __init__(self, code):
        self.code = code

    def exec_module(self, module):
        exec(self.code, vars(module))

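class EmbeddedModuleFinder:
    def find_spec(self, fullname, path=None, target=None):
        frame = sys._getframe(1)
        # Skip frames of the import machinery (their filenames start with '<').
        # Guard against running out of frames, e.g. under python -c or the REPL.
        while frame is not None and frame.f_code.co_filename.startswith('<'):
            frame = frame.f_back
        if frame is None:
            return None
        # The importer's globals hold the EmbeddedModule marker, if there is one.
        if isinstance((obj := frame.f_globals.get(fullname)), EmbeddedModule):
            return spec_from_loader(fullname, EmbeddedModuleLoader(obj.code))
        return None

sys.meta_path.append(EmbeddedModuleFinder())

def build_class(func, name, *bases, metaclass=type, **kwargs):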
    if EmbeddedModule in bases:
        return EmbeddedModule(func.__code__)
    return orig_build_class(func, name, *bases, metaclass=metaclass, **kwargs)

orig_build_class = builtins.__build_class__
builtins.__build_class__ = build_class

And here’s a working demo (using eager import because Python 3.15 isn’t out yet): link
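For readers who can’t follow the link, here is a self-contained sketch of the same idea with an eager from-import. It repeats the implementation so that it runs on its own, with one variation that is my assumption rather than part of the original: the finder skips only frames whose filename starts with "<frozen importlib", so that it also works when the script itself is run from a pseudo-file such as "<string>". The names any_json_hypothetical and speedy_json_hypothetical are made up. Note that it patches builtins.__build_class__ for the whole process.

```python
import builtins
import sys
from importlib.abc import Loader
from importlib.util import spec_from_loader


class EmbeddedModule:
    def __init__(self, code):
        self.code = code  # code object of the class body, not yet executed


class EmbeddedModuleLoader(Loader):
    def __init__(self, code):
        self.code = code

    def exec_module(self, module):
        exec(self.code, vars(module))  # run the stored body as module code


class EmbeddedModuleFinder:
    def find_spec(self, fullname, path=None, target=None):
        frame = sys._getframe(1)
        # Variation for this sketch: skip only the frozen importlib frames.
        while frame is not None and frame.f_code.co_filename.startswith("<frozen importlib"):
            frame = frame.f_back
        if frame is None:
            return None
        obj = frame.f_globals.get(fullname)
        if isinstance(obj, EmbeddedModule):
            return spec_from_loader(fullname, EmbeddedModuleLoader(obj.code))
        return None


orig_build_class = builtins.__build_class__


def build_class(func, name, *bases, metaclass=type, **kwargs):
    if EmbeddedModule in bases:
        return EmbeddedModule(func.__code__)  # keep the body, do not run it
    return orig_build_class(func, name, *bases, metaclass=metaclass, **kwargs)


builtins.__build_class__ = build_class
sys.meta_path.append(EmbeddedModuleFinder())


class any_json_hypothetical(EmbeddedModule):  # the body is stored, not executed
    try:
        import speedy_json_hypothetical as json  # made-up name, expected to be missing
    except ImportError:
        import json


body_ran_at_definition = "any_json_hypothetical" in sys.modules

# The stored body runs here, when the embedded module is actually imported.
from any_json_hypothetical import json as json_mod
```

With lazy imports, the last line would become lazy from any_json_hypothetical import json as json_mod, and the body would presumably run on first use of json_mod instead.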

Just like a wrapper module, it would all get executed when you import a name. So to replace 2 wrapper modules, you would need 2 blocks.

Not if the name is lazily imported.

The whole point of the proposal and my interim solution is to facilitate specific code to be executed only when a lazily imported name is reified.

Yes, and to me 2 blocks of code is easier to maintain than 2 separate files.

It would be nice if the import statement can be parameterized so I can have just 1 block of code to handle all import resolutions of the same kind, but the idea of parameterized imports is out of the scope of this topic.

+1 on this specific idea. Obviously the interpreter shouldn’t add any new files, but do as if it did.

I think that would be the best way to do it. I’m not too sure about doing it within a lazy: block, but the general idea of how it works seems good.

Maybe some contextmanager-like syntax, like with lazy: would work as wanted.

As I pointed out to @Nineteendo, a generic lazy block won’t quite work because the compiler can’t really figure out statically which names in the block are supposed to be made into lazy objects, objects that are going to trigger reification upon access.

We can blindly assume that all target names in any assignment-like statements in the block are to be exported as lazy objects, but that is going to be both leaky (intermediate variables are going to be leaked as lazy objects) and incomplete (dynamically created names are going to be ignored).

That’s why I think an embedded wrapper module with an explicit lazy import statement may be the best compromise. I already demonstrated how it may be done with a marker class, but if we are to gain support from third-party tools, we may eventually need a dedicated syntax like this:

module any_re:
    try:
        import regex as re
    except ImportError:
        import re

lazy from any_re import re

A context manager can’t stop the block from executing eagerly (two stacked context managers can, however, with the inner one raising an exception on entry and the outer one catching it, which makes the syntax uglier). And because a context manager doesn’t receive the body as a code object, it has to fall back on reparsing the source lines of the body (unreliable, because the source doesn’t always exist) and recompiling them, which incurs more overhead.
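To make the stacked context manager trick concrete, here is a minimal sketch. The class names SkipBody, Outer and Inner are made up for illustration. The inner manager raises on entry, so the body never starts, and the outer manager suppresses that exception in its __exit__:

```python
class SkipBody(Exception):
    """Hypothetical signal raised on entry to skip the with body."""


class Outer:
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Returning True suppresses the exception raised by Inner.__enter__.
        return exc_type is not None and issubclass(exc_type, SkipBody)


class Inner:
    def __enter__(self):
        raise SkipBody  # the body below never starts

    def __exit__(self, exc_type, exc, tb):
        return False


executed = []
with Outer(), Inner():
    executed.append("body")  # skipped entirely
```

The body is skipped, but nothing here has captured it as code, which is why a later deferred execution would still need the source reparsing described above.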

The with lazy would not be a classical context manager, if it could even be called one. You also can’t do with lazy as …. The reason I called it a context manager is that they share some syntax. If the AST could turn that into a kind of lazy section that just defines names globally, then any attempt to access an attribute of one of those names would reify the import.

with lazy:
    try:
        lazy import x as y
    except ImportError:
        lazy import y

# For now, we only have globals= {..., "y": LazyModuleType(name="y", order=("x", "y")), ...}
y.attr # Execute the block, trying to find 'x', if that doesn't work, 'y'.

Basically the block doesn’t execute anything at all, except to reserve all names that we import (or import something as), unless they are redefined.

Normal rules about deletion, reassignment and so on obviously still apply here, y is just an object in the global namespace.

How is this information created?

The fact is that your approach is bogus, as Ben already pointed out:

That’s the key part that you have just completely ignored here.

On second thought, the name of the embedded module, and even the embedded module object itself, are redundant, if we have a dedicated syntax.

To improve upon @Nineteendo’s syntax by making exported names explicit, we can do something like:

lazy import re from:
    try:
        import regex as re
    except ImportError:
        import re

This frenzy with lazy imports is exactly what PEP 810 tries to avoid. Not every import needs to be lazy. re/regex is not actually a good example to demonstrate lazy imports, as re is so pervasively used by other standard library modules. The same goes for arrow / datetime.

Specifically with re / regex / pcre2, the point of using alternative implementations is to use better regular expression syntax. You want to know which module will be used so that you can make the choice early:

try:
    import pcre2 as re
    pattern = R"PCRE syntax, yay"
except ImportError:
    try:
        import regex as re
        pattern = R"PyPI regex syntax"
    except ImportError:
        import re
        pattern = R"Least performant builtin re syntax"

def foo(bar):
    # I don't have to worry about which RE engine uses which pattern
    m = re.match(pattern, bar)
    if m is not None:
        ...

The use of if find_spec("regex"): would separate the presence checking from the actual loading (at which point you need to check for its presence again) and the general consensus of the programming community for the past several decades has been to advise against such practice. (What if the pcre2 / regex module is installed but is actually incompatible, say wrong architecture or free-threading, thus would raise ImportError when actually used?)
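A quick sketch of that failure mode, using a throwaway module that stands in for an installed but incompatible package (the name broken_ext_hypothetical and the error message are made up). The presence check succeeds, yet the actual import still raises ImportError:

```python
import importlib
import importlib.util
import pathlib
import sys
import tempfile

NAME = "broken_ext_hypothetical"  # made-up module name for this sketch

with tempfile.TemporaryDirectory() as tmp:
    # The module exists on disk, but importing it fails.
    (pathlib.Path(tmp) / f"{NAME}.py").write_text(
        "raise ImportError('built for a different interpreter')\n"
    )
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        found = importlib.util.find_spec(NAME) is not None
        try:
            importlib.import_module(NAME)
            import_ok = True
        except ImportError:
            import_ok = False
    finally:
        sys.path.remove(tmp)
```

Here found is true and import_ok is false, so code that branches on find_spec alone would pick the broken module and only fail later, at the point of use.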

PEP 810 lazy import is mainly for cases where you expect the “happy path” to be taken, i.e. you expect the module to be generally available, just that you want to defer actually importing it for startup time reasons.

As soon as we move beyond that motivation we’re not talking about lazy imports, but a general lazy execution mechanism for global identifiers:

# Whenever the global identifier "re" or "pattern" is first used,
# the "lazy re, pattern" suite is executed.
lazy re, pattern:    # All lazy global identifiers
    try:
        import pcre2 as re
        pattern = R"PCRE syntax, yay"
    except ImportError:
        try:
            import regex as re
            pattern = R"PyPI regex syntax"
        except ImportError:
            import re
            pattern = R"Least performant builtin re syntax"

def foo(bar):
    # I don't have to worry about which RE engine uses which pattern
    m = re.match(pattern, bar)
    if m is not None:
        ...

This behaves like:

def __execute_lazy__():
    global re, pattern
    try:
        import pcre2 as re
        pattern = R"PCRE syntax, yay"
    except ImportError:
        try:
            import regex as re
            pattern = R"PyPI regex syntax"
        except ImportError:
            import re
            pattern = R"Least performant builtin re syntax"
    globals()["__execute_lazy__"] = lambda: None


def foo(bar):
    __execute_lazy__()
    m = re.match(pattern, bar)
    if m is not None:
        ...

and has nothing to do with the lazy package import mechanism whatsoever. I won’t discuss this further as the above example shows that implementing this behavior simply doesn’t require any new syntax.
