A hook to allow adding notes to ModuleNotFoundError

In Pyodide we’ve unvendored some standard library modules to reduce download size and removed others because they cannot work in the browser. We have extra information about why a module is missing, and we would like to add it to the ModuleNotFoundError that is raised. Currently we add an extra MetaPathFinder to sys.meta_path that raises a custom ModuleNotFoundError, but this is not correct. It would be nice to be able to set a hook, something like:

def annotate_module_not_found_error(name, error):
    if name in REMOVED_STDLIB_MODULES:
        error.add_note(f"The module '{name}' has been removed from the standard library due to browser limitations")

importlib.add_module_not_found_error_hook(annotate_module_not_found_error)

import turtle 

which should raise an error with our extra annotation:

ModuleNotFoundError: No module named 'turtle'
The module 'turtle' has been removed from the standard library due to browser limitations

Of course, we can just patch importlib to add this feature, so we don’t need it upstream, but I thought it was an interesting idea.


See also:
[[WASM] Unvendoring some of stdlib modules]

ryanking13 pointed out a related PEP: PEP 534 – Improved Errors for Missing Standard Library Modules (peps.python.org).

This is an interesting idea which would likely be useful to other Python distributions that customize the list of included modules. If it is not supported, an alternative might be to use friendly-traceback as the default sys.excepthook. I would definitely provide support in friendly-traceback for customizing this particular error message.
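
To make the excepthook alternative concrete, here is a minimal sketch using only the standard library. It does not use friendly-traceback's API, and REMOVED_STDLIB_MODULES and explain_missing_module are hypothetical names.

```python
import sys
import traceback

# Hypothetical set of modules the distribution has removed.
REMOVED_STDLIB_MODULES = {"removed_module_example"}


def explain_missing_module(exc_type, exc, tb):
    # Print the normal traceback first, then append a hint if we have one.
    traceback.print_exception(exc_type, exc, tb)
    if isinstance(exc, ModuleNotFoundError) and exc.name in REMOVED_STDLIB_MODULES:
        print(
            f"The module {exc.name!r} has been removed from the standard "
            "library due to browser limitations",
            file=sys.stderr,
        )


# Only affects exceptions that reach the top level uncaught.
sys.excepthook = explain_missing_module
```

This only helps when sys.excepthook is actually called, and the exception object itself is left unchanged.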

In Pyodide’s case, we never call sys.excepthook. Instead, if a Python exception reaches the top of the Python call stack, we translate it into a JavaScript PythonError and throw it into JavaScript so that JavaScript code has an opportunity to catch and handle it. So in our use case, I would prefer to modify the exception where it is generated. This is probably fairly specific to Pyodide, though.

Who adds the source lines to the traceback? Python or JS?
If it’s JS, my idea of fetching source lines on demand suddenly becomes doable, since you can easily await fetch() the source before displaying the traceback.

who adds the source lines to the traceback? Python or JS?

Well this is off topic… But here is an answer.

The source lines are added when the traceback is formatted, either by _Py_DisplaySourceLine (if we use the C exception formatting APIs) or by FrameSummary.line (if we use the traceback module, which is implemented in Python). Pyodide calls the traceback module to format exceptions in its error handling code, but it also stores the exception in sys.last_value. So it should be possible to walk the traceback, check for missing sources, download them, invalidate any relevant caches, format a new traceback, and display that if you want to.
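
A rough sketch of that walk-and-reformat idea, using only the traceback and linecache modules. The fetch_source callback is a hypothetical stand-in for whatever actually downloads the file. Writing into linecache.cache directly is a shortcut that relies on its (size, mtime, lines, fullname) entry layout, so treat this as an illustration rather than a recommended implementation.

```python
import linecache
import traceback


def files_missing_source(exc):
    """Return the filenames in exc's traceback whose source is unavailable."""
    missing = []
    tb = exc.__traceback__
    while tb is not None:
        filename = tb.tb_frame.f_code.co_filename
        has_source = linecache.getlines(filename, tb.tb_frame.f_globals)
        if not has_source and filename not in missing:
            missing.append(filename)
        tb = tb.tb_next
    return missing


def format_with_fetched_sources(exc, fetch_source):
    """Format exc after filling in missing sources via fetch_source.

    fetch_source(filename) is a hypothetical callback returning the source
    text, or None if the source cannot be obtained.
    """
    for filename in files_missing_source(exc):
        text = fetch_source(filename)
        if text is None:
            continue
        lines = text.splitlines(keepends=True)
        # mtime=None stops linecache.checkcache() from evicting the entry
        # just because the file does not exist on disk.
        linecache.cache[filename] = (len(text), None, lines, filename)
    return "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
```

In Pyodide the exception to pass in could be the one saved in sys.last_value, and the download would have to complete before the formatting step.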

It might be a good idea to start lazily loading source files once the rest of the VM is initialized, so that startup time is reduced but there isn’t an awkward, really long pause before an error message is displayed. It depends on what the application does, though.