Add a "__find__" method to provide suggestions for ModuleNotFoundError

Suggestions for ModuleNotFoundError now need only the last piece of the puzzle: a method for finding modules.

The change is based on the proposal below:

Abstract

The current exception message “No module named ‘xxx.yyy.zzz’” is not very helpful. It reads as if the whole module name were wrong, so beginners may start checking the entire path. In fact, when this message appears, it only means that “zzz” does not exist under “xxx.yyy”.

We can build the message differently:

#importlib/_bootstrap.py
_ERR_MSG_PREFIX = 'No module named '
_CHILD_ERR_MSG = 'module {!r} has no child module {!r}'

def _find_and_load_unlocked(name, import_):
    path = None
    parent, _, child = name.rpartition('.')
    parent_spec = None
    if parent:
        if parent not in sys.modules:
            _call_with_frames_removed(import_, parent)
        # Crazy side-effects!
        module = sys.modules.get(name)
        if module is not None:
            return module
        parent_module = sys.modules[parent]
        try:
            path = parent_module.__path__
        except AttributeError:
            #change
            msg = _CHILD_ERR_MSG.format(parent, child) + f'; {parent!r} is not a package'
            #end
            raise ModuleNotFoundError(msg, name=name) from None
        parent_spec = parent_module.__spec__
        if getattr(parent_spec, '_initializing', False):
            _call_with_frames_removed(import_, parent)
        # Crazy side-effects (again)!
        module = sys.modules.get(name)
        if module is not None:
            return module
    spec = _find_spec(name, path)
    if spec is None:
        #change
        if not parent:
            msg = f'{_ERR_MSG_PREFIX}{name!r}'
        else:
            msg = _CHILD_ERR_MSG.format(parent, child)
        #end
        raise ModuleNotFoundError(msg, name=name)
    else:
        if parent_spec:
            # Temporarily add child we are currently importing to parent's
            # _uninitialized_submodules for circular import tracking.
            parent_spec._uninitialized_submodules.append(child)
        try:
            module = _load_unlocked(spec)
        finally:
            if parent_spec:
                parent_spec._uninitialized_submodules.pop()
    if parent:
        # Set the module as an attribute on its parent.
        parent_module = sys.modules[parent]
        try:
            setattr(parent_module, child, module)
        except AttributeError:
            msg = f"Cannot set an attribute on {parent!r} for child module {child!r}"
            _warnings.warn(msg, ImportWarning)
    return module

If the message reads “module ‘xxx.yyy’ has no child module ‘zzz’”, everyone can see that the problem is with “zzz”.
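To see the two message forms side by side, here is a minimal standalone sketch. The helper name build_not_found_message is hypothetical and exists only for this illustration; the two constants are copied from the patch above.

```python
_ERR_MSG_PREFIX = 'No module named '
_CHILD_ERR_MSG = 'module {!r} has no child module {!r}'


def build_not_found_message(name):
    """Return the proposed message for a module name that could not be found."""
    parent, _, child = name.rpartition('.')
    if not parent:
        # Top-level module: keep the familiar wording.
        return f'{_ERR_MSG_PREFIX}{name!r}'
    # Submodule: name the parent and the missing child separately.
    return _CHILD_ERR_MSG.format(parent, child)


print(build_not_found_message('xxx'))          # No module named 'xxx'
print(build_not_found_message('xxx.yyy.zzz'))  # module 'xxx.yyy' has no child module 'zzz'
```

The sketch leaves out the “is not a package” case, which the patch handles separately when the parent has no __path__.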

Code explanation

To find the nearest name, we take the final child name and compare it with all modules under the parent (or with all top-level modules when there is no parent). To list those modules without importing them, the code below is enough for the default path-based finder:

import os
import sys
from importlib import machinery

def scan_dir(path):
    """
    Return the names of all modules directly under `path`, without importing them.
    Included:
      - .py files
      - directories containing "__init__.py"
      - extension modules (.pyd/.so) whose suffix matches the running ABI
    """
    if not os.path.isdir(path):
        return []

    suffixes = machinery.EXTENSION_SUFFIXES
    result = []

    for name in os.listdir(path):
        full_path = os.path.join(path, name)

        # .py file
        if name.endswith(".py") and os.path.isfile(full_path):
            modname = name[:-3]
            if modname.isidentifier():
                result.append(modname)

        # directory with "__init__.py"
        elif os.path.isdir(full_path):
            init_file = os.path.join(full_path, "__init__.py")
            if os.path.isfile(init_file) and name.isidentifier():
                result.append(name)

        # the .pyd/so file that has right ABI
        elif os.path.isfile(full_path):
            for suf in suffixes:
                if name.endswith(suf):
                    modname = name[:-len(suf)]
                    if modname.isidentifier():
                        result.append(modname)
                    break

    return sorted(result)

The function finds every module under the given path without importing any of them.
In the normal case, we scan every entry in sys.path to collect the top-level modules. That list of names plus sys.builtin_module_names gives all top-level module names.
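For comparison, the same "list top-level names without importing" idea can be sketched with the standard library's pkgutil.iter_modules. This is a stand-in and not the scan_dir function from the proposal, so edge cases may differ (for example, it also asks importers for zip archives). The function name top_level_module_names is hypothetical.

```python
import pkgutil
import sys


def top_level_module_names(paths=None):
    """Collect top-level module names without importing any of them.

    paths=None lets pkgutil.iter_modules fall back to sys.path.
    """
    # Built-in modules have no file on disk, so add them explicitly.
    names = set(sys.builtin_module_names)
    for info in pkgutil.iter_modules(paths):
        # Skip file names that could never be written in an import statement.
        if info.name.isidentifier():
            names.add(info.name)
    return sorted(names)
```

Calling top_level_module_names() with no argument covers the "scan all of sys.path" case described above; passing a list of directories restricts the scan to those.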

As far as I know, “os” is the only such module with a submodule, and after “import os” we can already use “os.path”, so it will not be suggested.

For the other top-level modules, we first locate the module, check whether it is a folder, and then scan the modules below it:

def find_in_path(name):
    if not name:
        return []
    if name in sys.modules:  # The parent module has already been imported.
        if not hasattr(sys.modules[name], '__path__'):
            return []
        return sum([scan_dir(i) for i in sys.modules[name].__path__], [])

    name_list = name.split(".")
    for i in sys.path:
        list_d = scan_dir(i)
        if name_list[0] in list_d:
            path = i
            break
    else:
        return []
    for j in name_list:
        path = os.path.join(path, j)
    if not os.path.isdir(path):
        return []
    if not os.path.exists(os.path.join(path, "__init__.py")):
        return []
    return scan_dir(path)

Because all modules frozen into “python.exe” at build time are standard-library modules, they do not need to be suggested a second time.

“Do you have this module?” and “What modules do you have?” are different questions, so finders need a new method. For now it is named “__find__” to reduce the chance of a name conflict. The method is optional; its only effect is to let a custom import hook contribute suggestions.

The method is specified as follows:

def __find__(self, name: str=None) -> list[str]:
    """
    Return the names of all submodules under the given module name, without importing them.
    If name is empty or None, return all top-level modules.
    """
    return []

If the name “__find__” is not suitable, the replacement should be a name that hardly appears in currently active projects.
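As an illustration of how a custom hook might implement the method, here is a minimal sketch of a meta path finder that serves modules from a dictionary. The class InMemoryFinder and its internals are hypothetical; only find_spec, exec_module and ModuleSpec are existing importlib API, and “__find__” is the method proposed in this thread.

```python
import importlib.abc
import importlib.machinery


class InMemoryFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Hypothetical finder that serves modules from a dict of source strings."""

    def __init__(self, sources):
        # Maps a fully qualified module name to its source code.
        self._sources = dict(sources)

    def _is_package(self, fullname):
        # A name counts as a package if any other key lives below it.
        prefix = fullname + '.'
        return any(key.startswith(prefix) for key in self._sources)

    def find_spec(self, fullname, path=None, target=None):
        # Answers "Do you have this module?"
        if fullname not in self._sources:
            return None
        return importlib.machinery.ModuleSpec(
            fullname, self, is_package=self._is_package(fullname))

    def exec_module(self, module):
        exec(self._sources[module.__name__], module.__dict__)

    def __find__(self, name=None):
        # Answers "What modules do you have?" (the proposed, optional hook).
        prefix = f'{name}.' if name else ''
        children = set()
        for key in self._sources:
            if key.startswith(prefix):
                rest = key[len(prefix):]
                # Keep direct children only, not grandchildren.
                if rest and '.' not in rest:
                    children.add(rest)
        return sorted(children)
```

With sources for “demo_pkg”, “demo_pkg.alpha” and “demo_pkg.beta”, calling __find__() would list the top-level name and __find__('demo_pkg') would list the two children, all without executing any module code.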

Final result

Putting the pieces together, the change to “traceback.py” is:

#TracebackException.__init__
        ...
        #elif exc_type and issubclass(exc_type, ImportError) and \ ...
        elif exc_type and issubclass(exc_type, ModuleNotFoundError) and \
                getattr(exc_value, "name", None) and \
                "None in sys.modules" not in exc_value.msg:
            wrong_name = getattr(exc_value, "name", None)
            parent, _, child = wrong_name.rpartition('.')
            top = wrong_name.partition('.')[0]
            suggestion = _compute_suggestion_error(exc_value, exc_traceback, wrong_name)
            if suggestion == child:
                self._str += ", but it appears in the final result from '__find__'. Is your code wrong?"
            elif suggestion:
                self._str += f". Did you mean: '{suggestion}'?"
            # The code below already exists in 3.15
            if sys.flags.no_site and not parent and top not in sys.stdlib_module_names:
                if not self._str.endswith('?'):
                    self._str += "."
                self._str += (" Site initialization is disabled, did you forget to "
                + "add the site-packages directory to sys.path?")
         ...
def _compute_suggestion_error(exc_value, tb, wrong_name):
    if wrong_name is None or not isinstance(wrong_name, str):
        return None
    if isinstance(exc_value, AttributeError):
        obj = exc_value.obj
        try:
            try:
                d = dir(obj)
            except TypeError:  # Attributes are unsortable, e.g. int and str
                d = list(obj.__class__.__dict__.keys()) + list(obj.__dict__.keys())
            d = sorted([x for x in d if isinstance(x, str)])
            hide_underscored = (wrong_name[:1] != '_')
            if hide_underscored and tb is not None:
                while tb.tb_next is not None:
                    tb = tb.tb_next
                frame = tb.tb_frame
                if 'self' in frame.f_locals and frame.f_locals['self'] is obj:
                    hide_underscored = False
            if hide_underscored:
                d = [x for x in d if x[:1] != '_']
        except Exception:
            return None
    elif isinstance(exc_value, ImportError):
        if isinstance(exc_value, ModuleNotFoundError):
            return _handle_module(exc_value)
        try:
            mod = __import__(exc_value.name)
            try:
                d = dir(mod)
            except TypeError:  # Attributes are unsortable, e.g. int and str
                d = list(mod.__dict__.keys())
            d = sorted([x for x in d if isinstance(x, str)])
            if wrong_name[:1] != '_':
                d = [x for x in d if x[:1] != '_']
        except Exception:
            return None
    else:
        assert isinstance(exc_value, NameError)
        # find most recent frame
        if tb is None:
            return None
        while tb.tb_next is not None:
            tb = tb.tb_next
        frame = tb.tb_frame
        d = (
            list(frame.f_locals)
            + list(frame.f_globals)
            + list(frame.f_builtins)
        )
        d = [x for x in d if isinstance(x, str)]

        # Check first if we are in a method and the instance
        # has the wrong name as attribute
        if 'self' in frame.f_locals:
            self = frame.f_locals['self']
            try:
                has_wrong_name = hasattr(self, wrong_name)
            except Exception:
                has_wrong_name = False
            if has_wrong_name:
                return f"self.{wrong_name}"

    suggestion = _calculate_closed_name(wrong_name, d)

    # If no direct attribute match found, check for nested attributes
    if not suggestion and isinstance(exc_value, AttributeError):
        with suppress(Exception):
            nested_suggestion = _check_for_nested_attribute(exc_value.obj, wrong_name, d)
            if nested_suggestion:
                return nested_suggestion

    return suggestion

def _calculate_closed_name(wrong_name, d):  # extracted into an independent function
    try:
        import _suggestions
    except ImportError:
        pass
    else:
        suggestion = _suggestions._generate_suggestions(d, wrong_name)
        if suggestion:
            return suggestion

    # Compute closest match

    if len(d) > _MAX_CANDIDATE_ITEMS:
        return None
    wrong_name_len = len(wrong_name)
    if wrong_name_len > _MAX_STRING_SIZE:
        return None
    best_distance = wrong_name_len
    suggestion = None
    for possible_name in d:
        if possible_name == wrong_name:
            # A missing attribute is "found". Don't suggest it (see GH-88821).
            continue
        # No more than 1/3 of the involved characters should need changed.
        max_distance = (len(possible_name) + wrong_name_len + 3) * _MOVE_COST // 6
        # Don't take matches we've already beaten.
        max_distance = min(max_distance, best_distance - 1)
        current_distance = _levenshtein_distance(wrong_name, possible_name, max_distance)
        if current_distance > max_distance:
            continue
        if not suggestion or current_distance < best_distance:
            suggestion = possible_name
            best_distance = current_distance

    return suggestion

def _handle_module(exc_value):
    if not isinstance(exc_value, ModuleNotFoundError):
        return
    all_result = []
    parent, _, child = exc_value.name.rpartition('.')
    if len(child) > _MAX_STRING_SIZE:
        return
    suggest_list = []
    for i in sys.meta_path:
        func = getattr(i, '__find__', None)
        if callable(func):
            try:
                list_d = func(parent)
                if list_d:
                    suggest_list.append(list_d)
            except Exception:  # a broken hook must not break error reporting
                pass
    if not parent:
        for paths in sys.path:
            suggestion_d = scan_dir(paths)
            if suggestion_d:
                suggest_list.append(suggestion_d)
    else:
        suggest_list.append(find_in_path(parent))
    for i in suggest_list:
        if child in i:
            return child
        result = _calculate_closed_name(child, i)
        if result:
            all_result.append(result)
    return _calculate_closed_name(child, sorted(all_result))
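The core loop of _handle_module can be tried in isolation with the sketch below. It takes the finders as a parameter instead of reading sys.meta_path, and it uses difflib.get_close_matches as a stand-in for _calculate_closed_name, so the ranking is not the one the proposal uses. The function name suggest_child is hypothetical.

```python
import difflib


def suggest_child(parent, child, finders):
    """Ask each finder's optional __find__ hook for candidates, then pick the closest."""
    candidates = []
    for finder in finders:
        hook = getattr(finder, '__find__', None)
        if not callable(hook):
            continue  # The hook is optional; finders without it are skipped.
        try:
            candidates.extend(hook(parent))
        except Exception:
            continue  # A broken hook must not break error reporting.
    if child in candidates:
        # The module is listed, so the import failed for some other reason.
        return child
    # Stand-in for the traceback module's own distance computation.
    matches = difflib.get_close_matches(child, candidates, n=1)
    return matches[0] if matches else None
```

Returning the child itself when it is listed mirrors the “suggestion == child” branch in TracebackException.__init__, which reports that the name was found by “__find__” even though the import failed.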

Q&A

  • Does it need a cache to speed it up?
    No. In current testing it is fast in normal situations. If it becomes costly, the user has probably installed many packages, added too many entries to sys.path, or written a faulty “__find__” method. Besides, helpfulness matters more than speed here.

  • Does it duplicate what IDEs do?
    No. IDEs such as PyCharm and VS Code can suggest module names as you type letter by letter, but not when code is pasted in, and suggestions made at runtime are better than those from static analysis. The question is like asking “We already have map apps, so why build road signs?”
    If this needs discussing: IDEs can already give suggestions for syntax, names and attributes, so why does Python suggest those as well?

  • Does it affect much existing code?
    No. In the Python test suite it only affects the test for “import unittest.asdfsdsadas”, and that is expected. Most code is unaffected, even when a custom hook has no “__find__” method.

I think this belongs in Ideas.


The monkey patch is now ready. It lives in the module “friendly_module_not_found_error”. It changes not only the traceback but also the original error message raised by import, to help people understand the message better.

It now supports python -m <module>.

It does not yet check whether “__main__.py” exists under the package, but I think it eventually can.

Yesterday I extended the module to support 3.7+. It now works in several scenarios:

PS C:\path>python
python 3.x.x (tags/v3.x.x:xxxxxxx, xxx xx xxxx, xx:xx:xx) [MSC v.1944 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import abs
Traceback (most recent call last):
  File "<python-input-0>", line 1, in <module>
    import abs
ModuleNotFoundError: No module named 'abs'. Did you mean: 'abc'?
>>> exit
PS C:\path>python -m test.test_api
C:\path\to\python.exe: module 'test' has no child module 'test_api'. Did you mean: 'test_capi'?
PS C:\path>python -m asd
C:\path\to\python.exe: No module named 'asd'. Did you mean: 'ast'?