Creating aliases in sys.modules for deprecated modules

TL;DR: I would like some feedback on whether my approach to deprecating submodules is sound. The approach is to wrap every MetaPathFinder in sys.meta_path in a DeprecatedModuleFinder. For references to the old module name, the wrapper takes the new submodule’s ModuleSpec and modifies it: it restores the old name and installs a DeprecatedModuleLoader. That loader ensures that sys.modules maps both the old and the new module name to the same module, thus maintaining a “mirror” of the module hierarchy.

The problem

Internally, our library provides decorators for deprecating functions, parameters, classes, and module attributes. These decorators emit the DeprecationWarnings and, where possible, provide a backward-compatible bridge to the new function/parameter/class/attribute.

We want to introduce the same for modules.

Requirements

For example:

We want to rename old_parent.child1 to new_parent.child2.

  1. User code should still run when it uses any of:
    1. import old_parent.child1
    2. from old_parent import child1
    3. import old_parent; assert old_parent.child1 (this assumes that old_parent/__init__.py originally imported child1 as an attribute: from . import child1)
  2. A DeprecationWarning should be emitted when any of the above usages is detected in user code.

Pseudo code

This is the skeleton of the process:

  1. wrap the default Finders in sys.meta_path
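import sys
from typing import Any

def wrap(finder: Any) -> Any:
    # Legacy finders without find_spec are left untouched.
    if not hasattr(finder, 'find_spec'):
        return finder
    # module_name and alias are assumed to be defined by the surrounding code (not shown).
    return DeprecatedModuleFinder(finder, module_name, alias)

sys.meta_path = [wrap(finder) for finder in sys.meta_path]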
  2. For non-deprecated modules, the DeprecatedModuleFinder simply delegates to the finder it wraps. For deprecated modules, it emits a deprecation warning and returns the spec of the new (non-deprecated) module, renamed back to the old name and with its loader wrapped in a DeprecatedModuleLoader:
# change back the name to the deprecated module name
spec.name = fullname
spec.loader = DeprecatedModuleLoader(spec.loader, fullname, new_fullname)
  3. The DeprecatedModuleLoader wraps the original loader’s exec_module method to ensure that both the deprecated module name and the new module name are in sys.modules, pointing to the same module object:
# Body of the wrapped exec_module(module); method is the original loader's exec_module.
# check whether the new module was already loaded
if self.new_module_name in sys.modules:
    # found it - no need to load the module again
    sys.modules[self.old_module_name] = sys.modules[self.new_module_name]
    return
# now we know we have to initialize the module
sys.modules[self.old_module_name] = module
sys.modules[self.new_module_name] = module

try:
    return method(module)
except Exception:
    # if there's an error, remove both entries so no half-initialized module stays cached
    sys.modules.pop(self.new_module_name, None)
    sys.modules.pop(self.old_module_name, None)
    raise  # bare raise re-raises the active exception unchanged

  4. When attribute handling is requested, the module is set as a deprecated attribute on the old parent with the old child name.
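
The steps above can be approximated with a much smaller, self-contained sketch. It is not the implementation from my PR: instead of wrapping every finder in sys.meta_path and wrapping the original loader, it installs one standalone finder whose loader imports the new module and registers it under the old name. The class names AliasLoader and DeprecatedAliasFinder, the temporary package layout, and the VALUE constant are all hypothetical and exist only for this demonstration. Attribute handling (step 4) is not covered.

```python
import importlib
import importlib.abc
import importlib.machinery
import sys
import tempfile
import warnings
from pathlib import Path


class AliasLoader(importlib.abc.Loader):
    """Hypothetical loader: swaps the placeholder module for the new module."""

    def __init__(self, old_name: str, new_name: str) -> None:
        self.old_name = old_name
        self.new_name = new_name

    def create_module(self, spec):
        return None  # let the import system create a placeholder module

    def exec_module(self, module) -> None:
        # Import (or reuse) the new module and register it under the old name.
        # The import system returns whatever sys.modules holds after this call.
        sys.modules[self.old_name] = importlib.import_module(self.new_name)


class DeprecatedAliasFinder(importlib.abc.MetaPathFinder):
    """Hypothetical finder: maps old_name and its submodules onto new_name."""

    def __init__(self, old_name: str, new_name: str) -> None:
        self.old_name = old_name
        self.new_name = new_name

    def find_spec(self, fullname, path=None, target=None):
        if fullname != self.old_name and not fullname.startswith(self.old_name + "."):
            return None  # not ours: defer to the remaining finders
        new_fullname = self.new_name + fullname[len(self.old_name):]
        warnings.warn(
            f"{fullname} is deprecated, use {new_fullname} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return importlib.machinery.ModuleSpec(fullname, AliasLoader(fullname, new_fullname))


# Made-up package layout for the demonstration, written to a temp directory.
root = Path(tempfile.mkdtemp())
for rel_path, source in {
    "old_parent/__init__.py": "",
    "new_parent/__init__.py": "",
    "new_parent/child2/__init__.py": "",
    "new_parent/child2/sub.py": "VALUE = 42\n",
}.items():
    target = root / rel_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(source)
sys.path.insert(0, str(root))
importlib.invalidate_caches()
sys.meta_path.insert(0, DeprecatedAliasFinder("old_parent.child1", "new_parent.child2"))

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    import old_parent.child1.sub  # goes through the finder for child1 and for sub

import new_parent.child2

same_module = old_parent.child1 is new_parent.child2
print(same_module)  # True
```

Handling the old_parent.child1. prefix in find_spec matters: without it, the regular path finder would locate sub through the aliased package’s __path__ and load a second copy of it under the old name.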

I have implemented this, and it seems to work quite well on Python 3.6 through 3.9 (even for older Loaders that define load_module). The only caveat I’ve seen so far is that this kind of “aliasing” is not resolved by PyCharm or other editors, which is kind of okay, as it incentivizes users to switch over to the new module structure.

Question 1: I’m not an importlib machinery expert, so I would love some feedback on whether I’m doing something fundamentally wrong here or whether this is a sound approach.

Question 2: If this is a valid approach I wonder if extracting this to a separate open source library would be valuable for the community to help with module deprecations.

Could you give an example of how one would actually go about deprecating a module in your system?

I’m struggling to understand what this would actually look like in practice, and what the benefit would be over keeping child1.py around with three(-ish) lines like

from ..util import warn_deprecated_module
warn_deprecated_module() # emits a deprecation warning (and that's all)
from ..child2 import *

I realize that this is a little hard to discuss in the abstract, and I could have done a better job of explaining the key points. If you are willing to look at the actual code, it is in this PR: Module deprecator by balopat · Pull Request #3917 · quantumlib/Cirq · GitHub

I’ll try to address your points:

The actual usage would look like this. In old_parent/__init__.py you would add something like:

from mylib import _compat 
_compat.deprecated_submodule(
    new_module_name="mylib.new_child",
    old_parent="old_parent",
    old_child="old_child",
    deadline="v0.20",
    create_attribute=True,
)

The solution in your reply is simple, but it doesn’t fulfill the requirements. In particular, you won’t be able to import submodules of the child module (or submodules of those), because from ..child2 import * only copies names into child1; it does not create a true alias, i.e. an entry in sys.modules. This means that if child2 has a submodule sub that child2 does not import itself, this user code will fail: from old_parent.child1 import sub. The error is ImportError: cannot import name 'sub'. This is similar to the behavior of

import numpy as np
from np import linalg

This fails with an ImportError, because there is no module named np. However,

import sys
import numpy as np
sys.modules['np'] = np
from np import linalg

Succeeds!
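
The same difference can be reproduced without numpy. The following is a minimal sketch, assuming a made-up on-disk layout: old_parent/child1.py is the star-import shim, and new_parent/child2/ is a package with a submodule sub that child2 does not import itself. The file contents and the VALUE constant are hypothetical and only serve the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Made-up package layout, created in a temporary directory.
root = Path(tempfile.mkdtemp())
files = {
    "new_parent/__init__.py": "",
    "new_parent/child2/__init__.py": "",
    "new_parent/child2/sub.py": "VALUE = 42\n",
    "old_parent/__init__.py": "",
    # The shim approach: re-export the names of child2 from child1.
    "old_parent/child1.py": "from new_parent.child2 import *\n",
}
for rel_path, source in files.items():
    target = root / rel_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(source)
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# 1. With the shim alone, the submodule is not reachable through the old name.
try:
    from old_parent.child1 import sub
    shim_result = "imported"
except ImportError as error:
    shim_result = f"ImportError: {error}"

# 2. Alias the old name to the new package object in sys.modules.
import new_parent.child2
sys.modules["old_parent.child1"] = new_parent.child2

# The import system now resolves sub through the aliased package.
from old_parent.child1 import sub
alias_value = sub.VALUE

print(shim_result)
print(alias_value)  # 42
```

In the second import, the machinery finds the aliased package in sys.modules and imports the missing attribute as a submodule of that package, which is why the alias works where the shim does not.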

Thanks for clarifying that. I don’t see any fundamental problems with your approach, but I clearly know a lot less about the import machinery than you do, so that doesn’t mean much.

Thanks for having a look at it!