I often do interactive data analysis in notebooks, using tools that I am also developing. Naturally, I often find a bug or missing feature in one of my functions, right at the very end of the analysis.
After fixing myfunction, say, I want to avoid running everything again (since that might take a very long time) and therefore turn to importlib.reload:
importlib.reload(mymodule)
from mymodule import myfunction
I think it would be nice if that was just myfunction = importlib.reload(myfunction).
Currently reload() refuses anything that’s not a module, but why not have it look up the __module__ and __qualname__ when given an object, and automate the import whenever possible?
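To make the proposal concrete, here is a rough sketch of what such a lookup might do. The helper name `reload_object` is hypothetical (nothing like it exists in importlib), and it assumes the object is reachable from its module by walking `__qualname__`:

```python
import importlib
import textwrap


def reload_object(obj):
    """Hypothetical helper: reload obj's module, then look obj up again."""
    module_name = getattr(obj, "__module__", None)
    qualname = getattr(obj, "__qualname__", None)
    # Objects defined inside a function cannot be reached by attribute lookup.
    if module_name is None or qualname is None or "<locals>" in qualname:
        raise ImportError(f"cannot tell where {obj!r} was defined")
    module = importlib.reload(importlib.import_module(module_name))
    target = module
    try:
        for part in qualname.split("."):
            target = getattr(target, part)
    except AttributeError as exc:
        raise ImportError(
            f"{module_name} no longer provides {qualname}"
        ) from exc
    return target


# Usage with a pure-Python stdlib module, standing in for mymodule.
old_dedent = textwrap.dedent
new_dedent = reload_object(old_dedent)
print(new_dedent is textwrap.dedent)  # the freshly defined function
print(new_dedent is old_dedent)       # the old object is left untouched
```

Note that this only rebinds the name you assign the result to; any other reference to the old object keeps pointing at the old code.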
My initial thought was that this was an interesting idea worth pursuing, but alas there’s a problem:
In general, reload cannot tell what name to import from the module. The best it can do is guess, which is risky.
reload receives the myfunction object as its argument, not the name. Depending on what the object is, it may or may not have a __module__ attribute. If it does, then reload could reload that module. But then it’s stuck: how can it determine which name to import?
The name “myfunction” is not accessible to reload. The best it could do is inspect the object for a __name__ attribute, and guess that importing that name will Do What You Mean. But this is fragile and error-prone, and relies on implementation details of myfunction.
Like all DWIM systems, when it goes wrong it will lead to problems, in this case returning the wrong object.
Such guessing functions are best left for your own personal toolkit, where you have nobody to blame but yourself if it returns the wrong object, rather than being made part of the language.
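Two small cases where the guess breaks down, as an illustration only. The decorator `logged` and this local `myfunction` are made-up names for the demo:

```python
import functools


def logged(func):
    # Deliberately written without functools.wraps, as much real code is.
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@logged
def myfunction():
    return 42


# The object bound to the name "myfunction" reports the wrapper's identity.
decorated_name = myfunction.__name__
decorated_qualname = myfunction.__qualname__
print(decorated_name)      # wrapper
print(decorated_qualname)  # logged.<locals>.wrapper

# Some callables carry no __name__ at all.
shout = functools.partial(print, end="!\n")
print(hasattr(shout, "__name__"))  # False
```

In the first case a name-based lookup would search the module for "wrapper"; in the second there is nothing to search for.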
I don’t think this is fundamentally different from what it’s already doing when it reloads a module: it looks for a specially-named attribute (__name__), which is a string specifying a module name, and guesses that importing that name will re-create a module that is conceptually the same as the module that was passed in.
That can, in principle, be defeated: create and import a module; then manipulate sys.path such that a different .py file with the same name will be found first; then modify the original module and attempt to reload it. Instead of seeing the changes to the original code, the module gets entirely replaced with the other one that was found instead.
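A sketch of that scenario, using two temporary directories and a throwaway module name (`shadow_demo` is hypothetical, chosen so it cannot clash with anything real):

```python
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True  # keep the demo free of cached .pyc files

first_dir = Path(tempfile.mkdtemp())
second_dir = Path(tempfile.mkdtemp())
(first_dir / "shadow_demo.py").write_text("ORIGIN = 'first'\n")
(second_dir / "shadow_demo.py").write_text("ORIGIN = 'second'\n")

# Import the module from the first directory.
sys.path.insert(0, str(first_dir))
import shadow_demo
origin_before = shadow_demo.ORIGIN

# Put a different file with the same name earlier on sys.path, then reload.
sys.path.insert(0, str(second_dir))
importlib.invalidate_caches()
reloaded = importlib.reload(shadow_demo)
origin_after = reloaded.ORIGIN

print(origin_before, origin_after)  # first second
print(reloaded is shadow_demo)      # True: same module object, new contents
```

reload() re-runs the search by name, so it executes whichever file the import system finds now, not the file the module originally came from.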
It’s true that the __name__ of a function might not match the variable name passed to importlib.reload - but this happens because the function was aliased locally. The original __name__ value should, clearly, be used - it’s not as if anyone is in the habit of reassigning that (although they can, and should bear the consequences).
It’s also true - as I pointed out earlier - that not everything has a __name__, and that import syntax allows for “importing” any arbitrary attribute from a module, which might have any arbitrary type. However, I think catching the resulting AttributeError and converting it to an ImportError ought to be enough for these circumstances. “You can’t always get a meaningful result” isn’t a reason for not, pardon the pun, trying to implement some functionality.
However, there is another complication here. As I said, the import syntax allows for “importing” any arbitrary attribute from a module, which might have any arbitrary type. Including, you know, module. Which is how importing a module from a package works: packages are modules, and a module in a package is an attribute of that package.
That would cause an ambiguity, or at least an inconsistency, with the proposal. Suppose we previously did from foo import bar, and then attempt importlib.reload(bar). If we first check whether bar is a module (like with the current code), we would simply re-load the bar module directly (and reassign it as an attribute of the foo package). However, if bar isn’t a module, we would necessarily have to reload foo; and some might therefore expect foo to be reloaded even if bar is a module.
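The two cases look identical at the import statement, which a quick check with the standard library shows (importlib itself serves as the package here, purely as an example):

```python
import importlib
import types
from importlib import util    # a submodule of the importlib package
from importlib import reload  # a function defined in importlib/__init__.py

util_is_module = isinstance(util, types.ModuleType)
reload_is_module = isinstance(reload, types.ModuleType)

print(util_is_module, util.__name__)      # True importlib.util
print(reload_is_module, reload.__module__)  # False importlib
print(importlib.util is util)             # True: just an attribute of the package
```

Under the proposal, reload(util) would reload importlib.util, while reload(reload) would have to reload importlib, even though both names were imported from the same place.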
“Explicit is better than implicit”, and “special cases aren’t special enough to break the rules”. It makes more sense to have code that’s clear and consistent about what needs to be imported.
Regarding the original example:
In fact, we almost have it already: myfunction = importlib.reload(mymodule).myfunction. I think that’s probably the best option here: it’s clear what’s going on, and it avoids using an extra import statement after the code has already been imported, simply to bind a name.
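For reference, the whole round trip might look like this. It is a self-contained sketch that writes a throwaway module (`mymodule_demo`, a hypothetical stand-in for mymodule) to a temporary directory so the edit-and-reload step can be shown:

```python
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True  # avoid picking up a stale .pyc after the edit

module_dir = Path(tempfile.mkdtemp())
source = module_dir / "mymodule_demo.py"
source.write_text("def myfunction():\n    return 'old behaviour'\n")
sys.path.insert(0, str(module_dir))

import mymodule_demo
from mymodule_demo import myfunction
before = myfunction()

# Simulate fixing the bug in an editor while the session stays alive.
source.write_text("def myfunction():\n    return 'fixed behaviour!'\n")
importlib.invalidate_caches()

# One line: reload the module and rebind the name to the new function.
myfunction = importlib.reload(mymodule_demo).myfunction
after = myfunction()

print(before)  # old behaviour
print(after)   # fixed behaviour!
```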
Aside: I don’t consider this to be “defeating” it. It’s the correct behaviour of importing the name.
>>> import random
Oops I shadowed random.py
>>> import importlib, os
<module 'random' from '/usr/local/lib/python3.12/random.py'>
A feature of “reload this function” would need to be aware of func.__wrapped__ to be able to properly cope with decorated functions, and would have a huge number of assumptions (for example, random.randrange is actually a bound method from the Random object, and reloading it has to assume that the name has been maintained, which is usually the case). I’m dubious as to how useful it would be though, because of this problem:
import importlib, random
from random import randrange, sample
assert randrange.__self__ is sample.__self__ # or any other proof that they're from the same module
sample = importlib.reload(random).sample
print(randrange.__self__ is sample.__self__) # False: sample is now bound to the reloaded module's Random instance
So unless you ONLY imported a single name from the module, it could be very very confusing, since some names will (presumably) still be from the old module.
I understand that @steven.daprano’s point was that reload(function) would have to reload function.__module__ first and then look up the function by its __name__. But function.__name__ is fragile (e.g. when the function is decorated or intentionally renamed), so you cannot always recover the function object from the name.
For my use case, I often use the following method:
require('mymodule'); from mymodule import myfunction
from importlib import import_module, reload
if name in sys.modules:
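A minimal sketch of a helper along these lines, assuming `require` simply reloads a module that is already imported and imports it otherwise. The name `require` comes from the usage above; the body here is a guess, not necessarily the poster's code:

```python
import sys
from importlib import import_module, reload


def require(name):
    """Import the named module, or reload it if it is already loaded."""
    if name in sys.modules:
        return reload(sys.modules[name])
    return import_module(name)


# Usage, with textwrap standing in for mymodule:
module = require("textwrap")
from textwrap import dedent  # binds the freshly (re)loaded function
print(dedent is module.dedent)  # True
```

Because the from-import runs again after every require(), the local name always refers to the current version of the function.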