Can `importlib.reload` work on objects?

ntessore · March 23, 2023, 10:14pm

I often do interactive data analysis in notebooks, using tools that I am also developing. Naturally, I often find a bug or missing feature in one of my functions, right at the very end of the analysis.

After fixing myfunction, say, I want to avoid running everything again (since that might take a very long time) and therefore turn to importlib.reload:

import importlib
import mymodule
importlib.reload(mymodule)
from mymodule import myfunction

I think it would be nice if that was just myfunction = importlib.reload(myfunction).

Currently reload() refuses anything that’s not a module, but why not have it look up the __module__ and __qualname__ when given an object, and automate the import whenever possible?
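A minimal sketch of what such a lookup might do, assuming the object carries usable __module__ and __qualname__ attributes. The name reload_object is hypothetical and not part of importlib; the choice to raise ImportError when the lookup fails is also just an assumption for illustration.

```python
import importlib


def reload_object(obj):
    """Reload the module that defines obj and return the re-imported object.

    Hypothetical helper for illustration; importlib has no such function.
    """
    module_name = getattr(obj, "__module__", None)
    qualname = getattr(obj, "__qualname__", None)
    if not isinstance(module_name, str) or not isinstance(qualname, str):
        raise ImportError(f"cannot tell where {obj!r} was imported from")

    # import_module returns the already-loaded module; reload re-executes it.
    module = importlib.reload(importlib.import_module(module_name))

    # Walk dotted qualified names such as "MyClass.method".
    new_obj = module
    try:
        for part in qualname.split("."):
            new_obj = getattr(new_obj, part)
    except AttributeError as exc:
        raise ImportError(
            f"cannot find {qualname!r} in module {module_name!r}"
        ) from exc
    return new_obj
```

With this in a personal toolkit, the three lines above would become myfunction = reload_object(myfunction). Objects defined inside functions (whose qualified name contains "<locals>") end up in the ImportError branch.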

NeilGirdhar · March 23, 2023, 11:11pm

Good idea, but why not go one step further and add a magic %reload to the notebook code so that you can do %reload(myfunction)?

kknechtel · March 24, 2023, 4:31am

Keep in mind that from x import y works with any attribute y, not just functions (or classes). Some of those won’t have those attributes, either.
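A small illustration using standard-library names; the helper origin is made up for this example.

```python
from math import pi
from os import environ


def origin(obj):
    """Return (module, qualname), using None for attributes the object lacks."""
    return getattr(obj, "__module__", None), getattr(obj, "__qualname__", None)


# A float records neither the module it came from nor a name.
pi_origin = origin(pi)

# An instance reports the module of its class, but has no qualified name.
environ_origin = origin(environ)
```

Here pi_origin is (None, None) and environ_origin is ('os', None), so neither object says which module attribute it was imported as.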

ntessore · March 25, 2023, 9:44am

Sure, that’s a good idea for added convenience in Jupyter notebooks! But my point stands that reload() could be made to understand more than just modules.

NeilGirdhar · March 25, 2023, 9:56am

Yeah, but you’re broadening and complicating the interface. Since your argument was to improve the notebook experience, I think the easiest place to do that is in the notebook code.

steven.daprano · March 26, 2023, 6:01am

My initial thought was that this was an interesting idea worth pursuing, but alas there’s a problem:

In general, reload cannot tell what name to import from the module. The best it can do is guess, which is risky.

reload receives the myfunction object as its argument, not the name. Depending on what the object is, it may or may not have a __module__ attribute. If it does, then reload could reload that module. But then it’s stuck: how can it determine which name to import?

The name “myfunction” is not accessible to reload. The best it could do is inspect the object for a __name__ attribute, and guess that importing that name will Do What You Mean. But this is fragile and error-prone, and relies on implementation details of myfunction.
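To make the gap between the local name and the object’s own metadata concrete, here is a short example with a standard-library function; the alias pjoin is arbitrary.

```python
import os.path
from os.path import join as pjoin  # local alias chosen by the caller

# The function only knows the name it was defined with...
defined_name = pjoin.__name__      # 'join', not 'pjoin'

# ...and the module it was defined in, which is not the module it was imported from.
defining_module = pjoin.__module__  # 'posixpath' or 'ntpath', not 'os.path'
```

A reload based on these attributes would therefore act on posixpath (or ntpath) and look up join, regardless of what the caller typed.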

Like all DWIM systems, when it goes wrong it will lead to problems, in this case returning the wrong object.

Such guessing functions are best left for your own personal toolkit, where you have nobody to blame but yourself if it returns the wrong object, rather than parts of the language.

komoto48g · March 26, 2023, 7:54am

How about the new syntax?

# script.py
>>> from mymodule import* myfunction

The import* would act as a regular import and also work like reload.
It would also be helpful when “script.py” is reloaded and dependent modules such as “mymodule” should be reloaded too.

ntessore · March 26, 2023, 8:06am

This is not any more risky than other parts for which the name dunders are already used, e.g. pickling a function.
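For comparison, a short sketch of how pickle relies on the same dunders: a function is stored as a reference by module and qualified name, and pickling fails when that reference does not lead back to the object. The function renamed and its bogus qualified name are purely illustrative.

```python
import json
import pickle

# A function is pickled as a reference (module plus qualified name), not as code.
payload = pickle.dumps(json.dumps)
restored = pickle.loads(payload)  # imports json again and looks up "dumps"


def renamed():
    return None


# Break the link between the dunder and the real location, for illustration only.
renamed.__qualname__ = "no_such_name"
try:
    pickle.dumps(renamed)
    pickling_failed = False
except (pickle.PicklingError, AttributeError):
    pickling_failed = True
```

After running this, restored is json.dumps itself and pickling_failed is True.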

kknechtel · March 27, 2023, 3:09am

I don’t think this is fundamentally different from what it’s already doing when it reloads a module: it looks for a specially-named attribute (__name__), which is a string specifying a module name, and guesses that importing that name will re-create a module that is conceptually the same as the module that was passed in.

That can, in principle, be defeated: create and import a module; then manipulate sys.path such that a different .py file with the same name will be found first; then modify the original module and attempt to reload it. Instead of seeing the changes to the original code, the module gets entirely replaced with the other one that was found instead.

It’s true that the __name__ of a function might not match the variable name passed to importlib.reload - but this happens because the function was aliased locally. The original __name__ value should, clearly, be used - it’s not as if anyone is in the habit of reassigning that (although they can, and should bear the consequences).

It’s also true - as I pointed out earlier - that not everything has a __name__, and that import syntax allows for “importing” any arbitrary attribute from a module, which might have any arbitrary type. However, I think catching the resulting AttributeError and converting it to an ImportError ought to be enough for these circumstances. “You can’t always get a meaningful result” isn’t a reason for not, pardon the pun, trying to implement some functionality.

However, there is another complication here. As I said, the import syntax allows for “importing” any arbitrary attribute from a module, which might have any arbitrary type. Including, you know, module. Which is how importing a module from a package works: packages are modules, and a module in a package is an attribute of that package.

That would cause an ambiguity, or at least an inconsistency, with the proposal. Suppose we previously did from foo import bar, and then attempt importlib.reload(bar). If we first check whether bar is a module (like with the current code), we would simply re-load the bar module directly (and reassign it as an attribute of the foo package). However, if bar isn’t a module, we would necessarily have to reload foo; and some might therefore expect foo to be reloaded even if bar is a module.
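As an illustration with a standard-library package, the same from ... import syntax can yield either a submodule or an ordinary attribute, and only inspecting the object tells them apart:

```python
import types
from email import message          # a submodule of the email package
from email.message import Message  # a class defined in that submodule

message_is_module = isinstance(message, types.ModuleType)
Message_is_module = isinstance(Message, types.ModuleType)

# Both point at the same module, but via different attributes.
message_name = message.__name__      # the module's own name
Message_origin = Message.__module__  # the module that defines the class
```

Here message_is_module is True and Message_is_module is False, while message_name and Message_origin are both 'email.message'.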

“Explicit is better than implicit”, and “special cases aren’t special enough to break the rules”. It makes more sense to have code that’s clear and consistent about what needs to be imported.

Regarding the original example:

In fact, we almost have it already: myfunction = importlib.reload(mymodule).myfunction. I think that’s probably the best option here: it’s clear what’s going on, and it avoids using an extra import statement after the code has already been imported, simply to bind a name.
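A self-contained demonstration of that one-liner, using a throwaway mymodule.py written to a temporary directory. The file contents and the return values are invented for the example, and bytecode writing is switched off only so that a quick edit within the same second cannot be masked by a cached .pyc.

```python
import importlib
import pathlib
import sys
import tempfile

# Create a throwaway module on disk that stands in for "mymodule".
sys.dont_write_bytecode = True
tmpdir = tempfile.mkdtemp()
source = pathlib.Path(tmpdir) / "mymodule.py"
source.write_text("def myfunction():\n    return 'old'\n")
sys.path.insert(0, tmpdir)
importlib.invalidate_caches()

import mymodule
from mymodule import myfunction

old_function = myfunction

# "Fix the bug" by editing the source file.
source.write_text("def myfunction():\n    return 'fixed'\n")

# Reload the module and rebind the name in one expression.
myfunction = importlib.reload(mymodule).myfunction
```

Afterwards myfunction() returns 'fixed', while old_function still refers to the original function object.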

It does repeat the myfunction name still, but that’s a separate proposal…

Rosuav · March 27, 2023, 3:31am

Aside: I don’t consider this to be “defeating” it. It’s the correct behaviour of importing the name.

>>> import random
Oops I shadowed random.py
>>> import importlib, os
>>> os.unlink("random.py")
>>> importlib.reload(random)
<module 'random' from '/usr/local/lib/python3.12/random.py'>

A feature of “reload this function” would need to be aware of func.__wrapped__ to be able to properly cope with decorated functions, and would have a huge number of assumptions (for example, random.randrange is actually a bound method from the Random object, and reloading it has to assume that the name has been maintained, which is usually the case). I’m dubious as to how useful it would be though, because of this problem:

import importlib, random
from random import randrange, sample
assert randrange.__self__ is sample.__self__ # or any other proof that they're from the same module
sample = importlib.reload(random).sample
print(randrange.__self__ is sample.__self__) # False: sample now comes from the reloaded module

So unless you ONLY imported a single name from the module, it could be very very confusing, since some names will (presumably) still be from the old module.

ntessore · March 27, 2023, 6:40am

You might also need to import mymodule first. Moving that extra (tiny) bit of typing into reload() is all that I am proposing.

PS: And you might need to inspect __module__ yourself if myfunction was imported into mymodule in the first place.

That’s a criticism of using reload() generally.

Rosuav · March 27, 2023, 6:49am

True, but if you’ve only ever used import modulename, they’ll all update simultaneously. So this would be another thing to keep track of.
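A short sketch of that difference, again using random: reload() re-executes the code inside the existing module object, so attribute access through the module sees the new definitions, while a name bound with from ... import keeps pointing at the old object.

```python
import importlib
import random
from random import randrange  # name bound once, at import time

reloaded = importlib.reload(random)

# The module object itself is reused, so "random.<name>" is always current.
same_module_object = reloaded is random

# The separately imported name still belongs to the pre-reload Random instance.
stale_name = randrange.__self__ is not random.randrange.__self__
```

Both same_module_object and stale_name come out True.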

I’ll be honest, though: I have literally NEVER used importlib.reload in any useful way. When I want hot reloading capabilities, I usually build my own, not using the import system at all.

steven.daprano · March 27, 2023, 7:41am

Oh my, I completely forgot you could do that. That’s brilliant! Thank you.

komoto48g · March 29, 2023, 3:55pm

And you have to import importlib too.

I understand @steven.daprano mentioned that reload(function) has to reload function.__module__ first and then find the function by its __name__. But function.__name__ is fragile (e.g. when decorated, or intentionally renamed), and you cannot always deduce the function object from the name.
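A small example of the decorator case; the decorator and function names are made up. Without functools.wraps the decorated function reports the wrapper’s name, and with it the original name is kept and the undecorated function is reachable through __wrapped__.

```python
import functools


def plain_decorator(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


def wrapping_decorator(func):
    @functools.wraps(func)  # copies __name__, __qualname__, etc. and sets __wrapped__
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@plain_decorator
def first():
    return 1


@wrapping_decorator
def second():
    return 2


first_name = first.__name__          # 'wrapper'
first_qualname = first.__qualname__  # 'plain_decorator.<locals>.wrapper'
second_name = second.__name__        # 'second'
second_inner = second.__wrapped__    # the undecorated function
```

A name-based lookup for first would search its module for an attribute called wrapper, which is not where the function lives.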

For my use case, I often use the following method:

require('mymodule'); from mymodule import myfunction

where

import sys

def require(name):
    from importlib import import_module, reload
    # Reload the module if it is already loaded, otherwise import it.
    if name in sys.modules:
        return reload(sys.modules[name])
    return import_module(name)