I have a question regarding importing stuff from a __init__.py file.
When I have a main file and import everything from a module named mymodule.py there into the current namespace with
from mymodule import *
then I have all symbols from the module available in the current namespace. If mymodule.py contains a function named myfunction then I can access it with myfunction() but not with mymodule.myfunction().
I observed a different behavior when I do the same kind of import from a __init__.py file which is part of a package. If I do
import mypackage
in the main file and
from mymodule import *
in the __init__.py file of mypackage then I can access myfunction in the main file with mypackage.mymodule.myfunction().
It’s not clear to me why this is possible. I had the expectation that I need to write mypackage.myfunction() (which is also possible).
Can somebody explain to me what’s happening here? Thanks.
Indeed, mymodule.py is inside mypackage. But I don’t understand why myfunction is available via mypackage.mymodule.myfunction(), since I thought it was imported directly into the namespace of the package via from mymodule import *.
You are right, importing is about putting “things” into a namespace. But why would it be necessary to shut off a different way to import/use a name once you have used one way?
If you do from mymodule import *, you are copying all public names from mymodule’s namespace into your package’s namespace. Blocking the other ways of importing or calling the function would mean checking for that on every lookup into the namespace, which is wasteful. Instead, you can simply reach it in more than one way.
As you can read in this documentation, the first step in the import search is to look in sys.modules, so once a module has been imported it will be found very quickly. That avoids the performance cost of locating it on disk more than once (and prevents some hard-to-find bugs in your program).
Lots of Python programmers frown upon from mymodule import *. Only import what you need, which makes naming collisions less likely. Disclosure: I am one of those who reserve import * for special situations.
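To make the sys.modules point concrete, here is a rough sketch. It assumes a throwaway top-level mymodule.py written to a temporary directory (the print in its body is only there to show when the file is executed), so treat it as an illustration rather than anything from your project:

```python
import sys
import tempfile
from pathlib import Path

# Hypothetical setup: a temporary directory holding a tiny mymodule.py
root = Path(tempfile.mkdtemp())
(root / "mymodule.py").write_text(
    "print('executing mymodule')\n"
    "def myfunction():\n"
    "    return 42\n"
)
sys.path.insert(0, str(root))

import mymodule                   # first import: found on disk, executed, cached
first = sys.modules["mymodule"]   # the cache entry created by that import

import mymodule as again          # second import: served from sys.modules
from mymodule import myfunction   # import-from reuses the same cached module

print(again is first)                  # both names refer to one module object
print(myfunction is first.myfunction)  # and to one function object
```

The message from the module body should appear only once, because the later imports never go back to the file.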
In that case, mymodule is a submodule of mypackage and therefore ends up in mypackage’s namespace (that’s just how it works).
If you don’t like the mypackage.mymodule.myfunction() vs mypackage.myfunction() ambiguity (which I agree is bad and I wish all package authors felt the same way) then privatize mymodule by renaming it to _mymodule.py. You’ll still be able to access the function via mypackage._mymodule.myfunction(), but that underscore prefix is a signal to downstream users and to IDEs, code completion and linters that it shouldn’t be used. That lets you restructure the internal layout of your package without completely destroying downstream code.
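A small sketch of what that rename looks like in practice. The package is built in a temporary directory purely for demonstration, and the explicit relative import in __init__.py is my assumption about how you would re-export the function:

```python
import sys
import tempfile
from pathlib import Path

# Hypothetical layout: mypackage/__init__.py and mypackage/_mymodule.py
root = Path(tempfile.mkdtemp())
pkg = root / "mypackage"
pkg.mkdir()
(pkg / "_mymodule.py").write_text("def myfunction():\n    return 'hello'\n")
# Re-export only the name that is meant to be public
(pkg / "__init__.py").write_text("from ._mymodule import myfunction\n")

sys.path.insert(0, str(root))
import mypackage

# The intended way in
print(mypackage.myfunction())

# The submodule is still bound on the package, just under a private name
print(mypackage._mymodule.myfunction is mypackage.myfunction)

# Names a user would see as public (no leading underscore)
public = [name for name in dir(mypackage) if not name.startswith("_")]
print(public)
```

Nothing is enforced here. The underscore only changes what tools and readers treat as public.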
Thank you very much for your answers which gave me the technical reason for this different behavior. I think I got it now, I wasn’t aware of the following rule for submodules:
When a submodule is loaded using any mechanism (e.g. importlib APIs, the import or import-from statements, or built-in __import__()) a binding is placed in the parent module’s namespace to the submodule object. For example, if package spam has a submodule foo, after importing spam.foo, spam will have an attribute foo which is bound to the submodule.
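For anyone who finds this thread later, here is a minimal sketch of that rule in action. It builds a throwaway mypackage in a temporary directory, and it assumes the Python 3 relative form from .mymodule import * in __init__.py:

```python
import sys
import tempfile
from pathlib import Path

# Hypothetical layout: mypackage/__init__.py and mypackage/mymodule.py
root = Path(tempfile.mkdtemp())
pkg = root / "mypackage"
pkg.mkdir()
(pkg / "mymodule.py").write_text("def myfunction():\n    return 'hello'\n")
# Assumption: relative star import inside the package's __init__.py
(pkg / "__init__.py").write_text("from .mymodule import *\n")

sys.path.insert(0, str(root))
import mypackage

# The star import copied myfunction into the package namespace ...
print(mypackage.myfunction())

# ... and loading the submodule also bound it as an attribute of the package,
# even though __init__.py never wrote "import mymodule" explicitly
print("mymodule" in vars(mypackage))
print(mypackage.mymodule.myfunction())

# Both spellings reach the same function object
print(mypackage.myfunction is mypackage.mymodule.myfunction)
```

So the import-from statement has two effects: the names it copies, and the binding of the submodule itself on the parent package.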