I ran into a situation where a singleton object implemented as a module gets instantiated twice when it is imported from two different module files. I suspect this is caused by mixing two kinds of import statement, which makes the Python interpreter load the module twice and therefore create two instances of the singleton object.
Here is a simplified version of the environment / structure to reproduce it:
if __name__ == "__main__":
a_list = [1, 2, 3]
__init__.py, in order to directly import modA
from . import modA
Invoking main.py from the project directory gives the following output:
You can see that the two memory addresses of a_list are different.
So, modA is imported using both absolute and relative import statements. In case you wonder why I did that: the project started off using only absolute imports, and I learned about relative imports partway through development and wanted to try them.
In modC.py, by printing sys.modules I can confirm that package.modA and modA co-exist, showing that the same module was loaded twice.
It’s “something else”. By injecting the package directory into the path, you’ve made the source files available both as standalone modules and as package modules. Your two import statements are acting differently because of that.
The import syntax in modB is incorrect for a module that is in a package and is trying to import another module from the same package, and it will probably start failing when you remove the path injection from __init__.py.
Modules are cached in sys.modules when loaded, but the cache is keyed by the fully qualified name, as you found.
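A minimal sketch of that caching behaviour, using the standard library's json package as a stand-in (my choice for illustration, not something from the original project):

```python
import importlib
import sys

import json.decoder  # caches both "json" and "json.decoder" in sys.modules

# The cache key is the fully qualified, dotted name.
key_present = "json.decoder" in sys.modules

# Importing again under the same name returns the cached object, not a new copy.
again = importlib.import_module("json.decoder")
same_object = again is sys.modules["json.decoder"]

print(key_present, same_object)
# A bare name such as "decoder" is a separate key; whether it exists depends
# only on whether something was imported under exactly that name.
print("decoder" in sys.modules)
```

The lookup is purely by name, so the same source file imported under two different names produces two cache entries.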
When modB is imported, because it used absolute import for modA, it has no way to know that modA should be part of the same package. Dots in import paths are symbolic; they represent package structure, not folder hierarchy. So when modB’s absolute import is found in the hacked sys.path, it creates an ordinary modA module from the modA.py source code, not inside any package - because the import statement doesn’t indicate that it should be in a package.
Then, when modC is imported, it will use relative import for modA. This means that the existing package module (packages are also modules, that get imported and cached in sys.modules) will be directly asked for the path to find the modA.py source code, without checking sys.path. The import system knows that this is inside a package, with a fully qualified name of package.modA, again because of the import statement itself. So it can’t use the cached module, because it has the wrong name (modA instead of package.modA); it will load the source again and cache the result under the correct name.
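The double load described above can be reproduced in a single script. This is only a sketch: it builds a throwaway package in a temporary directory, the file contents are my assumption based on the names in the question, and importlib.import_module stands in for the two import statements.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway project: <tmp>/package/{__init__.py, modA.py}
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "package")
os.makedirs(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("")
with open(os.path.join(pkg_dir, "modA.py"), "w") as f:
    f.write("a_list = [1, 2, 3]\n")

# Put the project root on the path (normal) AND the package directory (the hack).
sys.path.insert(0, root)
sys.path.insert(0, pkg_dir)
importlib.invalidate_caches()

standalone = importlib.import_module("modA")        # like `import modA` in modB
packaged = importlib.import_module("package.modA")  # what `from . import modA` resolves to

same_file = os.path.samefile(standalone.__file__, packaged.__file__)
print(same_file)                                 # one source file ...
print(standalone is packaged)                    # ... but two module objects
print(standalone.a_list is packaged.a_list)      # ... and two separate lists
print(sorted(n for n in sys.modules if n.endswith("modA")))
```

Removing the second sys.path.insert makes the first import fail with ModuleNotFoundError, which is the failure predicted for modB once the path injection is gone.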
This is not much different from the issue where people have a circular (possibly deferred) import of the __main__ module and get two copies: one named __main__ (usually this does not end up in sys.modules) and one named according to the source file name.
Please consistently use relative imports within the package hierarchy. Use absolute imports to access stuff from the project’s dependencies (including the standard library). The structure that you have, where the driver script makes a single absolute import to something within the package, and everything else is relative from there, is a standard practice. It is almost never necessary to hack sys.path this way. (Many popular third-party Python packages do not use it at all, or just once for some special case, even for hundreds of thousands of lines of code.) In development, you just start Python in the default way, from the driver’s folder, and it puts that folder on the path, so the package is visible for the absolute import. When installed (including in a virtual environment for testing), the package is put in the appropriate site-packages folder, which is already on the path (due to the startup work done by site.py etc.).
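As a rough illustration of that layout, the sketch below writes a small project to a temporary directory and runs its driver in a child interpreter. The file contents are assumptions for illustration only; the names main.py, package, modA, modB and modC follow the question.

```python
import subprocess
import sys
import tempfile
import textwrap
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "package"
pkg.mkdir()
(pkg / "__init__.py").write_text("")                  # no sys.path manipulation
(pkg / "modA.py").write_text("a_list = [1, 2, 3]\n")
(pkg / "modB.py").write_text("from . import modA\n")  # relative inside the package
(pkg / "modC.py").write_text("from . import modA\n")
(root / "main.py").write_text(textwrap.dedent("""\
    import sys
    from package import modB, modC   # the single absolute import

    print(modB.modA is modC.modA)
    print(sorted(name for name in sys.modules if name.endswith("modA")))
"""))

# Start Python "in the default way, from the driver's folder".
result = subprocess.run(
    [sys.executable, "main.py"],
    cwd=root, capture_output=True, text=True, check=True,
)
output_lines = result.stdout.splitlines()
print(output_lines)
```

With only relative imports inside the package, both modB and modC should end up sharing the single package.modA entry.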
I’m not sure whether this is a practical arrangement, but I can imagine wanting to keep all related code in one place and the other dependencies in another, perhaps for clarity, even though those dependencies are also developed in-house. In that case, how can I make the import system work well?
sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
The sys.path insertion hack is used here. But tests are probably the special case where such a hack is acceptable. As for my original case, app, can we use the same approach, or should we restructure the project?
Run the driver script from the root directory (so that when the current path is added automatically to sys.path, it’s the correct one), specifying it as app/run.py. (This means that run will not be able to import some_file either way: it’s not on the path for an absolute import, and the app folder isn’t a package for a relative import.)
Use the PYTHONPATH environment variable to have paths added to sys.path at startup, rather than doing it through your code. This way, you don’t have to worry about it in production (in which case the package is installed, so the extra path entry isn’t desirable). Starting in 3.11, you can also use the -P command-line option for Python to prevent the default path from being added to sys.path. (This is in case you want to avoid import some_file accidentally working.)
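A small sketch of the PYTHONPATH approach, driven from Python so the effect can be checked. The temporary directory is a hypothetical stand-in for wherever the dependencies live.

```python
import os
import subprocess
import sys
import tempfile

extra_dir = tempfile.mkdtemp()  # hypothetical dependencies directory

# Set PYTHONPATH for the child process only; no code in the child touches sys.path.
env = dict(os.environ, PYTHONPATH=extra_dir)
child_code = (
    "import os, sys; "
    "target = os.path.realpath(sys.argv[1]); "
    "print(any(os.path.realpath(p) == target for p in sys.path if p))"
)
result = subprocess.run(
    [sys.executable, "-c", child_code, extra_dir],
    env=env, capture_output=True, text=True, check=True,
)
pythonpath_seen = result.stdout.strip()
print(pythonpath_seen)
```

In a shell the equivalent is setting the variable on the command line before invoking python, so production runs that do not set it are unaffected.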
Create a virtual environment for each project, such that the package is always “installed” even as you are developing it. (You will likely want to use the -e option for Pip when installing to the virtual environment.) This is my preference, but it’s not the easiest thing for people who aren’t used to “being developers”.
Create a driver that’s part of the package, by putting an if __name__ == '__main__': block inside an appropriate module of the package (let’s say it’s modB.py) and running it (still from the root directory) as a package: python -m package.modB (notice we give a fully qualified module name, not a filename/path). This uses the standard library runpy module behind the scenes. The driver code will not run when modB is imported normally (that’s the point of the if statement). You can also make the entire package runnable this way (python -m package) by giving it a __main__.py (this does not need an if statement, because it will always be true).
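The last option can be tried out with a throwaway package. This is a sketch under assumed file contents; only the names package, modA and __main__.py come from the discussion.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "package"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "modA.py").write_text("a_list = [1, 2, 3]\n")
# __main__.py runs for `python -m package`; relative imports work because
# the code is executed as part of the package.
(pkg / "__main__.py").write_text(
    "from . import modA\n"
    "print(__name__, modA.__name__, modA.a_list)\n"
)

# Run from the project root, giving a module name rather than a file path.
result = subprocess.run(
    [sys.executable, "-m", "package"],
    cwd=root, capture_output=True, text=True, check=True,
)
main_output = result.stdout.strip()
print(main_output)
```

Note that modA is loaded under its qualified name package.modA, so no standalone copy appears.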
In general, after typing “import X”, you can access the module as X:
>>> import math
>>> math
<module 'math' from '/usr/local/lib/python3.12/lib-dynload/math.cpython-312-x86_64-linux-gnu.so'>
>>> import os.path
>>> os.path
<module 'posixpath' (frozen)>
>>> import importlib.resources.abc
>>> importlib.resources.abc
<module 'importlib.resources.abc' from '/usr/local/lib/python3.12/importlib/resources/abc.py'>
That wouldn’t make sense with a relative import:
.modA # SyntaxError
Hence the need to fetch a particular thing from it. I suppose import .modA as spam should theoretically be possible, but most commonly, you’d want import .modA as modA which can be written from . import modA anyway.
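For comparison, importlib.import_module accepts a leading-dot name together with a package argument, which is roughly the functional counterpart of from . import modA. The sketch below uses the standard library's json package as a stand-in for the project's own package:

```python
import importlib
import json
import sys

# ".decoder" is resolved relative to the package named by `package=`.
relative = importlib.import_module(".decoder", package="json")

# The statement form binds the submodule to a plain local name.
from json import decoder

print(relative.__name__)
all_same = relative is decoder is sys.modules["json.decoder"]
print(all_same)
```

Either way the module is cached under its fully qualified name; the local name it is bound to is a separate matter.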
I think there may be something wrong here. Say we are in the project root: the command python3 app/run.py automatically adds the directory in which run.py resides, i.e. $project/app, to sys.path. So run.py is in fact able to import some_file.
$ ls app
$ python3 app/run.py
print from some_file.py
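The same check can be scripted. This sketch recreates the app/run.py and app/some_file.py layout in a temporary directory (file contents are assumed for illustration) and runs the driver from the project root:

```python
import subprocess
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
app = root / "app"
app.mkdir()
(app / "some_file.py").write_text('print("print from some_file.py")\n')
(app / "run.py").write_text(
    "import os\n"
    "import sys\n"
    "import some_file\n"                        # found via sys.path[0]
    "print(os.path.basename(sys.path[0]))\n"    # the script's directory, not the cwd
)

# Equivalent to `python3 app/run.py` executed in the project root.
result = subprocess.run(
    [sys.executable, str(Path("app") / "run.py")],
    cwd=root, capture_output=True, text=True, check=True,
)
run_output = result.stdout.splitlines()
print(run_output)
```

When a script path is given, sys.path[0] is the directory containing the script rather than the current working directory, which is why the import succeeds.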
In a configuration like that, the best way to ensure that the program can find the package is to actually install everything, as opposed to relying on relative directory paths. An editable install using pip install -e . in the top level directory, with a suitable pyproject.toml (or setup.cfg/setup.py), would ensure that modA can be imported using import package.modA, regardless of the location of the code executing the import statement.