Sharing code in Python without quirks

Hi,
There seems to be a big, annoying issue in Python that has been around for many years. I assume it draws many people and many, many opinions… but it is frankly very simple:
In any other programming language, when you want to import code from a sister project, you simply import that code from any relative or absolute path. No one tells you that your code is not Pythonic enough.
Think of sharing database ORM models between two projects that are simply not under the same folder but share the same database and need to access the same ORM code. The two projects exist in separate folders which aren’t necessarily related (or indeed necessarily unrelated), and one of them now needs to access that shared ORM code from the other.
As of today, there is no way to import/share code between projects without employing “funny business” like packaging/building/publishing (?!) the shared code after every change, or git submodules, or linking folders (on Linux). All of these might seem reasonable to some people, but they are a pain for many of us (as is reflected in any coding forum out there).
In any other major language (C++, Java, JS, TS, …) code importing is easy business, and for some awkward reason, in the easiest and most fun language of all, Python, code sharing is a nightmare. It really is.
Is there any PEP to wait for? Did I miss it, and in python 3.12 there is some magic trick?

You can just add the other folder to sys.path. Or use importlib and import the module/package by absolute file path.
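A minimal sketch of both options. The sister project layout, the `shared_orm` package, and the `TABLE_NAME` constant are all hypothetical and are created in a temporary folder only so the example runs on its own:

```python
import importlib.util
import sys
import tempfile
from pathlib import Path

# Hypothetical sister project holding the shared ORM code.
sister_root = Path(tempfile.mkdtemp()) / "sister_project"
(sister_root / "shared_orm").mkdir(parents=True)
(sister_root / "shared_orm" / "__init__.py").write_text("")
(sister_root / "shared_orm" / "models.py").write_text("TABLE_NAME = 'users'\n")

# Option 1: put the folder that CONTAINS the package on sys.path,
# then use an ordinary import statement.
sys.path.append(str(sister_root))
from shared_orm import models

print(models.TABLE_NAME)

# Option 2: load a single file by its absolute path with importlib.
models_path = sister_root / "shared_orm" / "models.py"
spec = importlib.util.spec_from_file_location("models_by_path", models_path)
models_by_path = importlib.util.module_from_spec(spec)
spec.loader.exec_module(models_by_path)

print(models_by_path.TABLE_NAME)
```

Note that option 2 creates a separate module object under the name you choose, so it does not share state with a copy imported the normal way.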

I tried adding to sys.path, and the IDE’s assistant/code completion won’t work.
Using importlib is far from straightforward (see python - How can I import a module dynamically given the full path? - Stack Overflow), and although I did not try this method, I suspect the IDE won’t play with it either.
I finally just created a symlink to the other project (which is feasible on Linux, but I’m not sure whether it also is on Windows these days).
This should be available with a simple import. I really don’t get why it is not built into the language…

Aha, so your actual problem isn’t that Python can’t do this (there are multiple ways to do that), but that your IDE can’t. Maybe setting PYTHONPATH is something the IDE will respect; otherwise you’ve got to read the documentation of your IDE and/or file a feature request with your IDE’s developers.
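To see what PYTHONPATH does independently of any IDE, here is a rough sketch that starts a child interpreter with the variable set. The `shared_models` module and its `TABLE_NAME` constant are made up for the illustration:

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical folder of another project, containing one module.
sister_root = Path(tempfile.mkdtemp())
(sister_root / "shared_models.py").write_text("TABLE_NAME = 'users'\n")

# Copy the current environment and prepend the folder to PYTHONPATH.
env = dict(os.environ)
existing = env.get("PYTHONPATH")
env["PYTHONPATH"] = str(sister_root) + (os.pathsep + existing if existing else "")

# The child interpreter can now import the module by its plain name.
result = subprocess.run(
    [sys.executable, "-c", "import shared_models; print(shared_models.TABLE_NAME)"],
    env=env,
    capture_output=True,
    text=True,
    check=True,
)
child_output = result.stdout.strip()
print(child_output)
```

In day-to-day use you would set the variable in your shell or in the IDE's run configuration rather than through `subprocess`; whether a given IDE's code completion honours it is something to check in that IDE's documentation.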

1 Like

First, the reference material:

I assume that you mean a relative or absolute file system path.

The reason Python’s import statements don’t work like that is because the name is a symbol. It describes a module to import, not a file.

Aside from looking in the file system, Python might look for compiled C modules within itself (for built-in modules like sys), or for “frozen” modules that are similarly built-in but stored as Python bytecode.[1] And on Windows, it could look in the registry.
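A quick way to see this for yourself, assuming CPython (other implementations may report different values):

```python
import importlib.util
import sys

# Built-in modules are compiled into the interpreter; no file is involved.
print("sys" in sys.builtin_module_names)

# The module spec records where a module came from.
sys_spec = importlib.util.find_spec("sys")
print(sys_spec.origin)

# The core of the import machinery itself is a frozen module on CPython.
bootstrap_spec = importlib.util.find_spec("_frozen_importlib")
print(bootstrap_spec.origin)
```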

If the import system does have to import from the filesystem, its source could be:

  • a Python source code (.py) file or compiled bytecode (.pyc) file
  • compiled object code for a C extension, with a platform-specific extension (perhaps .pyd or .dll on Windows; perhaps .so on Linux, etc.)
  • a folder (not the files in the folder, but the folder itself: packages are modules, represented by the same type, not a subtype)
  • any of the above, as a “file” within the virtual filesystem of a zip archive
  • from a “framework”, on a Mac
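Some of these can be inspected from Python itself. The sketch below prints the file suffixes the default machinery recognizes on the current platform, then imports a module from a zip archive; the archive and the `zipped_mod` module are invented for the demonstration:

```python
import importlib.machinery
import sys
import tempfile
import zipfile
from pathlib import Path

# Suffixes recognized for source, bytecode, and C extension files.
# The extension list is platform-specific.
print(importlib.machinery.SOURCE_SUFFIXES)
print(importlib.machinery.BYTECODE_SUFFIXES)
print(importlib.machinery.EXTENSION_SUFFIXES)

# Build a throwaway zip archive containing one hypothetical module.
archive = Path(tempfile.mkdtemp()) / "bundle.zip"
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("zipped_mod.py", "GREETING = 'hello from a zip'\n")

# A zip archive on sys.path is searched like a folder.
sys.path.insert(0, str(archive))
import zipped_mod

print(zipped_mod.GREETING)
print(zipped_mod.__file__)
```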

And that’s just the default set of options. There are hooks that allow the user to define new ways to import things.
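As a rough illustration of such a hook, the sketch below registers a finder on `sys.meta_path` that serves a module from a dictionary of source strings instead of from a file. `InMemoryFinder`, `IN_MEMORY_SOURCES`, and `virtual_mod` are made-up names, and a real hook would need more care (packages, reloading, error handling):

```python
import importlib.abc
import importlib.util
import sys

# Hypothetical "storage" for module source code: name -> source text.
IN_MEMORY_SOURCES = {"virtual_mod": "ANSWER = 42\n"}


class InMemoryFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Finds and loads modules whose source lives in IN_MEMORY_SOURCES."""

    def find_spec(self, fullname, path=None, target=None):
        if fullname in IN_MEMORY_SOURCES:
            return importlib.util.spec_from_loader(fullname, self)
        return None  # let the other finders try

    def create_module(self, spec):
        return None  # use the default module creation

    def exec_module(self, module):
        # Run the stored source inside the new module's namespace.
        exec(IN_MEMORY_SOURCES[module.__name__], module.__dict__)


# Appending (rather than inserting at 0) keeps the standard finders first.
sys.meta_path.append(InMemoryFinder())

import virtual_mod

print(virtual_mod.ANSWER)
```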

No, you don’t need to do any of those things. You just need to tell Python where the top-level package is for the other project. There are multiple ways to do this - and if it seems like people are reluctant to talk about them, it’s because they used to be hideously abused for imports within the same project by people who didn’t want to learn the basics about how relative imports work, or by people who wanted to share code within an organization and thought they could tell their coworkers how to organize their projects on disk better than a packaging system could.

For absolute imports, the important thing is what’s on sys.path at the time the import happens. (Relative imports don’t make sense between projects unless you have explicitly designed around namespace packages, and then you still need to set up sys.path in a similar way.) The documentation shows us what’s on there by default:

which boils down to:

  • usually, a “local” folder which depends on how you started Python, but tries to do the right thing
  • directories that you explicitly request using the PYTHONPATH environment variable
  • the standard library
  • third-party libraries (installed via Pip and/or provided with Python, for example if it comes as part of a Linux distribution)

You can also modify the sys.path list at runtime (although you should practically never need to) and the changes affect any imports that happen after that.
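A small sketch of that ordering effect, using a hypothetical `late_mod` module written to a temporary folder: the import fails while the folder is not on `sys.path` and succeeds once it has been added.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical other project containing a single module.
other_project = Path(tempfile.mkdtemp())
(other_project / "late_mod.py").write_text("VALUE = 1\n")

# Before the folder is on sys.path, the import system cannot find it.
try:
    import late_mod
    found_before = True
except ModuleNotFoundError:
    found_before = False
print(found_before)

# After the change, the same import statement succeeds.
sys.path.append(str(other_project))
importlib.invalidate_caches()  # harmless here; recommended when files appear at runtime
import late_mod

print(late_mod.VALUE)
```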

You did miss it, and it’s existed for as far as I’m aware the entire history of Python. Certainly it existed in 2005 or so when I started using Python seriously.


I’m pulling this part out of order, because it’s long and not essential.

Honestly I’m baffled.

I get that some people would prefer to give an actual path to a source code file in order to import something, and there are circumstances where that makes sense. You can do that in Python, too - but you’re giving up on a lot, as described above, unless you re-build it yourself. You can also write your own ways to look for files that represent a module, and your own ways to interpret the file contents to create a module. You can write a module importer that grabs the code from the Internet on demand (although good luck with the security aspects of that). And you can look for third-party libraries that implement those things for you :wink:

But the examples you gave strike me as especially awful.

TypeScript is essentially the same language as JavaScript for these purposes; it’s built on top, to implement things that aren’t related to importing. JavaScript importing, from my own experience, is not at all what I’d call fun.

Java uses the same kind of symbolic name scheme for modules and packages that Python does, and it doesn’t even support the equivalent of Python’s relative imports. Yes, you can define what package your code belongs to, within the code, and it doesn’t get tied up in file system details - so you can have files “in another project” that are part of the same package. But that’s only possible because you compile the code and decide ahead of time what files are part of the program (i.e., you either tell the build process what’s going into your JAR, or you tell the runtime what classpaths to use). And sure, you can refer to things by fully-qualified name without needing to import them - again, that’s only possible because you set the scope ahead of time. Utterly impossible in an environment like Python’s, where you’re explicitly allowed to race the code to write the module it wants to import, before it tries to import.
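To make that last point concrete, here is a sketch in which the program writes a module to disk and only then imports it. The `generated_settings` module and its `DEBUG` flag are hypothetical:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# A folder that is on sys.path but does not contain the module yet.
plugin_dir = Path(tempfile.mkdtemp())
sys.path.insert(0, str(plugin_dir))

# Write the module source at runtime, before anything imports it.
(plugin_dir / "generated_settings.py").write_text("DEBUG = True\n")

# Tell the finders that the filesystem may have changed since they last looked.
importlib.invalidate_caches()

import generated_settings

print(generated_settings.DEBUG)
```

Nothing about `generated_settings` was known when the program started, which is why a Java-style fixed scope decided ahead of time would not fit.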

As for C++, it doesn’t have code importing. It has a tool built into the compiler, called the preprocessor, which you instruct to do a virtual copy-and-paste of source code from another file into your own before compilation. It inherits this legacy from C, and it’s where all the linker errors come from, and all the weird complications with #ifndef header guards etc. Sure, you can specify file paths. But if you move one of the “separate” projects to a different folder, you need to change those paths in each #include, and recompile. With symbolic names and a real import system, the module discovery all happens at runtime - and in Python, the runtime alteration is centralized (you just change sys.path and then everything can find everything else again).


  1. The best explanation I can come up with. The term “frozen module” doesn’t seem to be explained in 3.x documentation any more. ↩︎

3 Likes

This answer is textbook material. Great stuff! But, as it goes, most programmers do not read the manual - which is the great thing about Python! You usually find things easy to understand, which is just not true of code importing in Python.
And yes, the current effect of sys.path and the PYTHONPATH environment variable on imports does make life harder! Especially when you find out that the import paths themselves are relative to some path beyond the current source code, so you need to restructure your project when you find out that running a script from the command line is not the same as running it in PyCharm… But this is another pain.
Without getting into answering every point, imports should be simple. Period.
Currently they are a pain in… and it’s completely out of the question to require that all projects be seated under the same folder. That is not how it works.

What is your suggestion for a portable, simple and general enough import system?

1 Like

Oh, I am not here to supply, just to complain :wink: but here are a few shots anyway:

  1. Simply allow importing by path using an alternative syntax. Allow users to poison themselves if they choose to…
  2. Some way of declaring the top level of the project from code/configuration, just to alleviate the need to declare PYTHONPATH, which isn’t very friendly for new users coming from other languages. The notion that there is a floor and it might move because you ran the script from another folder is cumbersome.
    Sure, there are complications with each, but the current situation is just wrong: the recommended method is to package the shared folder (package!) just to share it with another project, which means recompilations and messing with more paths, and the solution many people choose is external to the language (adding paths to sys.path or linking a folder).