Does inline script metadata open up listing single, local files as a dependency?

I had a co-worker today make the oft-heard – at least by me – gripe: why didn’t Python make it easier to reference some other script file outside the current directory, just to get at that one function (for which they didn’t want to make a package)? They went with the usual “dump everything into one directory” solution, which unblocked them, but they lamented that there wasn’t something better.

After explaining why Python’s import system doesn’t use file paths, I realized this could be solved from a packaging perspective now that we have inline script metadata. If we had a way to specify a dependency on a script by file path, then tools that use inline metadata could pull the file(s) into the virtual environment by copying or symlinking them in. I don’t know whether making URL dependencies somehow work for file paths is a good idea, whether we need something new, or whether this whole idea is just bad, but since I get this “I want to reference modules by file path” complaint semi-regularly, I wanted to at least surface the idea that packaging could potentially help solve it.


The big problem is that file URLs don’t have a means of encoding relative paths, which I am certain would be a requirement.

Why not just add the location of the needed file to sys.path at the start of the script?
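
For reference, a minimal sketch of that suggestion (the “shared” directory and the stuff module here are made up):

import sys
from pathlib import Path

# Make the directory containing the helper file importable.
# "shared" is a hypothetical location; point this wherever the file lives.
sys.path.insert(0, str(Path(__file__).resolve().parent.parent / "shared"))

import stuff  # resolvable now that its directory is on sys.path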


The importlib docs do have a recipe for importing from a file path[1].

I can’t really think of a way of adding this to inline metadata that doesn’t end up being more cumbersome to use than either using something like this or adding the folder(s) to sys.path?
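
That recipe boils down to roughly the following (spec_from_file_location, module_from_spec, and exec_module are the actual importlib.util APIs it uses; the module name and path are placeholders):

import importlib.util
import sys

def import_from_path(module_name, file_path):
    # Build a module spec directly from a file path, then execute the module.
    spec = importlib.util.spec_from_file_location(module_name, file_path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = module  # register before executing, as the docs do
    spec.loader.exec_module(module)
    return module

stuff = import_from_path("stuff", "path/to/stuff.py")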


  1. Although they advise against actually using it. ↩︎

:person_shrugging: I just get constant pushback about it. I think that happens because people who want this sort of problem solved don’t understand search paths (yet). I view this as very much a beginner problem from folks who don’t take the time to try and learn Python more thoroughly.

Yep, and I wrote it (and importlib). :grin:

I’m not coming from a “cumbersome” angle but an “I can explain it to a beginner” angle. Telling them, “list the paths to the files you want to import” seems easier to me than, “import this module named importlib, call this function and assign it to a variable, and it magically acts like an import”. But you’re right, you could totally solve this with some PyPA package that provided a function that took a file path and handled the import for you.


Ah, of course. That’s always a danger responding to a core dev!

When I say ‘cumbersome’ I mostly mean that it would end up being more stuff for a beginner to do. I can’t think of a way that’s obviously less work? You have to think about both the special block formatting and the TOML format itself, and your import doesn’t work if you call py myscript.py, only if you use tool run myscript.py.

I would note that I’m coming at this as someone who has a script runner for scripts with inline metadata. It already has a few custom tool extensions, and I was trying to think how I would add this, but I couldn’t really think of anything that would be easier to explain and less work than:

from import_toolkit import import_script  # Some theoretical package.

stuff = import_script("path/to/stuff.py")

Ah, right. And script metadata is useful for a beginner who (quite sensibly :slightly_smiling_face:) doesn’t want to learn about environments and packaging.

The trouble is, such a beginner would want the local module they are linking to installed in editable mode. They wouldn’t know that’s what they want, of course, but they’d get very confused if they changed the module and their code used the old version - and script runners typically cache environments, so that’s a real possibility.

I don’t think that technically it’s a good fit for script metadata as it stands.

But the format was deliberately designed to be extensible, so I could imagine a new section (syntax open to bikeshedding):

# /// script
# dependencies = ["requests"]
# local_imports = [
#     'path\to\some\file.py',
#     '\\network\share\lib\somelib.py',
# ]
# ///

I deliberately used Windows filenames and a network share to make the point that path quoting could be another issue for beginners, especially as TOML and Python spell literal strings differently. But maybe that’s easier to learn for the users you’re getting pushback from.

It’s still a new PEP to get it standardised, but I’d much prefer this over trying to use URLs in a way that isn’t allowed in any other place where dependency data is recorded.
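
To make that concrete, here is a rough sketch of what a script runner could do with such a section (local_imports is the proposed, unstandardised key and install_local_imports is a made-up name; a real implementation should parse the block per the inline script metadata spec rather than this simplified scan):

import shutil
import tomllib
from pathlib import Path

def install_local_imports(script_path, site_packages):
    # Extract the '# /// script' metadata block (simplified scan).
    block_lines = []
    in_block = False
    for line in Path(script_path).read_text().splitlines():
        stripped = line.strip()
        if stripped == "# /// script":
            in_block = True
        elif stripped == "# ///" and in_block:
            break
        elif in_block:
            block_lines.append(stripped.removeprefix("# ").removeprefix("#"))
    metadata = tomllib.loads("\n".join(block_lines))
    # Copy each listed file into the environment so the script can
    # import it by its module name; symlinking would work as well.
    for entry in metadata.get("local_imports", []):
        source = Path(entry)
        shutil.copy2(source, Path(site_packages) / source.name)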


That’s a good question; I don’t know if I have a good answer. The experienced programmer who is an occasional Python user would be fine, but I don’t know about people who aren’t experienced developers.

Seems reasonable. I’m not going to worry about this, though, unless there’s more visible support for the idea.

I think this is also the sort of solution which tools, tutorials, and well-meaning but misinformed people tell new users not to use. In terms of tools, I’m thinking of lints which want your imports all at the top of the file.

In a way, those lints are correct – it’s an indicator that you’re doing something slightly abnormal, and you can ignore the warning if you’re confident that you have a good reason to do what you’re doing.
The more experience you have maintaining software – not just with Python – the easier it is to grasp when it’s safe to bypass a warning and when it is unwise. But for a beginner, any authoritative-sounding guidance, especially with an example or two to “prove” that it’s a bad idea, not only misleads them but naturally leads to some discomfort when “disobeying the rule”.

It all puts me in mind of “super considered harmful” and “super considered super”.

In terms of solving the problem itself, I’m not sure that adding local path support to script metadata would mesh with this. (Remember, I want to work on that problem after putting a bow on some of the dependency group efforts.) Suppose we add support for local relative paths, and require a package name as part of the dependency specifier. As toy syntax, pretend these two things work:

foolib @ ../foolib/
foolib @ /abspath/to/foolib

I’ll also include a probably controversial idea I’ve had in my head, which is to make a literal -e prefix part of the syntax for editable support:

-e foolib @ ../foolib

Tools following these paths expect to find a package which is ready to be installed.

So, what does a script runner do if the dependencies include this?

-e foolib @ /home/diva/foolib.py

Does it need to package foolib.py as a single-module package? But we asked for an editable install, and the original file wasn’t necessarily wrapped as a package. So I’m not clear on how that could work?

I intentionally showed an example of a file in a home directory because I think it’s clear that we don’t want to pick up the whole directory via a .pth file.


Indeed. To be perfectly honest, I don’t think this is actually a packaging issue. It’s more a question of core Python import semantics, which are incredibly flexible, but which are surrounded by a set of conventions and practices that make them seem a lot stricter than they are, and which make most of the power of the import system “expert level” knowledge.

IMO the correct way of handling this is by making it easier for beginners to import files by path. Whether that’s new language syntax like import foo using "foo.py" or a builtin[1] like foo = import_from_file("foo.py") doesn’t matter so much as making it a form of import rather than an “external dependency”.

But like Brett, I don’t intend to take this any further.


  1. or stdlib, but the point here is to keep the bar for entry very low ↩︎


I think it is not just about how a module is imported but also about how a tool can help or guide users.

The existing methods mentioned here (sys.path, importlib, or third-party packages) all require running the script to work.

A tool, especially a static analysis tool, needs a way to determine the result (the user’s intention) without actually running the code.

If the tool can infer the intention, it can help users verify or fix import issues, making imports less of a burden for users.

The tool could make a best-effort heuristic attempt to understand user intentions from sys.path or other code, but this approach often causes more confusion because it forces users to remember more rules (what works and what doesn’t). Additionally, each tool might have different heuristics, adding to the complexity.
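
As a tiny illustration of why such heuristics break down (helper and the paths here are made up):

import os
import sys

# A static analyzer could plausibly follow a literal path...
sys.path.insert(0, "libs")

# ...but not one computed at runtime, so any heuristic has to give up here.
sys.path.insert(0, os.environ.get("EXTRA_LIBS", "fallback_libs"))

import helper  # which directory this resolves from is only known at runtime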

It would be nice if Python provided a way to import modules for this kind of situation in a more tool-friendly manner.

For example, the syntax suggested above, import foo using "./foo.py", where . always refers to the directory of the current file, is, I think, a good example of a tool-friendly import construct.