PEP 723: Embedding pyproject.toml in single-file scripts (final iteration)

ofek · August 23, 2023, 12:50pm

This is the final version, which switches to the [run] table approach rather than [project]: PEP 723 – Inline script metadata | peps.python.org

brettcannon · August 23, 2023, 11:19pm

Thanks for the update!

I was wondering about the motivation behind run.version. Did you have specific plans for it in Hatch? I read through the PEP and its justification, and I’m not quite seeing why it’s important to have in this initial version.

ofek · August 24, 2023, 2:25am

I don’t have much to add beyond this section, which you have read. I am hesitant to take it out because, from what I can tell, this is every field that would be relevant to both scripts and directory-type projects. I would be annoyed if I had to write another PEP, or if the PEP for including the table in pyproject.toml files had to add the field and expand the script allowance at the same time.

brettcannon · August 24, 2023, 7:15pm

I think what’s tripping my brain up is why this has to be standardized as metadata? People use __version__ for this sort of thing and I haven’t seen people calling for this sort of change for code that isn’t installed. I also don’t see how script runners would use the version metadata for anything, e.g. I don’t see how it could benefit caching since you can’t assume someone isn’t actively editing their dependencies between version bumps.

So I’m -0 on the idea unless there’s more motivation than it being a “nice to have”, i.e. some people like to write down a version number somewhere, so let’s just make that a standard. Otherwise this starts down the slippery slope of standardizing any bit of metadata people might want to write down, like author name, etc. And at that point we are going back to [project] and deciding we would rather tweak its purpose and required keys.

ofek · August 25, 2023, 4:47am

The version field has been removed.

brettcannon · August 25, 2023, 5:13pm

To be clear, if other people come forward and say they have a use case I’m totally happy to be supportive.

ofek · August 25, 2023, 9:35pm

@brettcannon I view this PEP as finalized; there will be no further changes.

pomponchik · August 31, 2023, 5:17pm

The idea behind this PEP is very similar to a small tool I created: instld.

My tool allows you to run scripts with the instld command, like this:

instld script.py

If there are third-party library imports inside the script, the libraries are installed automatically. The library name is taken from the name of the imported module. For example, this import will lead to the installation of the “polog” module:

import polog

There are situations when the name of the imported module and the package name are different. In this case, the package name can be specified using a special comment language, here is an example:

import f # instld: version 0.0.3, package fazy

Please note that my implementation does not require the user to have any additional knowledge about venv, pip or PATH. The fact is that all packages for a specific launch are installed in a temporary directory, which is automatically destroyed after the script stops working. By default, the user does not need to think about anything at all, the technology just works.

My opinion about this initiative: this problem is completely solved with the help of my library and does not require improvements directly in CPython.

merwok · August 31, 2023, 5:31pm

I think you’re misunderstanding. PEP 722 and 723 do not change CPython; they standardize information that many tools like yours can read.

ofek · August 31, 2023, 5:32pm

The mechanism of which you speak has its own section as a rejected idea courtesy of @jeanas: Why not infer the requirements from import statements?
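To make the import-inference idea concrete, here is a minimal sketch of how a tool might collect candidate package names from import statements. It is not instld's implementation; the function name `imported_top_level_modules` is hypothetical. It also shows the limitation raised earlier in the thread: the import name `f` is all the syntax tree reveals, so the real package name (`fazy`) has to come from somewhere else, such as a comment the parser never sees.

```python
import ast


def imported_top_level_modules(source: str) -> set[str]:
    """Return the top-level module names imported anywhere in `source`.

    Hypothetical helper for illustration only. Relative imports are
    skipped because they refer to the script's own package.
    """
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names.add(node.module.split(".")[0])
    return names


# The trailing comment is invisible to the AST, so only "f" is found.
SAMPLE = "import polog\nimport f  # instld: version 0.0.3, package fazy\n"
found = imported_top_level_modules(SAMPLE)
```

A tool built this way still needs a mapping from import names to installable package names, which is the gap that explicit metadata is meant to fill.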

sinoroc · August 31, 2023, 5:47pm

@pomponchik, for info: instld seems very similar to @facundo’s fades.

pomponchik · August 31, 2023, 8:21pm

Oh, yes, I misread the purpose of this PEP, sorry.

courtneywebster · September 27, 2023, 10:48pm

I have posted the summary of the user studies in the PEP 722 and PEP 723 User Study Discussion thread. I am happy to provide clarity on anything covered there and can answer any questions that come up as a result of the study!

epage · October 3, 2023, 9:00pm

Unsure which of the 3+ threads this is most appropriate to share on but figured I’d give an update on my parallel effort to these for Rust.

After discussions with educators/trainers, the Cargo team (of which I’m a member), and the language team, we have an RFC for what syntax to use to embed manifests. This is still a draft but the pros/cons and the general approach I think are fairly close to what we’ll be doing and could provide some insight here.

For some context, decisions in the Rust project are made by domain-specific teams. We have a separate RFC that the Cargo team will be making a decision on. For the syntax, I had planned for that to be a language team decision from the beginning, even if no language changes were needed (e.g. using comments) so they could own the decision to not extend the language.

BrenBarn · October 4, 2023, 4:14am

That’s certainly intriguing. So in Rust the decision was (or potentially will be, as this is a draft) to actually change the language syntax so that no comment or docstring or similar pre-existing “carrier” is needed. Personally I wouldn’t be in favor of doing that in Python, but it’s an interesting choice.

ofek · October 5, 2023, 3:44am

It’s possible that TOML is the right format and a docstring is the right enclosure. That has always been my preference but I felt it would basically be rejected based on offline discussions.

No leading characters is a massive boon to UX, as seen in the final Rust proposal.

pitrou · October 5, 2023, 6:46am

Pity. Now we have yet another bespoke format (commented block + # /// markers) that doesn’t seem to have any well-known precedent in the Python ecosystem, unless I’m missing something. Commented blocks are generally annoying to edit.

I’ve also read PEP 723 – Inline script metadata | peps.python.org and I must say the rationale evades me a bit. How often is character escaping needed inside a pyproject file? Why is it difficult to embed a double-quoted string inside a single-quoted string, or the reverse?
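For readers unfamiliar with the commented-block format under discussion, the following is a rough sketch of how a tool could extract such a block. The regular expression is modeled on the reference expression given in the PEP, but the PEP text is authoritative. The function name `read_blocks` is hypothetical, and the sample block (type `pyproject` with a `[run]` table) is an assumption based on the version described at the top of this thread.

```python
import re

# Modeled on the PEP's reference regex; check the PEP for the exact form.
BLOCK_RE = re.compile(
    r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"
)


def read_blocks(script: str) -> dict[str, str]:
    """Map each block type to its content with the comment prefix removed."""
    blocks = {}
    for match in BLOCK_RE.finditer(script):
        lines = match.group("content").splitlines(keepends=True)
        # "# foo" loses the "# " prefix; a bare "#" line becomes an empty line.
        blocks[match.group("type")] = "".join(
            line[2:] if line.startswith("# ") else line[1:] for line in lines
        )
    return blocks


# Assumed example block, not taken verbatim from the PEP.
SCRIPT = """\
# /// pyproject
# [run]
# dependencies = ["requests"]
# ///
import requests
"""

blocks = read_blocks(SCRIPT)
```

The extracted text can then be handed to any TOML parser. Because the block is made of ordinary comments, it cannot be assembled at runtime by the script itself.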

sirosen · October 5, 2023, 2:21pm

You’ll find a different set of arguments on this front articulated in this section of PEP 722:

(A while back the authors agreed that this isn’t really the “parent PEP”, so I guess it’s a “sibling”?)

I find that write-up more compelling. The main thing is that I am readily convinced that users would try to build the string at runtime. In the 300+ comment thread for 722 (i.e. it’s unreasonable to expect anyone to read all of it), others were either skeptical that this would happen in practice, or of the opinion that we shouldn’t care since it violates the spec.

One could argue that users are less likely to try funny business at runtime with the TOML format than the 722 format. I’m… not sure. I’ll think about that more.

As for the part of 722 which describes the requirement for a parser, whether or not that’s convincing depends on whether or not you believe that the PEP must define an unambiguous parse even in the presence of very strange things.
I’ll offer this example as the sort of thing I think we should worry about:

if sys.version_info < (3, 8):
    __pyproject__ = """\
[run]
dependencies = ["typing-extensions"]
"""

This “obviously” shouldn’t be accepted. Consider also the variant with inline if-else.

I think it would be a nice improvement to 723 to include some version of these considerations.

pitrou · October 5, 2023, 2:33pm

Ok, that makes much more sense then. Thank you!

ofek · October 5, 2023, 4:35pm

The versions of this PEP that proposed docstrings/multi-line strings used the regular expression as the canonical specification and therefore that would not have been allowed.
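As an illustration of that point, here is a sketch using a line-anchored regular expression in the style of those earlier drafts. The exact pattern is an assumption, not a quotation from the PEP, and the names `DRAFT_STYLE_RE` and `find_metadata` are hypothetical. Because the pattern requires `__pyproject__` to start at the beginning of a line, the indented assignment in the conditional example above is simply never matched.

```python
import re

# Assumed approximation of a draft-era pattern: the assignment must begin
# at column 0 and the closing triple quote must sit alone on its own line.
DRAFT_STYLE_RE = re.compile(r'(?ms)^__pyproject__ *= *"""\\?$(.+?)^"""$')

TOP_LEVEL = r'''
__pyproject__ = """\
[run]
dependencies = ["typing-extensions"]
"""
'''

CONDITIONAL = r'''
if sys.version_info < (3, 8):
    __pyproject__ = """\
[run]
dependencies = ["typing-extensions"]
"""
'''


def find_metadata(source):
    """Return the embedded text if the pattern matches, else None."""
    match = DRAFT_STYLE_RE.search(source)
    return match.group(1) if match else None


top_level_result = find_metadata(TOP_LEVEL)
conditional_result = find_metadata(CONDITIONAL)
```

Under this assumed pattern the top-level assignment is found and the conditional one is not, with no need to parse or execute the script.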