PEP 723: Embedding pyproject.toml in single-file scripts (final iteration)

The idea behind this PEP is very similar to a small tool I created: instld.

My tool allows you to run scripts with the instld command, like this:

instld script.py

If there are third-party library imports inside the script, the libraries are installed automatically. The package name is taken from the name of the imported module. For example, this import will lead to the installation of the “polog” package:

import polog

There are situations when the name of the imported module and the package name are different. In this case, the package name can be specified using a special comment syntax. Here is an example:

import f # instld: version 0.0.3, package fazy

Please note that my implementation does not require the user to have any additional knowledge about venv, pip or PATH. All packages for a specific launch are installed in a temporary directory, which is automatically destroyed after the script finishes running. By default, the user does not need to think about anything at all; the technology just works.
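
A rough sketch of the temporary-directory approach described above, for readers who want to see the moving parts. This is an assumption about how such a runner could work, not instld's implementation: it uses pip's `--target` option and `PYTHONPATH`, and both function names are hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def pip_install_command(packages, target):
    """Build the pip command that installs packages into one directory."""
    return [sys.executable, "-m", "pip", "install", "--target", target, *packages]


def run_with_temporary_packages(script, packages):
    """Install packages into a throwaway directory, run the script, then clean up."""
    with tempfile.TemporaryDirectory() as target:
        if packages:
            subprocess.run(pip_install_command(packages, target), check=True)
        env = dict(os.environ)
        # Put the temporary directory first so the freshly installed packages win.
        env["PYTHONPATH"] = os.pathsep.join(
            filter(None, [target, env.get("PYTHONPATH", "")])
        )
        return subprocess.run([sys.executable, script], env=env).returncode
    # Leaving the with-block deletes the directory and everything installed in it.
```

Something like `run_with_temporary_packages("script.py", ["polog"])` would install, run, and leave nothing behind.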

My opinion about this initiative: this problem is completely solved with the help of my library and does not require improvements directly in CPython.

I think you’re misunderstanding. Neither PEP 722 nor PEP 723 changes CPython; they standardize the information that tools like yours can read.

The mechanism of which you speak has its own section as a rejected idea courtesy of @jeanas: Why not infer the requirements from import statements?

@pomponchik, for info: instld seems very similar to @facundo’s fades.

Oh, yes, I misread the purpose of this PEP, sorry.

I have posted the summary of the user studies in the PEP 722 and PEP 723 User Study Discussion thread. I am happy to provide clarity on anything covered there and can answer any questions that come up as a result of the study!

Unsure which of the 3+ threads this is most appropriate to share on but figured I’d give an update on my parallel effort to these for Rust.

After discussions with educators/trainers, the Cargo team (of which I’m a member), and the language team, we have an RFC for what syntax to use to embed manifests. This is still a draft but the pros/cons and the general approach I think are fairly close to what we’ll be doing and could provide some insight here.

For some context, decisions in the Rust project are made by domain-specific teams. We have a separate RFC that the Cargo team will be making a decision on. For the syntax, I had planned for that to be a language team decision from the beginning, even if no language changes were needed (e.g. using comments) so they could own the decision to not extend the language.

That’s certainly intriguing. So in Rust the decision was (or potentially will be, as this is a draft) to actually change the language syntax so that no comment or docstring or similar pre-existing “carrier” is needed. Personally I wouldn’t be in favor of doing that in Python, but it’s an interesting choice.

It’s possible that TOML is the right format and a docstring is the right enclosure. That has always been my preference but I felt it would basically be rejected based on offline discussions.

No leading characters is a massive boon to UX, as seen in the final Rust proposal.

Pity. Now we have yet another bespoke format (commented block + # /// markers) that doesn’t seem to have any well-known precedent in the Python ecosystem, unless I’m missing something. Commented blocks are generally annoying to edit.

I’ve also read PEP 723 (Inline script metadata, on peps.python.org) and I must say the rationale evades me a bit. How often is character escaping needed inside a pyproject file? Why is it difficult to embed a double-quoted string inside a single-quoted string, or the reverse?

You’ll find a different set of arguments on this front articulated in this section of PEP 722:

(A while back the authors agreed that this isn’t really the “parent PEP”, so I guess it’s a “sibling”?)

I find that write-up more compelling. The main thing is that I am readily convinced that users would try to build the string at runtime. In the 300+ comment thread for 722 (i.e. it’s unreasonable to expect anyone to read all of it), others were either skeptical that this would happen in practice, or of the opinion that we shouldn’t care since it violates the spec.

One could argue that users are less likely to try funny business at runtime with the TOML format than the 722 format. I’m… not sure. I’ll think about that more.

As for the part of 722 which describes the requirement for a parser, whether or not that’s convincing depends on whether or not you believe that the PEP must define an unambiguous parse even in the presence of very strange things. I’ll offer this example as the sort of thing I think we should worry about:

import sys
if sys.version_info < (3, 8):
    __pyproject__ = """\
[run]
dependencies = ["typing-extensions"]
"""

This “obviously” shouldn’t be accepted. Consider also the variant with inline if-else.


I think it would be a nice improvement to 723 to include some version of these considerations.

Ok, that makes much more sense then. Thank you!

The versions of this PEP that proposed docstrings/multi-line strings used the regular expression as the canonical specification and therefore that would not have been allowed.

You’re right; I had forgotten. The regex-based version of the spec was a solution to the “needs a parser” requirement – and I believe it is sufficient.

The remaining concerns are mostly about how users would misuse the spec, and potentially get surprising results. I think we can agree on that characterization of the issues solved by using comments rather than strings?

FYI I plan on making a decision between PEPs 722 and 723 the week of October 16. If you want a cross-PEP thread for discussion, see PEP 722 and PEP 723 User Study Discussion.

@brettcannon Am I correct in thinking that the decision is fundamentally TOML versus not and the enclosure may be changed before acceptance?

Thank you for sharing this Ed. It’s cool to see Rust also standardizing embedding their native TOML format so that there is only one format developers need to learn.

````rust
#!/usr/bin/env cargo
```cargo
[dependencies]
clap = { version = "4.2", features = ["derive"] }
```

use clap::Parser;

#[derive(Parser, Debug)]
#[clap(version)]
struct Args {
    #[clap(short, long, help = "Path to config")]
    config: Option<std::path::PathBuf>,
}

fn main() {
    let args = Args::parse();
    println!("{:?}", args);
}
````

This Rust syntax looks a lot like my proposal for a generic “metadata block” in Python, which I would have personally preferred for both PEPs 722 and 723: PEP 723: Embedding pyproject.toml in single-file scripts - #139 by gwerbin

Rust rules in terms of standardization. IMHO Python should go the same way, so I much prefer the syntax used in PEP 723 over PEP 722.

The one thing that bothers me in PEP 723 is the references to pyproject and pyproject.toml. They seem to assume that at some point the [run] table will apply to pyproject.toml too.

As I understand it, it is not established that this will happen, so the references are confusing.

Would it be sufficient to mention in this PEP that run.dependencies and run.requires-python have the same meaning as in the project table of pyproject.toml so as to facilitate the conversion from one to the other?
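
To illustrate what such a conversion could look like if the two keys really do share the meaning of their `[project]` counterparts, here is a minimal sketch. The function name is hypothetical, and `name` and `version` are placeholders the caller must supply, since a script's embedded metadata as discussed here carries neither.

```python
def run_table_to_project(metadata, name, version="0.0.0"):
    """Map a script's [run] table onto a pyproject.toml-style [project] table."""
    run = metadata.get("run", {})
    project = {"name": name, "version": version}
    # Only these two keys are assumed to carry over unchanged.
    for key in ("requires-python", "dependencies"):
        if key in run:
            project[key] = run[key]
    return {"project": project}


script_metadata = {
    "run": {"requires-python": ">=3.11", "dependencies": ["requests<3"]}
}
converted = run_table_to_project(script_metadata, name="my-script")
```

The result is a plain dictionary that a TOML writer could serialize into a pyproject.toml.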

Also /// pyproject sounds a bit awkward to me, for the same reason, but also because in my mind a script is not a project.

Sorry to be late to the game. Feel free to ignore if it is too late in the process.