Addendum for PEP 722 to use TOML

Here is the first draft: PEP 723: Embedding metadata in single-file scripts by ofek · Pull Request #3264 · python/peps · GitHub

Should we open a new discussion thread for feedback on this second proposal?

If not, please find my comments below.


Any Python script may assign a variable named __pyproject__ to a multi-line double-quoted string containing a valid TOML document

This regular expression may be used to parse the metadata:
(?ms)^__pyproject__ *= *"""$(.+?)^"""$

I think it is worth emphasizing the double-quoted part of this definition, or even explicitly saying that single quotes are forbidden[1].
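To illustrate why the quoting style matters, here is a small check of the draft’s regular expression against two otherwise identical assignments. The script contents and variable names are made up for illustration; only the regular expression comes from the draft.

```python
import re

# Regular expression quoted from the PEP draft.
PYPROJECT_RE = r'(?ms)^__pyproject__ *= *"""$(.+?)^"""$'

# Two hypothetical scripts that differ only in the quote style used.
double_quoted = '__pyproject__ = """\n[project]\ndependencies = ["requests"]\n"""\n'
single_quoted = "__pyproject__ = '''\n[project]\ndependencies = [\"requests\"]\n'''\n"

found_double = re.search(PYPROJECT_RE, double_quoted) is not None
found_single = re.search(PYPROJECT_RE, single_quoted) is not None
print(found_double, found_single)  # True False
```

Both assignments are valid Python and produce the same string at runtime, but only the double-quoted one is found by the regular expression.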

  • The name and version fields MUST be defined dynamically by tools if the user does not define them

I believe that name and version are not relevant in the scenario in which the user just wants to run a script (this is different from a packaging perspective). I don’t think it makes sense to force tools to define them automatically. Tools might simply have no use for them (I think defining them is OK as an implementation detail of a specific tool, but not all tools should have to comply with that).

[project] table

The [project] table is designed with the packaging use-case in mind[2], and therefore might expose information that is not relevant in the case of “running a single-file script”. Moreover, the type of information necessary to “run a single-file script” and the type of information necessary for packaging[2] might diverge (even more) with time. Right now we already see divergence in the PEP draft: the PEP makes a few considerations to “special case” the [project] table for the “running single-file script” use case (like adding exceptions for dynamic behaviour)[3].

If the idea is to re-use the pyproject.toml spec and the TOML language, it would be better to follow whatever comes up from the discussion on Projects that aren't meant to generate a wheel and `pyproject.toml`, instead of necessarily the [project] table. For example, if the outcome of the other discussion is to use a [run] table, then maybe this PEP should use a [run] table too.

Benefits

The PEP has a benefits section that kind of presents a comparison (maybe not too explicit) with the other PEP. Would it make sense to have a pros/cons section? The draft lists some pros, and I can see a few cons that would make sense to state explicitly in the PEP:

  • The use of the TOML language makes it more difficult to parse the definition block and may require third-party dependencies. While this is not a problem for Python >= 3.11 (thanks to tomllib), it might impose implementation difficulties for tools implemented in other versions of Python or in other languages.
  • Users may expect other tools to interpret embedded pyproject.toml for functionalities not related to “running a single-file script”. They might be surprised when the expectations are not met.
  • Since dunder attributes have (usually) special meaning in Python, users might expect that this is handled by Python itself and not a third party tool. They might be surprised when the expectations are not met.

(probably there are more cons listed in this thread)


  1. The provided reference implementation becomes incompatible if single quotes are allowed.

  2. Here I use the term “packaging” to express “the process of bundling files together to create a python distribution archive (sdist or wheel) file”.

  3. Also it does not specify what tools are expected to do if they find keys like optional-dependencies or entry-points.

Quoting the comment on the other thread for completeness:

Thanks. Adding some further comments that I made on the PEP draft, for extra visibility.

  1. (A minor point) this PEP refers to PEP 722 as the “parent PEP”. I don’t think that’s accurate, it’s a competing PEP and as such should stand alone. The term “parent” suggests that this PEP inherits context from PEP 722, and I think it’s quite the opposite, it contradicts a lot of what PEP 722 says[1]
  2. The PEP mentions in passing “a user who appears to already be using this PEP’s format for its intended purpose”. If there’s prior art for using this format, can we explicitly reference it and use it as an example? It would be much more compelling that way.
  3. As @abravalheri mentioned, and expanding on it, the PEP should be clear whether the regex in the “reference implementation” section is intended to be definitive. If it isn’t, then that needs to be acknowledged (and the text specification needs to be clearer). If it is, then the text specification is both too vague, and basically inaccurate.
  4. The “How to teach” section needs work. It refers to PEP 722, which is odd as the corresponding section there is explicitly about the PEP 722 format. It also ignores the (IMO, most common) case of users who write scripts, but have no prior knowledge of Python packaging, and likely have no need to know about it or interest in it.

Sorry for the duplication, but when I wrote the review comments, this thread didn’t exist.


  1. rebellious teenager? :slightly_smiling_face:

Before I address all the comments I would like to clarify this:

Did I misinterpret what Brett said and I should actually start the PEP from scratch?

Any Python script may assign a variable named __pyproject__ to a multi-line double-quoted string containing a valid TOML document.

This regular expression may be used to parse the metadata:
(?ms)^__pyproject__ *= *"""$(.+?)^"""$

The following is an example of how to read the metadata on Python 3.11 or higher.

import re, tomllib

def read(script: str) -> dict | None:
   match = re.search(r'(?ms)^__pyproject__ *= *"""$(.+?)^"""$', script)
   return tomllib.loads(match.group(1)) if match else None

Not sure if this is relevant or not for the discussion, but the suggested regex does not seem to cover all strings that are allowed by the current state of the PEP. For example if I run the following file with python3.11 myfile.py:

# myfile.py

__pyproject__ = """
[project]
readme.text = \"""
Hello, this is my awesome single-file utility script.
Try running::

    pipx run myfile.py
\"""
dependencies = ['tomli; python_version < "3.11"']
"""

import re, sys
if sys.version_info >= (3, 11):
    import tomllib
else:
    import tomli as tomllib


def read(script: str) -> dict | None:
    match = re.search(r'(?ms)^__pyproject__ *= *"""$(.+?)^"""$', script)
    return tomllib.loads(match.group(1)) if match else None


if __name__ == '__main__':
    with open(__file__, 'r', encoding="utf-8") as fp:
        script = fp.read()
    match = re.search(r'(?ms)^__pyproject__ *= *"""$(.+?)^"""$', script)
    print("---\nmatch.group:\n", match.group(1), "\n---\n")
    print("---\npyproject:\n", tomllib.loads(__pyproject__), "\n---\n")
    print(read(script))

I can get a parsing error:

$ python3.11 myfile.py
---
match.group:

[project]
readme.text = \"""
Hello, this is my awesome single-file utility script.
Try running::

    pipx run myfile.py
\"""
dependencies = ['tomli; python_version < "3.11"']

---

---
pyproject:
 {'project': {'readme': {'text': 'Hello, this is my awesome single-file utility script.\nTry running::\n\n    pipx run myfile.py\n'}, 'dependencies': ['tomli; python_version < "3.11"']}}
---

Traceback (most recent call last):
  File "/tmp/myapp/myfile.py", line 32, in <module>
    print(read(script))
          ^^^^^^^^^^^^
  File "/tmp/myapp/myfile.py", line 23, in read
    return tomllib.loads(match.group(1)) if match else None
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/tomllib/_parser.py", line 102, in loads
    pos = key_value_rule(src, pos, out, header, parse_float)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/tomllib/_parser.py", line 326, in key_value_rule
    pos, key, value = parse_key_value_pair(src, pos, parse_float)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/tomllib/_parser.py", line 369, in parse_key_value_pair
    pos, value = parse_value(src, pos, parse_float)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/tomllib/_parser.py", line 649, in parse_value
    raise suffixed_err(src, pos, "Invalid value")
tomllib.TOMLDecodeError: Invalid value (at line 3, column 15)

As far as I understood, myfile.py seems to be following the specification in the current state of the PEP draft.
The situation might be related to the fact that, to find the string literal assigned to the __pyproject__ variable, tools need to have some level of understanding of Python syntax (at minimum, how multi-line double-quoted strings can be written and escaped).

It might be worth modifying the PEP to restrict the allowed syntax and therefore rule out such edge cases. Or at least make it more explicit if the current PEP text already covers that scenario (it might be the case that the text already covers this edge case, and I am just doing a very bad job at interpreting it, but I guess other people might also have problems with that).

I can’t speak for Brett, but personally, I’m not particularly happy if PEP 723 essentially says “like PEP 722, but with these differences” while proposing something that PEP 722 has explicitly rejected (in a lot of detail).

You’re welcome to take sections of text from PEP 722 and copy them into PEP 723, and it’s fine to say “PEP 722 does the following, this PEP proposes a different approach because (insert explanation here)”, but I think it’s unfair to presume that anything in PEP 722 can be taken without qualification to be in support of the proposal in PEP 723.

I assume this goes back to my comment

This was very specifically talking about “using TOML in place of a custom format” - not about the choices of parsing the code as text (i.e., using a “structured comment” approach), or not supporting the full pyproject.toml format, or any of the other choices made in PEP 722. Maybe that wasn’t clear enough, my apologies if so.

Obviously, people can propose any PEP that they want. But as the author of PEP 722, I object to it being used in support of a PEP that so drastically rejects the choices behind my proposal. By all means write such a PEP - having a concrete specification makes discussion of the details of the proposal much easier (after all, that’s been an ongoing problem with the “why not use pyproject.toml?” position[1], that it’s not clear what is meant) but suggesting that your proposal builds on mine, rather than replacing it, is frankly just misleading and incorrect.

Personally, I would much rather you started your PEP from scratch. It would save me a lot of time and effort pointing out things I’ve already stated, repeatedly, when arguing for the choices I made in PEP 722. I suspect it would also save a lot of your time, as you’d be spared having to go through all of the debates we’ve already had - and you’d still need to reflect the discussion in your PEP anyway, so you don’t save anything in the long run.

If I may offer one piece of advice, it’s that whenever we’ve had competing PEPs in the past, ones that have positioned themselves as “an alternative proposal to X” rather than as a complete proposal in their own right, have tended to fail - often over technical issues due to being underspecified or making unwarranted assumptions, rather than because the underlying ideas are bad.

But at the end of the day, do whatever you prefer - it will be up to Brett to decide.


  1. and one I spend a chunk of time addressing in PEP 722

Thanks for writing this PEP. It is certainly an interesting idea worth careful consideration, IMO potentially fully independently from PEP 722.

I think this PEP (723) would benefit from a more in-depth motivation and analysis of potential use cases. I suspect it could have much less overlap with PEP 722 than so far posited, and could see it being very appealing and useful, but mainly to people with a quite different background / in quite different scenarios than PEP 722. But without explicitly going into them it seems like the PEP misses a chance of being really convincing. To me it so far just seems plausible that there might be a use, but none is really presented.

The PEP mentions Go and Rust as inspiration, and it mentions dependencies, Python version and script version (which I don’t fully understand) as use cases. Could that be explicitly expanded, e.g. analyzing the Go and Rust ecosystems more? Are there reports, examples, statistics, experience of what does and doesn’t work or get used there?

I can vaguely imagine PEP 723 being useful for something, but maybe lack the background with Hatch or something else to see what it is. I would recommend writing an in-depth standalone motivation explicitly drawing from (Hatch?) background and experience that not everyone shares. I know many people (including me) would love a future where an all-in-one tool (like Hatch?) is bundled with Python as the default tool for everything, so if this PEP is a step into that future, I’d be very interested in it getting fair consideration, and a full explanation of the vision.

For the specific use case PEP 722 carefully explained in detail and that is very familiar and useful to my experience, PEP 722 seems vastly preferable.

[FYI I have not read your PEP yet as you haven’t said you’re ready for me to.]

It’s that last sentence from Paul that I was trying to convey. For instance, the motivation behind wanting to make single-file scripts work for execution doesn’t need to be new and unique in both PEPs. But you will have to explain why TOML, why all of pyproject.toml (or whatever subset you’re suggesting), why the format for the embedding of the TOML, etc. All of that differs from Paul’s approach, and so you will have to explain all of that on your own (and contrasting with PEP 722 never hurts). It’s more than just a diff on the technical details since you have some fundamental differences in what you’re suggesting compared to PEP 722.

Think of a PEP as recording the conversation around what you are proposing to happen. For the parts that PEPs 722 and 723 agree on, you can basically say, “we agree and this part of the conversation is recorded in PEP 722” (and thanks to Paul for writing first :slightly_smiling_face:). But anywhere you don’t agree, you need to say, “we disagree, here’s how, and here’s why I think my proposal is the better solution” because it goes beyond just syntax.

From the PEP722 thread:

I think at least part of the reason we’re also considering PEP 723, and why the conversation got to this point, is because a lot of people imagine a different use case. I think that use case looks more like “building multiple binaries from the same directory”, which in turn is much more heavily dependent on the discussion in Projects that aren’t meant to generate a wheel and pyproject.toml.

But the PEP 722 use case doesn’t involve “building” at all, in any meaningful sense. If using a single-file script ever involves needing to know anything more than a) what third-party libraries are needed and b) what Python version is needed, at that point it’s hard to fathom that a tool calling itself a “script-runner” can be helpful.

It’s all well and good to challenge the notion of what a “project” is, and the utility of forcing separate projects into separate folders with specific structures, etc. etc. But if there’s supposed to be some underlying goal of helping to “transition” projects from something that can stand alone in one file to something that requires any kind of “building”, then I think this approach, such as it is so far, fails.

I’m currently +1 on PEP 722 and -1 on this. This is my refutation of the “benefits” section of PEP 723:

Ecosystem confusion

I reserve my objection that configuring a single-file script so that a script runner knows what to do with it, fundamentally, is not packaging. The distribution step in these cases typically appears to look more like attaching a script to an email sent privately, or using a third-party file-sharing service that isn’t necessarily specific to code.
The original intent in PEP 722 was expressly not to facilitate or automate such distribution. The Rationale section starts out: “Because a key requirement is writing single-file scripts, and simple sharing by giving someone a copy of the script”. PEP 723 doesn’t include its own Rationale section, so I can only assume this is still in effect.

Extensibility

I strongly suspect that the two points mentioned here are already an exhaustive list as far as script-runners are concerned. In fact, even including a version number here is not helpful because, again, there is no intent to facilitate or automate distribution.
So, let’s consider those two points. I don’t think even they are significant. First off, the script version. If I write my script to depend on my coworker’s script, and I don’t have the right version, a script runner cannot in principle help me with this.
In this hypothetical, the workflow currently doesn’t involve any automation, isn’t designed for automation, and needs to be completely reworked before automation is possible within the existing ecosystem. My coworker and I would both at minimum have to learn how to use a package index, a build tool and an install tool; and the organization might insist that we could only do this with a private package index. Keep in mind, previously we didn’t have to know any of this stuff about the packaging ecosystem (at least, not as long as the script runner could talk to PyPI for us), which is more evidence that what we were doing was not packaging.
So that leaves us with only the required Python version. First off, if I am writing the scripts for myself, then I already know what version of Python is needed. If I am sharing scripts within an organization, there’s a good chance that the organization is updating everyone’s system’s Python version in lockstep, if it ever changes at all. I might even be forbidden from installing a separate version privately. But supposing there were a use case for this - if running my script depends on the available version of Python meeting some criterion, then that Python version is a dependency! It just isn’t one that prior PEPs decided should go in the same TOML section as third-party dependencies.

Broader applicability

I can’t imagine anyone would want TOML-specific syntax highlighting to show up within the magic embedded comment that explains the project dependencies. This section is supposed to be about what the dependencies are, not about the demands of TOML syntax. If anything, I would want syntax highlighting that is specific to the syntax of requirements specifiers.
As for package managers, they don’t “gain the ability to run scripts” magically. They still need to implement a run tool. To do so, they would need to regex the script, pass some data to the TOML parser, extract the keys that represent dependencies, and then install them. Using TOML avoids the need to parse a new format, but it comes at the cost of filtering through a data structure, the extensibility of which is still questionably motivated. The PEP 722 format is trivial to parse and doesn’t specify anything extra.
Similarly, version checkers don’t to my knowledge already parse pyproject.toml. Having dependency information directly in the file is advantageous for them because of IDE integration, but if the project does grow to have an actual pyproject.toml then… well, that’s a whole other kettle of fish.

Aside from all of that, I’m concerned about the temptation for tools to try to maintain the “embedded TOML”. At a minimum, that will make commit logs look messier.

Assuming that the role of pyproject.toml gets rethought - that we take seriously the idea of project files not intended to build a wheel, and especially consider the idea of using the TOML files as a config format that can have more than one in the same directory - then I absolutely do see value in being able to create such TOML based on something embedded in a single-file script.

But I don’t see, any more, benefit in the embedded text looking like TOML. I still think that the way to go is to have a single tool that creates a skeleton TOML file from the embedded file contents. I think it’s probably about as easy either way. I think that’s much more considerate of the people for whom PEP 722 is intended. And I’m still volunteering to write and maintain it.

Thanks for writing up this proposal!

I like it more than the PEP 722 syntax:

  • I find PEP 723 harder to mess up than PEP 722; e.g. I can easily imagine forgetting to capitalise “Dependencies” or forgetting to double the #
  • I like that it mirrors pyproject.toml and you can copy paste between the two
  • The use of __pyproject__ makes it easier to google (PEP 723 is already the first hit for me, while googling “python script dependencies” is not helpful)
  • Using a markup language makes it more obvious how to spell future extensions. For instance, I could imagine this feature being extended to allow “local” dependencies, adding entrypoints, tool specific options that dictate how the environment should be managed, other tool and file specific configuration (similar to how people love pyproject.toml for this), etc.

I did not find the “Why not TOML” part of PEP 722 convincing:

  • I find TOML very human readable and friendly to people who’ve used any Python! The syntax needed here is basically just assignment of list to variable.
  • I fully expect people to be able to write (or copy paste) this without even realising that it’s TOML.
  • “There are many ways of writing a simple list of strings in it, and it will not be clear to inexperienced users which form to use” ← I just don’t find this convincing. This argument would apply to list literals in the Python language too! The beginners I’ve helped aren’t usually particularly confused by list syntax.
  • I think the point about not really needing flexibility is correct. But a lot of the argument for TOML in my mind is for consistency; the double # and capitalisations and un-googleableness just feels so ad hoc.

(edit: I haven’t kept up with all the PEP 723 changes after I posted this comment)

Thanks for writing a proposal. IMHO it is lacking in motivation and explanation of benefits. Apart from a reference to other programming languages, it does not give reasons why embedding pyproject.toml in a string is better than a simple list of dependencies. I have read the discussions of both PEPs and have not been convinced of the benefits of the TOML format here, or of the full pyproject.toml format.

A list of PEP 508 requirements in a comment block with a special first line is straightforward and useful.
This is the crux of the thing to me. PEP 722 is not rejecting existing pyproject.toml specs and creating an inferior competitor, it is using the existing requirements spec to improve the running of simple scripts.

Notes about smaller points:

  • I expect people to copy-paste Script Dependencies from examples, so I don’t share the fear of people missing capital letters. (The more special the comment line can look, the better!)
  • Requirements are already a microformat (with special characters for comparison, environment markers, etc), adding more quotes and commas to make it TOML makes it harder to write for people.
  • I am not convinced by arguments of existing tooling (like pip-compile or dependabot) being able to process or write them, as these are not full-blown Python projects (as some TOML proponents seem to want) but, again, simple scripts with one new special thing for script runners (as PEP 722 defines clearly).

I updated the PEP so that it stands on its own as requested. I welcome all feedback! PEP 723 – Embedding pyproject metadata in single-file scripts | peps.python.org

edit: FYI I copied and lightly touched up the “why not” sections that Paul and I shared.

Would it be possible to add “using TOML” or “pyproject.toml” to the PEP title for clarity?

For the average person automating a task or the data scientist, they are already starting with zero context and are unlikely to be familiar with TOML nor requirements.txt. These users will very likely rely on snippets found online via a search engine … Searching for Python metadata formatting will lead them to the TOML-based format that already exists which they can reuse.

IME it is much more common to find a command like pip install xxx.
So if they use that command when writing the script, it is then most convenient to use pip freeze and copy the result to a comment block in the script.

or utilize AI in the form of a chat bot or direct code completion software. … The author tested GitHub Copilot with this PEP and it already supports auto-completion of fields and dependencies.

(Is there a reason AI would not be able to auto-complete simple comment block lines?)

The document MAY include the [tool] table and sub-tables as described in PEP 518.

So by omission other tables like [build-system] are not allowed? A bit confusing, because later:

Why not limit to specific metadata fields?

By limiting the metadata to a specific set of fields, for example just dependencies, we would prevent legitimate use cases both known and unknown.

So any table is allowed?

Are there any other tables besides [tool], [build-system] and [project]? Maybe it would be best to explicitly list them all and mention for each field what the expected behavior in the embedded case is. What are those known use cases? Leaving unknown use cases unspecified seems surprising. Wouldn’t tools likely quickly start to implement conflicting interpretations? Also it significantly raises the effort required for tools to implement this PEP if they have to support all tables and features, no?

As it is written now, this PEP feels quite weird to read. The scope defined at the beginning does not seem to match what is actually proposed in the rest of the document. I find there are a lot of gaps. It seems to start as being a drop-in replacement for PEP 722, but then it goes all over the place by addressing some random bits of the issues with embedding a whole pyproject.toml in a Python file.

Why is PEP 621 mentioned? Instead it should mention and link the actual specification only: Declaring project metadata

The section Should scripts be able to specify a package index? seems completely unnecessary. As far as I know [project] does not allow indexes. If it is about forbidding indexes in tables other than [project], the specification should make that clear.

In “Recommendations”, seemingly out of nowhere, comes a bit about name, version and requires-python without much context. It should at the very least mention that it is about fields in the [project] table. Why is this recommendation here anyway? This seems quite out of the scope initially defined for this PEP, in particular this bit:

name: script-<sha256 of script's path> e.g. script-3a5c6b... to provide interoperability with other tools that use the name to derive file system storage paths for things like virtual environments

Shouldn’t it be a separate PEP? Probably once these discussions have reached a conclusion:

Good idea! I’ll go with the latter.

This is exactly what they will not be doing because of their inexperience with Python packaging and dependency management.

I was trying to say that it wouldn’t take years of Internet training for this PEP to work, it does right now.

I’ll be more explicit, thanks!

That is interesting because from my point of view it starts off not being a replacement but rather a superset of features by mentioning how scripts would want to choose the version of Python to use and the use of the word “metadata” throughout rather than just “dependencies”.

Do you have any suggestions to reduce the perceived idiosyncrasy of the text?

True, I did link both but I should only reference the latter. Thanks!

This was copied from Paul and I assume would be a question that people have about this PEP since that topic comes up a lot and we wouldn’t want the discussion to digress. If you feel the PEP as a whole is too verbose then I can remove that if you want.

I will add that useful bit of context, thanks!

No, this is a recommendation that I think would never be standardized. I devoted an entire section to the first link you mention, and the second is a different topic.

The abstract reads: “This PEP specifies a format for defining metadata about single-file Python scripts, such as their runtime dependencies.” Why mention the runtime dependencies here? Is “runtime dependencies” really the one good representative example of what this proposal is about? It could be mentioned in the motivation section, as it is what triggered the writing of the proposal but the proposal’s scope is far greater than that so it feels strange to have it in the abstract. I do not know if runtime dependencies is the killer-feature of this proposal.

The proposal seems to focus on the [project] section but seems very light on details about the other sections. Then there is this bit that again seems to come out of nowhere without much context or justification for this seemingly arbitrary rule:

Non-script running tools MAY choose to read from their expected [tool] sub-table if the script is the only target of the tool’s functionality. In all other cases tools MUST NOT alter behavior based on the embedded metadata. For example, if a linter is invoked with the path to a directory, it MUST behave the same as if zero files had embedded metadata.

Why this choice that the embedded configuration must not supersede? Why not let the tools decide for themselves?

Why no mention of [build-system]? I would expect a mention, especially since a linked example does contain [build-system]: https://github.com/cjolowicz/scripts/blob/31c61e7dad8d17e0070b080abee68f4f505da211/python/plot_timeseries.py

Is it the expectation of the proposal that the [project] table MUST be present in __pyproject__ ?

Some (hopefully brief) comments. I’m going to refrain as much as I can from comments that are little more than reiterating the different choices made by PEPs 722 and 723, as I think we’ve all established where we stand on those questions :slightly_smiling_face:

The document MAY include the [tool] table and sub-tables as described in PEP 518.

So [build-system] is allowed? What does it mean? PEP 723 doesn’t say, and PEP 518 makes no sense in this context.

Non-script running tools MAY choose to read from their expected [tool] sub-table if the script is the only target of the tool’s functionality. In all other cases tools MUST NOT alter behavior based on the embedded metadata. For example, if a linter is invoked with the path to a directory, it MUST behave the same as if zero files had embedded metadata.

That’s going to be pretty confusing if a user specifies, for example, some ruff configuration, and then it doesn’t apply unless they run ruff on just that one file. The PEP doesn’t address this potential confusion at all, and IMO it’s a significant flaw. In particular, the “How to teach this” section should include a discussion on how to explain the context-sensitive handling of tool config.
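To make the context-sensitivity concrete, here is one possible reading of the quoted rule as a tool might implement it. The helper name may_use_embedded_config is hypothetical, and treating “the script is the only target” as “exactly one target, and it is a file” is an assumption on my part, not something the PEP text spells out.

```python
import tempfile
from pathlib import Path


def may_use_embedded_config(targets: list[Path]) -> bool:
    """Assumed reading: embedded [tool] config applies only to a lone script."""
    return len(targets) == 1 and targets[0].is_file()


# Build a throwaway directory holding one script to show both outcomes.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "script.py"
    script.write_text("print('hello')\n", encoding="utf-8")

    single_file = may_use_embedded_config([script])         # e.g. "tool script.py"
    whole_directory = may_use_embedded_config([Path(tmp)])  # e.g. "tool ."

print(single_file, whole_directory)  # True False
```

Under this reading the same script is handled with its embedded configuration in the first invocation and without it in the second, which is the behaviour a user would need to have explained.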

We argue that our choice provides easier edits for both humans and tools.

I don’t personally think that a flat assertion like this, with no real detail to back it up, is very helpful in a PEP.

This is the canonical regular expression that may be used to parse the metadata

This is still a bit coy about whether this is the definition of the format, or simply a consequence of the previous definition. We’ve had way too much experience of people reading things like this multiple ways for me to think this won’t be another case of that. Also, you don’t mention encoding, which PEP 722 explicitly noted. You should probably copy the relevant part of that PEP into yours.

For example, this script uses matplotlib and pandas to plot a timeseries. It is a good example of a script that you would see in the wild: self-contained and short.

authors = [{name = "GPT-4"}] makes me rather suspicious here :slightly_smiling_face:

I guess I’d question whether this comes under the heading of “but we want to discourage new tools”… Personally, as you can imagine, I don’t have a problem with that, but in the interests of a level playing field, I feel it’s worth mentioning.

On that page we can add a section that describes how to embed the metadata in scripts or we can have a separate page for that which links to the page describing project metadata.

IMO, “how to teach this” really means “how do we ensure people understand the difference between the two use cases and how they behave differently”. This is where I’d like to see the discussion of how to explain to people that, for example, most tools will ignore [build-system] in an embedded pyproject.toml.

For situations in which users do not define the name and version fields, the following defaults should be preferred by tools

I know this is not normative, but the “name” recommendation feels weird. For most “natural” uses, I’d expect the name to be the script filename. Generating a unique identifier for a script seems like a very different problem, and while it may be something that should be standardised, I don’t think here is the right place for it (and certainly not as a bit of non-binding advice).
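
As a rough illustration of the "natural" default I have in mind, assuming "script filename" means the file name without its extension (the helper name is hypothetical):

```python
from pathlib import Path


def default_script_name(script_path):
    """Hypothetical default: the script's file name without its extension."""
    return Path(script_path).stem
```

This makes no attempt at uniqueness: two scripts called `plot.py` in different directories get the same name, which is why I see unique identifiers as a separate problem.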

Why not use (possibly restricted) Python syntax?

Did you forget to remove this? It directly argues against the syntax you’re proposing.

Should scripts be able to specify a package index?

This isn’t really needed if you’re proposing pyproject.toml, as it’s already been dealt with in that context.

Overall, this is looking better than the previous version. Not surprisingly, I still strongly disagree with the idea, and I think the weakest parts of the PEP that remain are because you make an implicit assumption that because pyproject.toml is a good fit for its existing uses, you don’t need to justify its appropriateness for this use case. So the PEP is probably convincing to people who are already convinced, but does very little to sway anyone who isn’t.

The best section is the discussion of user types in “Why not use a comment block resembling requirements.txt?”. That is a very good survey of the potential users, although I disagree with the conclusions you draw from it - I don’t think it negates the arguments against “nearly but not quite Python” syntax, nor does it answer many of my objections to TOML as a format over “plain text lines”. If your proposal was “PEP 722, but using TOML in a comment block”, then this would be a good argument. But for “use pyproject.toml-but-not-quite in a Python-but-not-quite assignment statement”, I don’t think it makes enough of a difference.

2 Likes

Thank you very much for updating the PEP ofek and for taking into consideration the feedback we are proposing here.

I noticed the following in the updated text:

  • There is an ongoing discussion about how to use pyproject.toml for projects that are not intended to be built as wheels. Although the outcome of that will likely be that the project name and version become optional in certain circumstances, this PEP considers the discussion only tangentially related.

    I understand the context where the ideas in this paragraph come from, but I think it would be better if the text of the PEP does not discuss which outcome is likely, because that can be a bit speculative.

3 Likes

This alone might result in “headaches”, as mentioned in the pip docs. The required or supported Python version isn’t provided in any way, so there isn’t a way to know the version prior to failing to parse the .toml file.

A try block can be used to import the parsing lib or, alternatively, a third-party module, but that would require it to be installed and added to requirements.txt. I like the idea; I just think the execution might be problematic.
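
A minimal sketch of that try block, assuming the standard-library `tomllib` (available from Python 3.11) with the third-party `tomli` backport as the fallback. The `load_metadata` helper is hypothetical.

```python
try:
    import tomllib  # standard library from Python 3.11 onwards
except ModuleNotFoundError:
    try:
        import tomli as tomllib  # third-party backport; must be installed first
    except ModuleNotFoundError:
        tomllib = None  # no TOML parser available on this interpreter


def load_metadata(toml_text):
    """Parse embedded TOML text, failing clearly if no parser is available."""
    if tomllib is None:
        raise RuntimeError("no TOML parser: need Python 3.11+ or the tomli package")
    return tomllib.loads(toml_text)
```

Note the ordering problem this does not solve: a `requires-python` value only becomes readable after the TOML has been parsed, so on an older interpreter without `tomli` the failure happens before the version requirement can be checked.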