How to specify dependencies: PEP 508 strings or a table in TOML?

Yes, the reason I started Poetry was to build a single tool to manage Python projects. However, some of its features are only possible thanks to its own metadata format, which is not restricted by the limitations of the existing specifications and tools. One example that comes to mind is the ability to specify an installation source for a specific package via the source keyword:

[tool.poetry.dependencies]
requests = {"version" = "^2.24.0", source = "my-custom-index"}

[[tool.poetry.source]]
name = "my-custom-index"
url = "https://example.com/simple/"
secondary = true

So even with PEP 633 there would be a lot of work, and features would need to be removed or completely redesigned, for Poetry to support it. However, PEP 633 would ease this transition because Poetry is already able to programmatically manipulate the pyproject.toml file, which is harder with PEP 631 because dependencies are just big strings with no separation of concepts.


There was some discussion on this, but not enough to come to a quick conclusion, so it was deferred.

We could add a key to the table for each environment marker, and they are essentially ANDed together (and also ANDed with the free-form markers key, if also specified). See the PR (or the prefixed version)
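To make the ANDing idea concrete, here is a minimal sketch. It assumes hypothetical per-marker keys named after PEP 508 marker variables, plus the free-form markers key; the key list and the function name are illustrative only and do not come from either PEP.

```python
from typing import Dict

# Hypothetical: table keys named after PEP 508 marker variables.
MARKER_KEYS = ("python_version", "sys_platform", "platform_machine")


def combine_markers(table: Dict[str, str]) -> str:
    """AND together per-key markers and the optional free-form markers key."""
    clauses = []
    for key in MARKER_KEYS:
        if key in table:
            # Assumes each value is "<operator> <quoted value>", e.g. ">= '3.6'".
            clauses.append(f"{key} {table[key]}")
    if "markers" in table:
        # Parenthesise so an "or" inside the free-form part keeps its meaning.
        clauses.append(f"({table['markers']})")
    return " and ".join(clauses)


example = {
    "python_version": ">= '3.6'",
    "sys_platform": "== 'win32'",
    "markers": "os_name == 'nt' or implementation_name == 'cpython'",
}
print(combine_markers(example))
```

Note that this only ever produces a conjunction: an "or" between two keyed markers still has to go through the free-form string.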

No matter how dependencies are provided by the user, as long as the package uses PEP 566 core metadata, PEP 508 will be required for Requires-Dist fields.
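As a rough illustration of that point, the sketch below builds a minimal core metadata message in which every dependency ends up as a PEP 508 string in a Requires-Dist field. The function name, project name, and requirement strings are made up for the example; this is not any backend's actual implementation.

```python
from email.message import Message
from typing import List


def build_metadata(name: str, version: str, requirements: List[str]) -> Message:
    """Build a minimal core metadata message from PEP 508 strings."""
    msg = Message()
    msg["Metadata-Version"] = "2.1"
    msg["Name"] = name
    msg["Version"] = version
    for req in requirements:
        # Assigning the same header repeatedly appends rather than replaces.
        msg["Requires-Dist"] = req
    return msg


metadata = build_metadata(
    "example",  # hypothetical project name
    "1.0",
    ["requests>=2.24.0", 'tomlkit; python_version >= "3.6"'],
)
print(metadata.as_string())
```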

Would extending PEP 508 be possible? I was actually thinking of opening a thread anyway to ask about the viability of extending the grammar to support local paths/editable installations.

Just added: PEP 631 – Dependency specification in pyproject.toml based on PEP 508 | peps.python.org

I was not criticizing the decision on markers made in the PEP. I think it would probably be worse and much harder to understand if markers and individual keys were both allowed, but I don’t think it’s possible to fully express PEP 508’s markers as a list of key-value pairs.

The fact that this problem cannot be solved better than the way it is solved is evidence that the claims that the “exploded table” approach is better because it is using TOML syntax are not as compelling as they may seem. You will always need a DSL for a lot of this stuff and only a relatively small part of that DSL can be translated into key-value pairs. That makes the value proposition a lot lower when we already have a compact, standardized DSL in wide use and suitable for use in a wide variety of contexts.

Yes, I agree, but part of the benefits that were purported in the original thread for the exploded table format was that it would be easy to convert it to structured data and to validate it; in the Motivation Section of PEP 633:

Each of these benefits (save the “secondary validation”, which doesn’t seem amazingly specific to an exploded view — I suspect anything that can validate the markers = tag should be able to validate PEP 508’s rather simple grammar) only applies to the parts of PEP 633 where the DSL is encoded using TOML syntax. The version spec and markers (two of the more important parts of PEP 508) are no easier in TOML than in PEP 508.

One thing to note when comparing PEP 633 and PEP 631: the reference implementations (in the current version) do not do the same thing. PEP 633 assumes that you’ve validated the table already, then goes about converting it into a list of extras and a list of PEP 508 strings (which you are then expected to use to emit a METADATA file). PEP 631’s reference implementation consists almost entirely of validating the PEP 508 strings, then emitting a METADATA file. The equivalent of PEP 633’s implementation is something like this (no equivalents are necessary for the first two functions that PEP 633 provides, since the input format is already PEP 508):

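from typing import Any, Dict, List, Tuple


def convert_project_requirements_to_pep508(
    project: Dict[str, Any]
) -> Tuple[List[str], List[str]]:
    # Required dependencies are already PEP 508 strings, so copy them as-is.
    reqs = list(project.get("dependencies", []))

    optional_dependencies = project.get("optional-dependencies", {})
    extras = list(optional_dependencies.keys())
    for extra in extras:
        for dep in optional_dependencies[extra]:
            # Add the extra marker, joining with "and" if a marker is present.
            # Caveat: an existing marker containing "or" would need
            # parentheses first, which this sketch does not add.
            dep += (" and " if ";" in dep else "; ") + f'extra == "{extra}"'
            reqs.append(dep)

    return reqs, extras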

I personally can’t stand .ini files and we have already bought into TOML with pyproject.toml (blame me if you want), so this isn’t really an option unless you want to write a competing PEP to PEP 621.

Well, one PEP is practically “copy these strings into METADATA” and the other is a mechanical translation of a table format to PEP 508 strings. Everything else is a PEP 621 concern (and you could even argue the resulting translation will be as well).
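For a sense of what that mechanical translation might look like, here is a minimal sketch. It assumes a table shape loosely modelled on PEP 633 (a bare string as a version specifier, or a table with version, extras, markers, and url keys) and ignores the VCS keys and lists of tables entirely; the function name is hypothetical and this is not the PEP's reference implementation.

```python
from typing import Any, Dict, Union


def table_to_pep508(name: str, spec: Union[str, Dict[str, Any]]) -> str:
    """Translate one dependency entry into a PEP 508 string."""
    if isinstance(spec, str):
        # Assumption: a bare string is shorthand for a version specifier.
        return f"{name}{spec}"

    req = name
    extras = spec.get("extras", [])
    if extras:
        req += "[" + ",".join(extras) + "]"
    if "url" in spec:
        req += f" @ {spec['url']}"
    elif "version" in spec:
        req += spec["version"]
    if "markers" in spec:
        # PEP 508 needs whitespace before ";" when it follows a URL.
        sep = " ; " if "url" in spec else "; "
        req += sep + spec["markers"]
    return req


print(table_to_pep508("numpy", "~=1.18"))
print(
    table_to_pep508(
        "requests",
        {
            "version": ">=2.8.1",
            "extras": ["security", "tests"],
            "markers": "python_version < '2.7'",
        },
    )
)
```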

Well, “tools will need” to probably use some library like ‘packaging’ or something else which will do all of this for you regardless of which PEP wins. I will personally write the generate_metadata(toml_data) function if necessary for PEP 621.

I would strongly advise against that (at least for now). Specifying yet another file format is not in anyone’s best interest while trying to get this whole endeavour settled (and that includes saying, “split on newlines” because if people think it’s like requirements.txt then you have to worry about escaped newlines, etc.).

I will reject my own PEP if it looks like that’s going to be how people choose to view this endeavour. Tools can choose to model how they specify metadata on it (as Paul G. points out), but otherwise it won’t be a standard and thus shouldn’t be accepted as a PEP.


Be that as it may, I’d like to point out that we have an ini parser and writer in the standard library, but not one for TOML. This makes it painful for packaging tools, as they must now provision a third-party library for even trivial interaction with packaging.

As we make more use of TOML, I agree this becomes more of an issue. But in the specific case of PEP 621, the only tools that will be reading the new data are build backends, which already need to have a TOML parser to read pyproject.toml.

PEP 621 is explicitly noted as not being intended for people to introspect project metadata, so if you’re using it and not writing a backend, you’re probably doing something wrong¹.

What I am a bit more concerned about is that if a PEP 621 parser gets added to packaging, then packaging will gain a dependency on a TOML parser, and as a result, we’ll end up “blessing” one particular TOML library, as a consequence of the fact that you probably need packaging, and depending on two TOML libraries is silly. That’s a good argument for getting TOML parsing in the stdlib, IMO.

Edit: Re-reading this, I realise that there’s an irony here (which comes up quite a lot). People developing packaging tools often in my experience dislike adding extra dependencies. We make all sorts of justifications (chicken-and-egg arguments, isolation, whatever) but at the end of the day, I find that packaging tool developers pretty consistently hate having to deal with packaging their own tools. That says something fairly fundamental, and not very positive, about the state of packaging² in Python :frowning: But that’s a separate debate, so I won’t continue that thought any further here.

¹ I wish this wasn’t the case, and we’d agreed to let PEP 621 data be treated as canonical, but that’s not the reality. Also, I expect people will use PEP 621 data simply because sdist metadata isn’t yet standardised, but we shouldn’t optimise for a discouraged use case…
² Specifically, packaging applications.


Okay, now I see the benefit of a table more clearly!

Once TOML reaches 1.0 then this will start a discussion about adding a TOML parser to the stdlib. But do understand it will also very likely lead to a discussion about what the future of the stdlib is, so it will not happen quickly.

IOW this is off-topic. :wink:

I’m not expecting that. Much more likely is an API that takes in a dict-like object that a TOML parser produces and processes the data that way. There’s no need to do the I/O in ‘packaging’ when e.g. tomlkit exists for the specific needs of some people while a read-only TOML parser fits the needs of others.

But once again, off-topic for this discussion.

Today is the end of the month and this topic has been open for just shy of two weeks. If anyone has anything to say, now is the time to say it. Otherwise I believe it will be time to make a decision (probably by @pf_moore unless someone else wants to be PEP delegate for these two PEPs?).


A question: for packaging topics, does the Python Steering Council still make the decision (potentially by delegating it to @pf_moore), or would it make sense to make the decision in the form of a vote as laid out in https://www.python.org/dev/peps/pep-0609?

Packaging PEPs are still handled by the SC/PEP delegate. A goal of PEP 609 is to document that process, but there’s no (immediate) intention to change it.

In the case of these PEPs, I expect them to be formally submitted for a decision, at which point I’ll review and publish my decision. The SC has veto over that, should they wish to intervene, but they don’t normally do so.


What do I need to do?

Just say it’s ready for pronouncement - nothing fancy :slightly_smiling_face: I’m happy to take the above comment as confirmation that PEP 631 is ready, I just need @EpicWink or @abn to confirm PEP 633 is ready, as I want to review the two together.


There are a couple of minor inconsistencies in the examples and a typo or two that I’ve just spotted, but the PEP (and especially the specification) is ready.


Pronouncement on PEPs 631 and 633

After much deliberation, I’m happy to announce that I am approving PEP 631 - “Dependency specification in pyproject.toml based on PEP 508”.

As a consequence, PEP 633 - “Dependency specification in pyproject.toml using an exploded TOML table” is rejected.

Thanks to the authors of both PEPs for all the work they put in, and to the many participants in the various discussions. This has not been an easy decision to make, and I don’t imagine it will be to everyone’s liking, but I hope that now the question has been resolved, we will all be able to work towards finalising PEP 621 and moving forward from there.

Clarifications on the wording of PEP 631

PEP 631 includes some rather confusing wording, in that it says values must be an “array of strings or inline tables”, but then goes on to say that inline tables are for future expansion, and backends MUST error if they encounter inline tables “unless the specification is extended”. I consider this a pointless distinction, as making inline tables allowed can be done when (if!) the specification gets extended, so there’s no need to make an explicit provision now.

As a result, I interpret the spec as effectively saying that only an array of strings is allowed, and I would encourage @ofek (as PEP author) to clean up that part of the PEP to make that explicit. Doing so will not affect the decision on the PEP, which is based on my reading of the practical implication of the text.

(Note that the inline table provision is left over from a rejected option to have a link to a file of requirements, and was never intended to allow for any form of “exploded table” format).
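Under that reading, a backend's check on the dependencies value could be as small as the sketch below. The function name and error messages are illustrative assumptions rather than anything specified by the PEP; the input is the dict a TOML parser would produce for the project table.

```python
from typing import Any, List, Mapping


def validate_dependencies(project: Mapping[str, Any]) -> List[str]:
    """Return the dependencies, rejecting anything but an array of strings."""
    deps = project.get("dependencies", [])
    if not isinstance(deps, list):
        raise TypeError("'dependencies' must be an array of strings")
    for dep in deps:
        if not isinstance(dep, str):
            # Inline tables arrive as dicts from a TOML parser and are rejected.
            raise TypeError(
                f"dependency entries must be strings, got {type(dep).__name__}"
            )
    return deps


print(validate_dependencies({"dependencies": ["requests>=2.24.0"]}))
```

This checks only the shape of the value; validating each string against the PEP 508 grammar would be a separate step.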

Reasons for the decision

Surprisingly, I did not in the end find that “readability” was a significant factor in the decision. Both approaches looked “OK” for straightforward uses, and both got messy for more complex cases. The level of punctuation needed is down to the fact that PEP 621 (and pyproject.toml as a whole) uses TOML syntax, and that applies equally to both proposals.

Unfortunately, neither PEP had particularly compelling arguments motivating their proposal - PEP 631 was particularly disappointing here as it lacked any “Motivation” section at all. As a result, I relied heavily on the overall “tone” of the discussions, along with my own views, to infer motivating arguments.

In addition, PEP 633, as the more complex proposal, had some flaws which concerned me (for example the translation rule noted here is not covered in the PEP - maybe the intention was to disallow the git@github.com:aio-libs/aiohttp.git form, but that was never stated in the discussion or in the PEP). I’m sure they could all be addressed, but the point of submitting a PEP for approval is that in the authors’ view, it has covered all of the details, so I did consider that a weakness of PEP 633.

The key things that ultimately affected the decision were consistency and compatibility.

In terms of consistency, as an existing standard, PEP 508 strings are used in a lot of places throughout the Python packaging ecosystem - package metadata as stored in METADATA files and reported by importlib.metadata, requirements files, installer command line arguments, tox configuration, etc. PEP 633 doesn’t propose any change for those use cases (nor should it, those areas are out of scope), but as a consequence, under that proposal users would need to deal with both formats. That’s not a showstopper, but I believe that it’s easier for users if we stick to a single format.

As far as compatibility is concerned, setuptools and flit already use PEP 508 strings for dependency specification. For users of those tools, moving to PEP 631 simply involves copying the existing value to a new location, whereas PEP 633 involves learning a new format and translating. Again, that’s not a major issue, but it is a hurdle for users, and in a world where Python packaging gets negative feedback for the level of change that’s going on, I’d prefer to avoid it if possible. Poetry users already use a TOML format, but it’s not the same as PEP 633, so they’d have to change either way.

Overall, “status quo wins”, combined with PEP 631 being a significantly simpler proposal that fits more closely with existing tools and has fewer rough edges, was the deciding factor.

What’s next for PEP 621?

I’d like to see PEP 621 updated to include the proposed form from PEP 631 directly, rather than including PEP 631 “by reference”. I don’t think there is sufficient complexity to warrant keeping the dependency syntax in a separate document.

With the question of dependencies settled, I think the biggest remaining task is that PEP 621 needs to strengthen its statement of what benefits it provides. If all it offers is a common user interface for backends to use, then I think that making it a mandatory standard is probably too strong. What I’d like to see (and what, in my view, would put PEP 621 firmly into the area of being an “interoperability standard”) would be if we could allow consumers to treat metadata read from pyproject.toml as canonical. That would extend the benefits significantly beyond “people who want to be able to change backend without rewriting the metadata definition”. It would also remove a significant awkwardness in the current PEP, where the data is present, but we’re telling people not to use it unless they are writing a backend…

With a sufficiently strong set of benefits, I think PEP 621 will then be pretty much ready for submission (although ultimately that’s for @brettcannon to decide).


@ofek this is essentially what I had in mind to clarify PEP 631. Feel free to use it if you want to.
