How to specify dependencies: PEP 508 strings or a table in TOML?

Today is the end of the month and this topic has been open for just shy of two weeks. If anyone has anything to say, now is the time to say it. Otherwise I believe it will be time to make a decision (probably by @pf_moore unless someone else wants to be PEP delegate for these two PEPs?).

A question: for packaging topics, does the Python Steering Council still make the decision (by potentially delegating it to @pf_moore), or would it make sense to make the decision in the form of a vote as laid out in https://www.python.org/dev/peps/pep-0609?

Packaging PEPs are still handled by the SC/PEP delegate. A goal of PEP 609 is to document that process, but there’s no (immediate) intention to change it.

In the case of these PEPs, I expect them to be formally submitted for a decision, at which point I’ll review and publish my decision. The SC has veto over that, should they wish to intervene, but they don’t normally do so.

What do I need to do?

Just say it’s ready for pronouncement - nothing fancy :slightly_smiling_face: I’m happy to take the above comment as confirmation that PEP 631 is ready, I just need @EpicWink or @abn to confirm PEP 633 is ready, as I want to review the two together.

There are a couple of minor inconsistencies in the examples and a typo or two that I’ve just spotted, but the PEP (and especially the specification) is ready.

Pronouncement on PEPs 631 and 633

After much deliberation, I’m happy to announce that I am approving PEP 631 - “Dependency specification in pyproject.toml based on PEP 508”.

As a consequence, PEP 633 - “Dependency specification in pyproject.toml using an exploded TOML table” is rejected.

Thanks to the authors of both PEPs for all the work they put in, and to the many participants in the various discussions. This has not been an easy decision to make, and I don’t imagine it will be to everyone’s liking, but I hope that now the question has been resolved, we will all be able to work towards finalising PEP 621 and moving forward from there.

Clarifications on the wording of PEP 631

PEP 631 includes some rather confusing wording, in that it says values must be an “array of strings or inline tables”, but then goes on to say that inline tables are for future expansion, and backends MUST error if they encounter inline tables “unless the specification is extended”. I consider this a pointless distinction, as making inline tables allowed can be done when (if!) the specification gets extended, so there’s no need to make an explicit provision now.

As a result, I interpret the spec as effectively saying that only an array of strings is allowed, and I would encourage @ofek (as PEP author) to clean up that part of the PEP to make that explicit. Doing so will not affect the decision on the PEP, which is based on my reading of the practical implication of the text.

(Note that the inline table provision is left over from a rejected option to have a link to a file of requirements, and was never intended to allow for any form of “exploded table” format).
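
As a rough illustration of that reading, a backend-side check might look like the sketch below. It works on an already-parsed [project] table (a plain dict) and only checks types, not PEP 508 syntax. The function name `validate_dependencies` and the error wording are made up for illustration and are not taken from PEP 631 or any real backend.

```python
def validate_dependencies(project):
    """Return project['dependencies'], rejecting anything but an array of strings.

    `project` is assumed to be the already-parsed [project] table.
    """
    deps = project.get("dependencies", [])
    if not isinstance(deps, list):
        raise TypeError("'dependencies' must be an array of PEP 508 strings")
    for index, item in enumerate(deps):
        if not isinstance(item, str):
            # Inline tables parse as dicts; under this reading they are an error.
            raise TypeError(
                f"dependencies[{index}] must be a string, got {type(item).__name__}"
            )
    return deps


print(validate_dependencies({"dependencies": ["distro >= 1.5.0, < 2"]}))
```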

Reasons for the decision

Surprisingly, I did not in the end find that “readability” was a significant factor in the decision. Both approaches looked “OK” for straightforward uses, and both got messy for more complex cases. The level of punctuation needed is down to the fact that PEP 621 (and pyproject.toml as a whole) uses TOML syntax, and that applies equally to both proposals.

Unfortunately, neither PEP had particularly compelling arguments motivating their proposal - PEP 631 was particularly disappointing here as it lacked any “Motivation” section at all. As a result, I relied heavily on the overall “tone” of the discussions, along with my own views, to infer motivating arguments.

In addition, PEP 633, as the more complex proposal, had some flaws which concerned me (for example the translation rule noted here is not covered in the PEP - maybe the intention was to disallow the git@github.com:aio-libs/aiohttp.git form, but that was never stated in the discussion or in the PEP). I’m sure they could all be addressed, but the point of submitting a PEP for approval is that in the authors’ view, it has covered all of the details, so I did consider that a weakness of PEP 633.

The key things that ultimately affected the decision were consistency and compatibility.

In terms of consistency, as an existing standard, PEP 508 strings are used in a lot of places throughout the Python packaging ecosystem - package metadata as stored in METADATA files and reported by importlib.metadata, requirements files, installer command line arguments, tox configuration, etc. PEP 633 doesn’t propose any change for those use cases (nor should it, those areas are out of scope), but as a consequence, under that proposal users would need to deal with both formats. That’s not a showstopper, but I believe that it’s easier for users if we stick to a single format.

As far as compatibility is concerned, setuptools and flit already use PEP 508 strings for dependency specification. For users of those tools, moving to PEP 631 simply involves copying the existing value to a new location, whereas PEP 633 involves learning a new format and translating. Again, that’s not a major issue, but it is a hurdle for users, and in a world where Python packaging gets negative feedback for the level of change that’s going on, I’d prefer to avoid it if possible. Poetry users already use a TOML format, but it’s not the same as PEP 633, so they’d have to change either way.
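
To show what "copying the existing value to a new location" could look like for a setuptools project, here is a small sketch that reads `install_requires` from a setup.cfg and prints it as a `dependencies` array. The sample setup.cfg contents and the helper name `install_requires_to_toml` are invented for the example, and using `json.dumps` for quoting is a shortcut that happens to produce valid TOML strings for simple requirement strings like these.

```python
import configparser
import json

# Hypothetical setup.cfg contents, used only as sample input.
SETUP_CFG = """\
[options]
install_requires =
    distro >= 1.5.0, < 2
    PySocks >= 1.5.6, != 1.5.7, < 2
"""


def install_requires_to_toml(setup_cfg_text):
    """Render setup.cfg's install_requires as a 'dependencies' array."""
    parser = configparser.ConfigParser()
    parser.read_string(setup_cfg_text)
    raw = parser.get("options", "install_requires", fallback="")
    # Each non-empty line is already a PEP 508 string; copy it unchanged.
    requirements = [line.strip() for line in raw.splitlines() if line.strip()]
    lines = ["dependencies = ["]
    # json.dumps is a quoting shortcut, fine for these plain ASCII strings.
    lines += [f"  {json.dumps(req)}," for req in requirements]
    lines.append("]")
    return "\n".join(lines)


print(install_requires_to_toml(SETUP_CFG))
```

The requirement strings themselves pass through untouched, which is the point being made about PEP 631.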

Overall, the deciding factor was “status quo wins”, combined with PEP 631 being a significantly simpler proposal that fits more closely with existing tools and has fewer rough edges.

What’s next for PEP 621?

I’d like to see PEP 621 updated to include the proposed form from PEP 631 directly, rather than including PEP 631 “by reference”. I don’t think there is sufficient complexity to warrant keeping the dependency syntax in a separate document.

With the question of dependencies settled, I think the biggest remaining task is that PEP 621 needs to strengthen its statement of what benefits it provides. If all it offers is a common user interface for backends to use, then I think that making it a mandatory standard is probably too strong. What I’d like to see (and what, in my view, would put PEP 621 firmly into the area of being an “interoperability standard”) would be if we could allow consumers to treat metadata read from pyproject.toml as canonical. That would extend the benefits significantly beyond “people who want to be able to change backend without rewriting the metadata definition”. It would also remove a significant awkwardness in the current PEP, where the data is present, but we’re telling people not to use it unless they are writing a backend…

With a sufficiently strong set of benefits, I think PEP 621 will then be pretty much ready for submission (although ultimately that’s for @brettcannon to decide).

@ofek this is essentially what I had in mind to clarify PEP 631. Feel free to use it if you want to.

Agreed. I’ll plan on doing that.

I will have a chat with @pf_moore about what this might entail to get PEP 621 to a point of potential acceptance and update the PEP accordingly to start a discussion around whatever changes that incurs.

Since I’m the one who suggested this change, I would like to argue for keeping it in or clarifying it differently. The reason for “this may be an array of strings” is that we were attempting to allow for a relatively smooth process of extending the spec from projects other than backends. I specifically asked for clarification on the point of whether the spec is aimed at non-backend tools as well and you said it was.

For backends, I agree with your reasoning that we can deal with extensions to the spec as they come up, because anyone using new features would also choose a backend that supports the new features. For tools that consume other peoples’ projects, though, the situation is different, because an “analyze everything on github” tool may start breaking if we add an “extension” to the spec of this nature (since it is rejecting anything that’s not an array of strings).

My suggestion is that backends must throw an error (after all, they don’t know what to do with the information), but non-backend tools should treat the field as dynamic (thus leaving it to the backend to decide what is and is not in spec).
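
A minimal sketch of that split, assuming the [project] table has already been parsed into a dict. The `DYNAMIC` sentinel, the `read_dependencies` name and the `is_backend` flag are invented for illustration; nothing like them is specified in PEP 621 or PEP 631.

```python
DYNAMIC = object()  # hypothetical sentinel meaning "ask the backend"


def read_dependencies(project, *, is_backend):
    """Read 'dependencies', handling out-of-spec values per the suggestion above."""
    deps = project.get("dependencies", [])
    in_spec = isinstance(deps, list) and all(isinstance(d, str) for d in deps)
    if in_spec:
        return deps
    if is_backend:
        # A backend has no idea what to do with an unknown shape, so it fails.
        raise ValueError("'dependencies' must be an array of strings")
    # A non-backend tool treats the field as dynamic and leaves the verdict
    # to whichever backend the project uses.
    return DYNAMIC


future_style = {"dependencies": [{"file": "requirements.txt"}]}
print(read_dependencies(future_style, is_backend=False) is DYNAMIC)
```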

In this case, I thought it was worth calling this out because “we might allow for an inline table instead of a string here” is a pretty versatile way to painlessly extend the spec, and pretty much the only one I envision us using (if we ever extend the spec by changing the type of a field rather than by adding new fields).

Obviously we can bump this from PEP 631 if we want to make it a more general principle in PEP 621 that out-of-spec fields are treated as dynamic and delegated to the backend for enforcing spec compliance.

Isn’t the same true if we extend (for example) description to allow {file = "summary.txt"} as well? I get your point, but I don’t think it’s exclusive to dependencies (although as usual, dependencies might be the most likely case where it’s needed…)

This could be something to revisit for PEP 621 as a whole - it’s essentially a version of “be lenient in what you consume, and strict in what you produce”. You could suggest some wording for PEP 621 to the effect that consumers should be prepared to handle future extensions to the spec gracefully.

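Yes, this is what I meant by making it a more general principle in PEP 621 that out-of-spec fields are treated as dynamic and delegated to the backend.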

I’m happy to have this discussion in a PEP 621 thread instead. If we go about this, we need to decide whether we’ll have a blanket recommendation where if you see something you don’t understand, kick it to the backend or if we want to carve out specific areas where we think we’re likely to need extensions. Obviously a blanket rule is simpler and more likely to be implemented more universally, but the downside is that if the general rule is, “If you see something out-of-spec, call the field dynamic and throw it to the backend”, then it means that almost all error conditions would require constructing an isolated build environment and invoking the backend so that the backend can tell you if it’s an error or not. If we “lock in” the fields we don’t think we’ll ever change, then in many common cases non-backend scripts can error eagerly.
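
To make the trade-off concrete, here is a rough sketch of the "lock in some fields" variant from a non-backend tool's point of view. Which fields are locked (`name`, `version`) and which are left extensible (`dependencies`) is purely an assumption for the example, as are the names `LOCKED`, `EXTENSIBLE` and `classify`.

```python
def is_str(value):
    return isinstance(value, str)


def is_str_list(value):
    return isinstance(value, list) and all(isinstance(v, str) for v in value)


# Hypothetical split: fields we never expect to change vs. fields that might.
LOCKED = {"name": is_str, "version": is_str}
EXTENSIBLE = {"dependencies": is_str_list}


def classify(field, value):
    """Return 'ok', 'error' (fail eagerly) or 'defer' (ask the backend)."""
    if field in LOCKED:
        # Locked fields can be rejected without building anything.
        return "ok" if LOCKED[field](value) else "error"
    if field in EXTENSIBLE and EXTENSIBLE[field](value):
        return "ok"
    # Out-of-spec extensible fields and unknown fields need the backend,
    # which is the expensive path (isolated build environment and so on).
    return "defer"


print(classify("version", 3), classify("dependencies", [{"file": "r.txt"}]))
```

Under a blanket rule, the `LOCKED` table would simply be empty and every mismatch would take the "defer" path.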

Yep, that sucks. This is why I’m not convinced we should do this, it’s a big cost to pay for something that’s basically trying to predict the future.

Given that I’m still a fairly strong -1 on allowing dependencies to be referenced from an external file, and anything less controversial can just be added to PEP 621 now, I don’t personally have any example of somewhere where we need this flexibility.

I am also -1 on loading from files, but this kind of thing came up as an advantage of exploded TOML formats, that they can be easily expanded in a backwards-compatible way to cover things that might be appropriate in a pyproject.toml but wouldn’t necessarily make sense in PEP 508. The most reasonable one is “read this from a file” (which we’re effectively foreclosing forever at this point if we don’t include a provision for extensions). Another possibility would be some metadata about the dependency that is relevant to the backend, but either isn’t going to be translated into metadata or is going to be translated in a way that would be awkward to represent otherwise.

I could imagine a future where we want to flag a specific dependency as “recommended” rather than strict, or tag it as part of a non-extra grouping — for example, if we move to a world where it’s possible to specify different backends with their own groupings, you might have the default value be “flask + werkzeug” but if you choose the [django] backend it pulls in django + some Django plugins.

I don’t know that it’s amazingly likely that we’ll ever need this, but it addresses the “easily extensible” argument for exploded tables.

I also think there’s a reasonable case to be made for doing this only for dependencies, because:

  1. Dependencies are probably the most complicated piece of metadata, but also one of the most important. They are likely to be a place where people get up to a decent amount of chicanery, and keeping flexibility available means we can comfortably expand the spec if we find that people are chafing under the existing mechanisms and just using the dynamic escape hatch all the time.
  2. Dependencies are also the place where people are disproportionately likely to fall back to dynamic anyway, so I expect many parsers will need to build in the fallback mechanism anyway.

Another possible option is that we can add in an explicit pyproject-version field somewhere, to indicate which version of the spec you are using. To opt in to new backwards-incompatible features, you would have to set the version flag to the minimum version that supports all the features you need, and tools that see a version higher than they understand would be expected to switch to the “if I see something I don’t understand, ask the backend” behavior. That would probably be annoyingly manual, but it could be made relatively painless by having backends check to see if the pyproject.toml has the wrong minimum version and raising an exception that says what the correct minimum version is.
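
A sketch of how that opt-in might behave, under several assumptions that go beyond what was proposed: the field is an integer key called `pyproject-version` in the parsed table, it defaults to 1 when absent, and "version 2" is the hypothetical version that would allow inline tables in `dependencies`. All function names are illustrative.

```python
UNDERSTOOD_VERSION = 1  # highest spec version this hypothetical tool knows


def tool_should_defer(project):
    """Non-backend tool: defer to the backend if the file is newer than we are."""
    return project.get("pyproject-version", 1) > UNDERSTOOD_VERSION


def required_version(project):
    """Backend: work out the minimum version the features in use need.

    Assumption for this sketch: inline tables in 'dependencies' need
    version 2; plain strings only need version 1.
    """
    deps = project.get("dependencies", [])
    return 2 if any(isinstance(dep, dict) for dep in deps) else 1


def check_declared_version(project):
    """Backend: fail with the correct minimum version if the declared one is too low."""
    declared = project.get("pyproject-version", 1)
    needed = required_version(project)
    if declared < needed:
        raise ValueError(
            f"pyproject-version must be at least {needed} for the features used "
            f"(found {declared})"
        )


print(tool_should_defer({"pyproject-version": 2}))
```

This sketch only rejects a declared version that is too low; whether a needlessly high one should also be flagged is left open.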

Another use case would be for defining editable installs (unless that’s added to PEP 508 grammar).

I doubt we should really be supporting adding dependencies on editable installs of a package, that wouldn’t really make much sense (despite the fact that I’m aware that people do it anyway in their requirements.txt files).

That said, I could imagine a future where we add other modes of installation (e.g. “no-CLI” or one of the various flavors of “install a different version of a package based on options” that are perennial proposals) that would make sense to specify as a dependency.

That said, any sort of “install mode” that we introduce would probably need to be introduced into PEP 508 anyway, because we’d want to be able to specify that mode in requirements.txt and tox and all the other places that aren’t a METADATA file. The main reason for not doing such a thing for editable installs is that, IMO, it makes no sense to specify a dependency on an editable install (which is the same reason we wouldn’t want to add that capability to PEP 621).

Edit: I see the rejected PEP 633 does mention this in the examples.


I couldn’t find it mentioned in any of the PEPs, but are there plans to standardize the equivalent of Poetry’s:

[tool.poetry.dev-dependencies]
pytest = "^3.4"
...

Motivation

The reason I ask is that currently many projects use requirements.txt and something named similarly to dev-requirements.txt. The dev- one especially is not standardized, so tooling (IDEs, etc.) has trouble finding them automatically.

Using something like pip-tools, many of my projects have a requirements.in, dev-requirements.in, and locked requirements.txt and dev-requirements.txt. If pyproject.toml also standardized how development requirements are described, there could be a common convention that all IDEs and tools (pip, pip-tools, tox, Poetry, pipenv, etc.) could follow to describe all dependencies for a project.

It would be really great if the migration to a common pyproject.toml format could begin to remove the need for requirements.in/txt and friends. It could be up to the specific tools (pip-tools, Poetry, etc.) how lock files are made until there is consensus on that.

Example

[project]
dependencies = [
  'distro >= 1.5.0, < 2',
]
dev-dependencies = [
  'black',
  'isort',
  'pytest',
  'flake8',
]

[project.optional-dependencies]
socks = [ 'PySocks >= 1.5.6, != 1.5.7, < 2' ]

Prior Art

Node’s package.json works well at describing both dependencies and development dependencies in a single file.

{
  "dependencies": {
    "@material-ui/core": "^4.11.0"
  },
  "devDependencies": {
    "prettier": "2.0.5"
  }
}

Rust’s Cargo also has a similar feature.

[dev-dependencies]
pretty_assertions = "0.4.0"

That’s off-topic for this as that would require a separate PEP to update PEP 440 which this discussion isn’t involved in (see a matching issue in the Poetry issue tracker on this topic of standardization).

See How to specify dependencies: PEP 508 strings or a table in TOML? for the resolution. Future discussions can occur under new topics.