PEP 621: how to specify dependencies?

Can you detail what the goal of this topic is? I can see equal arguments for both sides, and it probably just boils down to a matter of taste. Will we put it to a vote, or is this just a platform for users to express their views? For what it’s worth, I support using the PEP 508 format; it’s cleaner to read IMHO.


If PEP 508 is good enough to express what needs to be expressed here (or can be made good enough with a reasonable amount of modification), then I think I’d prefer using this notation, so that there is some consistency (at least within PyPA tools, requirements.txt, on the command line, etc.).

But I get the advantage of having some of the validation delegated to the file format parser itself. I’m not entirely sure it’s worth it though; the TOML validation would still deliver lots of false positives anyway.

I don’t know enough about TOML, but would it be possible for a field to accept both notations: either a PEP 508 string or an exploded table (assuming both notations stay compatible with each other)?

As per https://github.com/uranusjr/packaging-metadata-comparisons/blob/master/topics/dependency-entries.md#disadvantages-1, I think not being able to select different versions based on markers is a huge downside.

At Datadog, this feature is critical in how we define the dependencies of integrations that are shipped with the Agent.

The marker-dependent version is possible with Poetry, which also uses a table format. It is a restriction of the Pipfile format, not TOML tables in general.

I purposefully didn’t give any examples as there is no exact agreement on what the proposal would be until we know what style people prefer. If you want preexisting ideas you can look at what Flit, Poetry, and Pipenv each do (I linked to the appropriate parts of their docs in my opening post).

If you really want a strawman for each:

requires = [
    # simple
    "colorama",
    # complex
    "win32[all] >2.2.0,<3.0.0; os_name == 'nt'",
]

versus

# simple
colorama = {}
# complex
win32 = { version = ">2.2.0, <3.0.0", extras = ["all"], markers = "os_name == 'nt'" }

Yes, although we could potentially be convinced otherwise to leave it out for some reason.

Remember that this PEP is to help standardize what we reasonably can that build tools need to build a wheel, so that makes sense. :slight_smile:

Depends on which format people prefer. PEP 508 has this baked in; the table approach typically still takes a string for the version specifier, which can be separated the same way as in PEP 508.

The key distinction is how you list all the parts of the dependency. So with PEP 508 you just do it as you do today in tools like setuptools and tox. In the table format you would probably have a version field, markers field, etc.

To have a conversation about how to specify dependencies, as the authors of the PEP couldn’t reach an agreement among themselves.

Quite possibly. :slight_smile:

Right now the idea is to see if there is any consensus from the community around one versus another approach. If there’s not then I don’t know how we will decide. Maybe we punt on it and leave it out of the PEP, maybe a vote, maybe I choose (people already blame me for PEP 518 anyway :wink:), maybe I flip a coin.

By definition it is.

We already rejected that idea:

That is specific to Pipfile, not inherent to a table solution. There is nothing saying a table format couldn’t allow:

django = [
    {version=">=2.0", markers="os_name!='nt'"},
    {version=">2.1", markers="os_name=='nt'"}
]

(That might require TOML 1.0.)

It’ll only need TOML 1.0 if you’re mixing it with strings. This specific form will work just fine on the existing TOML 0.5.0 or 0.4.0 parsers.


This is a human-edited file, so we should use the DSL (i.e. flit-style).

In an ideal world, we’d have a library to parse it into some suitable in-memory data structure, and that structure might even serialise easily as tables for exchanging in an externally verifiable syntactic form, but we should not ask people to write that by hand.

If we do this, packaging will gain this functionality – to be able to generate a Requirement object from this form (taking the dictionary loaded from the TOML file).
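To make that concrete, here is a minimal sketch of turning a table-style entry into a PEP 508 string, which could then be handed to something like packaging’s Requirement. The key names (version, extras, markers) follow the strawman earlier in the thread and are assumptions, not a settled schema; the function names are hypothetical too.

```python
def table_to_pep508(name, entry):
    """Build a PEP 508 string from one table-style dependency entry.

    The keys "version", "extras" and "markers" mirror the strawman in this
    thread; they are assumptions, not an agreed schema.
    """
    extras = entry.get("extras", [])
    version = entry.get("version", "")
    markers = entry.get("markers", "")

    req = name
    if extras:
        req += "[" + ",".join(extras) + "]"
    if version:
        # Drop spaces so ">2.2.0, <3.0.0" becomes ">2.2.0,<3.0.0".
        req += " " + version.replace(" ", "")
    if markers:
        req += "; " + markers
    return req


def expand(name, value):
    """Accept either a single table or a list of tables for one package."""
    entries = value if isinstance(value, list) else [value]
    return [table_to_pep508(name, entry) for entry in entries]


# Dictionary as it might come out of a TOML parser (hypothetical layout).
requires = {
    "colorama": {},
    "win32": {
        "version": ">2.2.0, <3.0.0",
        "extras": ["all"],
        "markers": "os_name == 'nt'",
    },
    "django": [
        {"version": ">=2.0", "markers": "os_name!='nt'"},
        {"version": ">2.1", "markers": "os_name=='nt'"},
    ],
}

pep508_strings = [
    s for name, value in requires.items() for s in expand(name, value)
]
```

The win32 entry comes out as the same string used in the string-style strawman, which is the point: both styles carry the same information.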


The PEP 508 syntax is already used elsewhere (e.g. pip install win32[all]). If a table format is used, people will still probably need to know the PEP 508 syntax, at least to some degree.

Maybe that’s a tradeoff worth making, I don’t know :upside_down_face: I wasn’t a fan of the strawman table-style example:

win32 = { version = ">2.2.0, <3.0.0", extras = ["all"], markers = "os_name == 'nt'" }

but it suddenly looks a lot nicer (IMO) if you expand it out:

[requires.win32]
version = ">2.2.0, <3.0.0"
extras = ["all"]
markers = "os_name == 'nt'"

If you choose string-style, you’ll have to build on top of PEP 508 to expand functionality, or in the future allow tables for requirement specification. If you choose table-style, then expanding functionality is trivial.

Is the idea of supporting either format rejected indefinitely, or would you be open to it in the future?

Whatever is specified in pyproject.toml needs to eventually be serialised into Core Metadata, so PEP 508 still needs to be expanded even if we build on top of the table style. The difference between the two formats is strictly only readability/writability (and potentially ease of parsing and validation).
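As a rough illustration of that serialisation step: whichever input style is chosen, each dependency ends up as one Requires-Dist field holding a PEP 508 string. This sketch uses the standard library’s email.message.Message as a stand-in for a metadata writer; the core_metadata helper and the example project name are hypothetical.

```python
from email.message import Message


def core_metadata(name, version, requirements):
    """Build a minimal Core Metadata message with one Requires-Dist per dependency."""
    msg = Message()
    msg["Metadata-Version"] = "2.1"
    msg["Name"] = name
    msg["Version"] = version
    for req in requirements:
        # Assigning the same header repeatedly appends rather than overwrites.
        msg["Requires-Dist"] = req
    return msg


metadata = core_metadata(
    "example-project",  # hypothetical project name
    "0.1",
    ["colorama", "win32[all] >2.2.0,<3.0.0; os_name == 'nt'"],
)
```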

Dependency specification is already complex, so another advantage of forcing new complexity to fit into a DSL is to drive simplicity.

Or alternatively, it will encourage packages to find better ways to offer a compatible interface rather than making the consumer do all the work.


This is a human-edited file, so absolutely we should use a style optimised for people to write and edit it.

However, that doesn’t necessarily mean expanded style. The expanded style is (maybe) easier for non-experts, but the PEP 508 style is concise, which suits some people’s preference. (Yes, I’m someone who prefers a concise style, and I’m an expert, so I’m in favour of PEP 508 on a personal level).

Ultimately, we have a range of users so we may need to support both styles. If we choose just one style, it will be a matter of making a trade-off - inconveniencing one group in order to support another.

Also, PEP 508 style is just as much a DSL as table-style. It’s optimised for a different use case, is all. The question is whether this use case (human edited requirements definitions) better matches one or the other of the available DSLs…

To ensure that PEP 621 doesn’t inadvertently allow the definition of dependency declarations that can’t actually be published as part of the resulting artifact metadata, I think it makes the most sense for it to specifically use PEP 508 markers.

However, I’d suggest a hybrid of the examples Brett gave, where a table is still used to separate the dependencies on different packages, but the values within the table are just PEP 508 strings rather than subtables:

# simple
colorama = "*"
# environment dependent
django = [
    ">=2.0; os_name!='nt'",
    ">=2.1; os_name=='nt'" # Affected by Windows-specific bug in 2.0
]
# with extras defined
win32 = "[all] >2.2.0, <3.0.0; os_name == 'nt'"

The normal case would just be a single string, but a list of strings would also be allowed to cover the “multiple mutually exclusive and/or mutually compatible environment markers” case.

It isn’t as self-documenting as the table version if you don’t already know the PEP 508 syntax, but it’s much easier to translate into explicit pip install commands.
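A minimal sketch of that translation, assuming the hybrid layout above (each value is either one string or a list of strings, and "*" means “any version”). The function name is hypothetical and the input dictionary stands in for whatever a TOML parser would return.

```python
def hybrid_to_pep508(dependencies):
    """Flatten a {name: str | list[str]} table into PEP 508 strings."""
    result = []
    for name, value in dependencies.items():
        fragments = [value] if isinstance(value, str) else list(value)
        for fragment in fragments:
            fragment = fragment.strip()
            if fragment == "*":
                # "*" is the Pipfile-style shorthand for "any version".
                fragment = ""
            # Extras attach directly to the name; anything else gets a space.
            if fragment == "" or fragment.startswith("["):
                result.append(name + fragment)
            else:
                result.append(name + " " + fragment)
    return result


dependencies = {
    "colorama": "*",
    "django": [
        ">=2.0; os_name!='nt'",
        ">=2.1; os_name=='nt'",
    ],
    "win32": "[all] >2.2.0, <3.0.0; os_name == 'nt'",
}

pip_requirements = hybrid_to_pep508(dependencies)
```

Each resulting string is a complete PEP 508 requirement, so it can be passed to pip install (quoted) or written to a requirements file as-is.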

As noted in Brett’s initial post, Pipfile/pipenv mostly went with the “table-with-subtables” option, but there’s a shorthand for the simplest case similar to the one I describe above: if you’re only specifying a version constraint, you can just use a string.

That means the simplest way to specify a dependency is as:

pkg_name = "<PEP 440 specifier>"

The wildcard string pkg_name = "*" is a non-PEP-440 shorthand for “any version”. It exists because pkg_name and pkg_name= aren’t legal TOML, the full spelling (pkg_name = "== *") is quite verbose, and pkg_name = "" and pkg_name = {} didn’t feel like they conveyed “any version” strongly enough.

poetry offers a similar shorthand for version specifiers (although the syntax for some of the comparison operators differs from PEP 440).

Anything more complex than that uses a subtable (typically written in the inline-dict format, rather than as a full multi-line table). However, many of the options that Pipfile and poetry support (e.g. editable installs, local path installs, direct URL installs, direct-from-VCS installs) are ones whose place in pyproject.toml as part of intended-for-publication dependency declarations is arguable at best. It’s those local development and private deployment focused features, which either aren’t covered by PEP 508 or are covered by PEP 508 but aren’t allowed in PyPI uploads (and hence aren’t very well known or supported by libraries), that really motivated using the table format.

It’s OK for poetry & pipenv to support unpublishable dependency declarations, as they’re both used as environment managers instead of or as well as package build tools (exclusively in pipenv’s case, optionally in poetry’s case). By contrast, it’s not really OK for PEP 621 to support unpublishable dependency declarations.

I’m aware that using full PEP 508 markers in PEP 621 would likely create demand for Pipfile/pipenv and poetry to also support the extended input format, but I don’t see that posing a major problem in the long run (the tools already cover the relevant semantics, so it should mostly just require adjustments at the file parsing layer).


One small worry I have with specifying package names as keys (a top-level win32 = ...) is that, at some point, someone will probably try to depend on something like importlib.metadata and be tripped up by TOML’s dotted-key syntax.

That seems like another motivation to require normalized names. https://www.python.org/dev/peps/pep-0503/#normalized-names
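For reference, the normalization PEP 503 describes is a one-liner, and a normalized name never contains a dot, so it is always usable as a bare TOML key. The is_bare_toml_key helper below is a hypothetical check based on TOML’s rule that bare keys may only contain ASCII letters, digits, underscores and dashes.

```python
import re


def normalize(name):
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def is_bare_toml_key(key):
    """Return True if `key` can be written as a TOML key without quoting."""
    return re.fullmatch(r"[A-Za-z0-9_-]+", key) is not None


raw = "importlib.metadata"
normalized = normalize(raw)  # "importlib-metadata"
```

Without normalization, the unquoted key importlib.metadata would be read by TOML as the key metadata nested inside a table named importlib.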

Wait, does flit use the expanded table? I thought it used PEP 508.

I definitely meant PEP 508 style. Hopefully Paul is just bringing in some additional concerns, and not disagreeing with me while also agreeing with me :slight_smile:

Apologies. I’m not familiar with flit so I misread “the DSL” as meaning table-style.

Apparently I was agreeing with you while confusing the issue. Sorry :slight_smile:


Thanks for bringing this up. I made a similar suggestion in the pre-PEP private conversation, but it never managed to catch on among the authors. Let’s give it another shot, since I still believe this is an acceptable middle ground.

The only difference in my proposal was to use this form in the simplest example:

simple = ""  # Instead of "*"

The main reason I prefer this is it’s easier to create a PEP 508 string from the table:

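from packaging.requirements import Requirement

# `pyproject` is assumed to be the dict loaded from pyproject.toml,
# with each value a single PEP 508 fragment ("" meaning any version).
requirements = [
    Requirement(f"{key}{val}")
    for key, val in pyproject["dependencies"].items()
]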

Yeah, while I was in favour of the star for Pipfile, I now agree it would have been better to stay fully PEP 440 compliant and use the empty string for “any version”.

If PEP 621 opts to use the empty string, I expect Pipfile will follow suit, making the explicit star purely a backwards compatibility feature.
