Relaxing (or clarifying?) PEP 621 requirements regarding dynamic dependencies

As I read it, PEP 621 prevents the metadata in the artifacts we generate from differing from the corresponding fields in the [project] table of pyproject.toml.

However, in certain artifacts, particularly binary builds, we might need to add extra requirement entries to reflect the compatibility of that particular artifact. So, my question is: does PEP 621 prevent this?

In meson-python, we are planning to add a way for users to add extra requirements based on the versions of the dependencies we built against. E.g. if I built a wheel against numpy==1.24.2 and pythran==0.12.1, I will want to add numpy>=1.24.2 and pythran==0.12.* to the wheel requirements.
For this, we will let the user specify these “extra” runtime dependencies via a config setting, where they will be able to say which packages they want to pin, and how to pin the version (e.g. whether to pin to 1, 1.24, or 1.24.2 when building against numpy==1.24.2).
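To illustrate the shape such a setting could take, here is a minimal sketch; the table and key names are hypothetical, not a final design:

```toml
# Hypothetical meson-python configuration; all key names here are invented
# for illustration. Each entry names a package to pin and how precisely to
# pin it, based on the version present at build time.
[tool.meson-python.pin-build-dependencies]
numpy = ">=exact"    # built against 1.24.2 -> adds "numpy>=1.24.2"
pythran = "==minor"  # built against 0.12.1 -> adds "pythran==0.12.*"
```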

Do we now need to move the dependencies from project.dependencies to a tool-specific field? My interpretation of the PEP is that we do, though it’s not super clear.

I don’t think it makes sense to prevent backends from adding extra dependencies to binary artifacts to reflect the extra requirements of that particular build, so I’d like to add text to PEP 621 to explicitly allow this.


You have to declare the metadata as dynamic, yes.

Tools CANNOT remove, add or change data that has been statically specified. Only when a field is marked as dynamic may a tool provide a “new” value.

Also

Build back-ends MUST raise an error if the metadata specifies a field statically as well as being listed in dynamic.

So I think yes, this does mean that you have to set the dependencies as “dynamic” and use a tool-specific way of setting the “base” value.
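Concretely, that would mean something like the following; the [project] part follows the spec, while the tool-specific key for the “base” value is hypothetical:

```toml
[project]
name = "example"
version = "1.0.0"
# Per PEP 621, "dependencies" must not also be specified statically in
# this table once it is listed here:
dynamic = ["dependencies"]

# Hypothetical tool-specific "base" list, which the backend would extend
# with build-specific pins when generating the wheel metadata:
[tool.meson-python]
dependencies = ["numpy>=1.21", "pythran>=0.11"]
```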

How would a different backend interpret this? The point of specifying metadata in pyproject.toml is that you can do it in a backend-independent manner.

Given the nuances, as well as the fact that it’s a change to explicitly documented (and from what I recall, deliberately specified) behaviour in the current spec, this would need to be raised as a new PEP.


Yes, but I think the tricky bit here is that the spec isn’t explicit about what this refers to, because we wouldn’t be adding the dependencies to the project as a whole, but instead just to binary artifacts, to reflect details of the build. sdists would stay the same; it’s just that wheels need to expose extra compatibility information.

Not saying you are wrong, as, again, I agree; it’s just that I don’t think it’s that straightforward.

Also, does this mean that backends that need an extra dependency for editable wheels (like editables) can’t use project.dependencies?

What do you mean? I don’t really follow.

The idea would be that project.dependencies can’t change, but backends can add extra Requires-Dist fields in binary artifacts to reflect compatibility requirements of the build.

That is fair.

My thinking is that the [project] section is tool-independent. But you are talking about having a pyproject.toml with dependencies specified only in the tool-independent section, and yet having backend-dependent behaviour. That feels to me like it goes against the spirit of having tool-independent data.

If there’s no backend-specific config, that implies that the user could switch backends and not need to change anything - and yet there’s some sort of hidden reliance on the behaviour of the original backend that’s not encapsulated in the pyproject.toml. I don’t know if this is a problem (it’s a weird situation, certainly, and not one I have direct experience of) - but that’s basically what I mean by “nuances”.

The following comment in PEP 660 allows for this exception:

Metadata must be identical as the one that would have been produced by build_wheel or prepare_metadata_for_build_wheel, except for Requires-Dist which may differ slightly as explained below.

It’s worth noting that PEP 643 (Metadata for source distributions) contains the following comment:

If a field is not marked as Dynamic, then the value of the field in any wheel built from the sdist MUST match the value in the sdist.

PEP 643 was approved before PEP 660 and I’d expect the comment in PEP 660 to take precedence over this statement - I’d accept a clarification PR to the spec to note that. PEP 621 was also approved before PEP 660, and while it doesn’t really need a similar clarification, I wouldn’t object if someone wanted to raise a PR to the spec adding one anyway.


There’s no such thing as “metadata for the project as a whole”, all metadata is for artifacts, and static means “the same in all artifacts built from this source”. PEP 643 took pains to make this point explicit, and while PEP 621 didn’t go to the same lengths, I expect the same principles to apply[1]. A PR adding a clarification to that effect would be acceptable.


  1. It’s certainly the interpretation I assumed when I accepted the PEP.


FWIW, I came to essentially the same conclusions as Paul during my work on PEP 639, several parts of which relied on specifically how that part was interpreted and implemented. Based on carefully reading (and re-reading) PEP 621 and related standards, as well as reviewing the discussion history and talking about this issue with Brett directly, what has become clear to me is that the overall intent was that any standards-conforming backend must produce identical METADATA in the output wheels for a given set of inputs in the [project] table (and project source tree), following only the relevant standards, except for any core metadata fields corresponding to pyproject metadata keys marked as dynamic.

The implication of this is that backends can (and certainly do, and must) apply certain deterministic transformations to the literal values entered in the [project] table, but those transformations must be specified by an accepted PyPA standard (e.g. PEP 621 itself, or PEP 660, PEP 639, etc.). So, if there were a PyPA standard (i.e. a PEP, typically with a corresponding bump to the Metadata-Version) that specified adding such wheel dependencies in a consistent and cross-backend way, then it wouldn’t require dynamic. Otherwise, given that the behaviour is backend-specific and possibly non-deterministic, it would need to be marked as dynamic, yes.


Well, “switching backends” is not that trivial since it could still alter the artifacts in ways beyond basic metadata. However, I do agree that changing dependencies this way goes against the principle of least surprise.

Someone who doesn’t know the particular build backend will still recognize a familiar dependencies key and be surprised that it works differently.

Right, but the primary motivation for this provision has more to do with ensuring the pyproject.toml metadata can be relied upon by the broader ecosystem of tools, just like with PEP 643 specifying the Dynamic field for sdists. Without this, you have to build a wheel with the chosen backend to get reliable metadata, which is expensive and can involve executing arbitrary code; and even then, it would only be truly authoritative for the specific combination of tags in that wheel.

Thanks for starting this discussion @FFY00, and thanks @pf_moore for pointing to all the relevant sections of various PEPs. It sounds like a new PEP is necessary for the changes we need here. As was pointed out in other recent threads too (IIRC the “singular packaging tool/vision” one at least), relaxing this requirement is important. In general, the requirement cannot hold any time you use the C/C++ API of another package. Usage of a metapackage like oldest-supported-numpy as a static requirement is also apparently forbidden by the current PEP sections.

I agree with most of what was written before, but not this:

I’m not sure if you meant a technical point like “in a METADATA file” or a philosophical one, but: I believe projects do have metadata. A project does have dependencies (it should be self-evident, I hope, that those exist outside of artifacts too), and those tend to be listed in pyproject.toml to the extent possible. There is no other place for them. The dependencies in the METADATA file of an sdist are specific to that sdist, and almost by definition they will match the ones in pyproject.toml, because “source code in repo” and “source distribution” overlap a lot. It’s also evidenced by pyproject.toml being kept in the project’s repository, where there are no artifacts, and by the existence of the GitHub dependency graph and other such analyses and tools that derive project info from its metadata files.

There is a related conceptual issue here that I have run into before with build-time dependencies. There are two things I’d like to be able to express:

  1. Dependencies of a project,
  2. Dependencies specifically for building wheels for redistribution on PyPI or in an isolated end user build.

Those are not the same thing, and unfortunately we sometimes have to choose because there’s only one field in pyproject.toml.


My understanding matches Paul’s. To take an example, what are the dependencies of the project Django? That can’t be answered. A release (project version) has dependencies: none for older Django, one extra for 1.8, pytz and sqlparse for 2.2. (Specific artifacts can have different dependencies, but that’s intended to be phased out since the introduction of environment markers in dependencies.) So to me it’s not a philosophical point but a very concrete one: a project is a name and a collection of releases, and it does not have dependencies.


It’s somewhat implementation-defined, but the way things are implemented in Python packaging, there is no such thing as project-level metadata, at least as far as installers are concerned; there is only artifact-level metadata. Things like pyproject.toml exist to ultimately produce an artifact for an installer to consume.

As yet, nobody has come up with a PEP or a plan to make metadata not artifact-specific, so it’s still fully supported to have different dependencies for different artifacts (though environment markers may make this unnecessary in most cases).

Making it easier to switch between backends is, IMO, a relatively minor thing that PEP 621 enables. I do not think most projects switch backends often enough that it really matters, and I think that for all but the simplest of projects, which backend you use will likely change more things than are captured by PEP 621 anyway.

I think where PEP 621 really shines is the ability for tools to pull the “pre” metadata out of a directory that has yet to become an artifact, and have confidence that what they’re reading will match what the built artifact contains. This is a particularly useful thing, and even unlocks optimizations that are otherwise impossible (for instance, pip supports building things from source; you could imagine it getting dependencies from pyproject.toml in the cases where it can, to avoid building until after we have a resolved dependency set).

This only really works if we can trust that the metadata in pyproject.toml is accurate, and there isn’t going to be some build backend specific logic mutating that information without us being aware of it.

I think, as PEP 621 stands today, then yes, absolutely: if you’re going to mutate the dependency list, you MUST NOT use project.dependencies, you MUST use some tool-specific configuration, and ideally dependencies will be marked as dynamic in pyproject.toml (and in the sdist METADATA).

I think it would require a new PEP, but an interesting thing to do might be to relax PEP 621 to allow specifying a field as dynamic AND still specifying the field in the PEP 621 metadata. Then you could treat the dynamic-but-also-specified metadata as a “hint” at what the final metadata might be, though you couldn’t treat it as accurate because it’s still dynamic. This is maybe useful for some cases? It might blur the lines too much though, and be easier to just leave things as they are.
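In other words, something like the following, which PEP 621 as written requires backends to reject; this is a sketch of the relaxed rule, not anything valid today:

```toml
[project]
name = "example"
version = "1.0.0"
# Under the relaxed rule, this static value would be only a hint...
dependencies = ["numpy>=1.21"]
# ...because listing the field here means the backend supplies the final,
# authoritative value. Today this combination is an error.
dynamic = ["dependencies"]
```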


This is somewhat similar to how PEP 643 allows a field to be marked as dynamic while still containing a value. I’m still rather uncomfortable about the idea, as it feels like it could lead to data being marked as dynamic when it could be static (people cargo-culting the idea that you add dynamic = ["dependencies"] to your project “just in case the backend needs to add anything”) but I can see how it might be useful in specialised circumstances. Although I don’t honestly see what’s so bad about having a tool-specific setting for this - it’s not like consumers other than the backend can validly make use of the data.


Yea, I don’t think a tool-specific setting for this is a bad thing. I was mostly just throwing out an idea I had while reading the thread, in case someone thought it would be really useful for them. Adding it just to add it wouldn’t be a particularly prudent thing to do 🙂

Okay, fair enough. That’s not what I was getting at though, so let me rephrase: at any point in time, the current state of the main branch of a project has dependencies (as does every other branch). The existence of dependencies is not limited to release artifacts. For your example, Django, it looks like they’re in its setup.cfg and its pyproject.toml.

So to rephrase my two points slightly, we have:

  1. Dependencies of the current state of the source code of a project,
  2. Dependencies specifically for building wheels for redistribution on PyPI, and during an isolated build.

pyproject.toml lives in VCS. It’s one file, and it must express dependencies for two different things. To stay with the numpy example from @FFY00’s first post in this thread: for (1) it’d be perfectly valid for a user to build scipy against any supported numpy version, such as 1.24.1. The pyproject.toml content seems to say otherwise though: it contains ==1.21.6 pins, because the choice made by the project is to express the dependencies for (2), i.e. they’re set to the values needed for a wheel build for distribution on PyPI.

I’m not sure I agree, or that this was a conscious decision when pyproject.toml was introduced. The language in PEP 518 to support your point of view here seems to be missing. E.g., it starts “This PEP specifies how Python software packages should specify what build dependencies they have”. It does not say “build dependencies to create wheels from an sdist”. So I suspect it was left in the middle, and we have different interpretations.

Either way, we’re in a pretty unhealthy state. It’d be much better if pyproject.toml captured the actual dependencies of the code base it’s included in, separately from wheel-specific constraints. I want to be able to express “foo depends on bar>=1.2.3” (independent of how bar was installed). That’s the more interesting info for a wider audience imho: it determines which features of bar contributors are able to use in the code base, and what the metadata should be for a binary artifact of foo, derived from a VCS tag or sdist, in any packaging system that wants to include a package for foo. Those seem like things worth capturing in metadata.

Agreed. We’re going to do that now in meson-python.

I don’t think it blurs the lines too much, and it is important. I will note that:

  • There are many, many packages that technically cannot express their metadata as static at all right now. This includes many of the most popular packages on PyPI for the PyData stack: SciPy, scikit-learn, scikit-image, Matplotlib, statsmodels, most users of Cython, etc.
  • The flexibility needed is limited; it’s not like all these projects have fully dynamic metadata. They are only adding one or more constraints, so the wheel dependencies are a strict subset of the sdist ones.
  • This is not a niche thing; it applies any time one uses a C/C++ API. In fact, this “dependency narrowing” is so important that for CPython it has been encoded in wheel filename metadata. In an sdist we have (a) pyproject.toml, with requires-python = ">=3.8" and (b) PKG-INFO, with Requires-Python: >=3.8. A corresponding wheel can have different metadata: as soon as you use the CPython C API, the >=3.8 transforms to a specific minor version like -cp310.
  • If we leave it to tool-specific settings, any other users of pyproject.toml will be unable to support that. E.g., goodbye to the GitHub Dependency graph reporting those dependencies.
  • This post by @steve.dower identified “allow (encourage) wheels with binaries to have tighter dependencies than their sdists” as one of four key things to do to improve interoperability with Conda.

I hope the above is enough to convince you that the current state is a problem, and important to address. The work for that needs to happen through a PEP, but I hope we can agree that it’d be worth putting effort into.


Maybe it would better serve the needs you’ve identified, as well as the goals of pyproject metadata, to standardize a non-tool-specific way for users to specify those wheel-specific dependencies or constraints, e.g. as a new key under the [project] table (or a new table, like the proposed [extension] one for C/C++/etc. extension building metadata)? This would potentially also serve the related needs of users requesting a mechanism to specify different dependencies for wheels vs. sdists, as well as a sought-after replacement for requirements.txt for specifying pinned production deps vs. unpinned development deps.
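As a rough sketch of the shape that could take; every key name here is invented purely for illustration:

```toml
[project]
name = "example"
version = "1.0.0"
# What the source code itself needs; used as-is in the sdist:
dependencies = ["numpy>=1.21.6"]
# Hypothetical new key (name invented for illustration): extra constraints
# merged into Requires-Dist only when building wheels, never into the sdist:
wheel-constraints = ["numpy>=1.24.2"]
```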

Of course, in order for this to be practical, the constraints would either need to be declaratively specified in the pyproject.toml, or, if added by the tool, done so in a deterministic, tool-independent fashion that could be formally specified and implemented consistently, which may or may not be possible for what you’re thinking. Also, one obstacle is that besides meson-python there isn’t a lot of existing tool support for specifying different sdist vs. wheel dependencies that I’m aware of, so it would presumably have to be declared an optional feature, but perhaps more tools would start supporting it once it was formally standardized.

If that approach isn’t viable, then the way I could see this potentially working is your PEP instead specifying that if dependencies were both included in the [project] table and marked as dynamic, then tools would be allowed to constrain the listed dependencies further, to more precise (or fully pinned) versions, in the built wheel METADATA, while using the [project]-listed versions in the sdist PKG-INFO (and not otherwise adding, removing, or modifying the [project]-listed deps in the wheel). That way, while it would weaken the fundamental guarantees for that specific key somewhat, it would only do so as much as necessary rather than throwing them out completely, and it would clearly define the behavior and what tools can and can’t rely on.
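Under that reading, the semantics might look like this; again a sketch of the proposal, not anything currently valid:

```toml
[project]
name = "example"
version = "1.0.0"
dependencies = ["numpy>=1.21.6"]  # copied verbatim into the sdist PKG-INFO
dynamic = ["dependencies"]        # signals that wheels may narrow the range

# A backend that built against numpy 1.24.2 would then be permitted to emit
#   Requires-Dist: numpy>=1.24.2
# in the wheel METADATA: a strict narrowing of the range above, with no
# other additions or removals allowed.
```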

Of course, one thing to mind with either approach is that any existing tool would reject new pyproject.tomls with either such addition, so you’d need some time, likely several years, for the changes to propagate enough through the ecosystem after the PEP was accepted, though build-system.requires would soften that blow substantially.

BTW, this sounds like it could potentially be a good topic for discussion at the upcoming PyCon Packaging Summit. I’m not sure if you’ll be there, but @FFY00 will, so maybe he could discuss it.

To note, an apparently closely-related non-scientific-Python use case came up for the PyQt-tools family of projects, which effectively wrap the various Qt/PyQt tools (Qt Designer, etc.) into Python packages. Their sdists can build against a range of Qt/PyQt versions, but their wheels are built against a specific version and must be used with that version of PyQt. Furthermore, they maintain a stack of multiple serially-dependent such packages, where each additionally depends on the next. This certainly seems like an interesting use case to consider, though one that’s specifically a problem for Python used as “glue” for non-Python binary dependencies.