PEP 621: Storing project metadata in pyproject.toml

Would be nice to address this in the PEP.

Definitely needs to be included then.

I think this is useful. Like 75% of the other tools enscons also takes basically the setuptools setup() arguments in pyproject.toml under its own [tool.enscons] table. These are converted directly into the METADATA file. (Is there a PEP to METADATA implementation for us to use?). The divergence from setuptools happens in the build system, which determines how files, not metadata, winds up in the package. Those two processes appear to be totally independent of each other.

I can’t find such a case in the PEP.

I don’t think so. I’ll update the PEP.

This is covered by:

Tools MAY support alternative content-types which they can
transform to a content-type as supported by the `core metadata`_.

Everyone is going to have a different opinion as to what the most important reason why this PEP should exist. I don’t think arguing about the order is worth it as long as the key motivations for everyone is somehow captured.

Yes, it would be a dynamic provide.

As for an example, it’s just specifying dynamic = ["version"] and however setuptools choose to let people specify that as the way to get the version so I’m not sure what the benefit would be in tossing in such an example.

What the PEP is doing is what you’re suggesting, but standardizing on “author” instead of “maintainer”. I’ll call that out.

This is actually already a compromise of even allowing tools to do this as at least one person wants to just ditch all the trove classifiers we said to backfill. So we purposefully made it weak and underspecified as the assumption is the importance of the relevant classifiers will actually go away in the future.

That’s a PyPI question for which I have a year-old issue about. :grin:

I’ve opened https://github.com/python/peps/pull/1465 to address the above and will merge it once one other co-author approves the PR.

@pradyunsg already fixed it: https://github.com/python/peps/pull/1459

1 Like

10 posts were split to a new topic: The author/maintainer distinction problem

Ah, I misread, I thought there was a readme.charset field.

Entries that I’m using in some of my projects’ setup.cfg that I didn’t see in the PEP:

  • package_data
  • data_files
  • platforms (not I ever use anything else than any here)

I suspect platforms was discussed and it was decided that it wasn’t in use enough to be included, especially as it’s recommended to be used only when the platform isn’t in the (fairly comprehensive) trove classifiers. Perhaps that should be stated in the “rejected ideas” section. In any case, I think the build-backend has a pretty good idea as to which platforms support the packages it builds.

Both package_data and data_files are build options, which you can see are under [options] in setup.cfg. They’re specific on how to build the package, not on its metadata

1 Like

It actually wasn’t, but your reasoning is correct. No tools surveyed (except setuptools) support this field, as far as I know. The specification does not aim to cover and replace eveything needed to generate Core Metadata, only those commonly used and have well-established static representations in the community. We can always add additional fields to the table if they become widespread and cause unnecessary migration overhead.

Please see “Specify files to include when building” heading under “Rejected Ideas”.

The fields you’ve mentioned don’t have a corresponding core metadata field, and are related to how the package is built - this is explicitly out of scope for this PEP.

I’d like to point out that, broadly speaking, I am in favor of this initiative (though I would guess I am the most skeptical of all the listed authors) — I agree with @bernatgabor’s point that this seems to be trying to go back to the bad old days of a single backend, but in practice there are just not that many ways to specify the package metadata (considering it all maps to the same static fields anyway), and this takes a decent amount of cognitive load off of backend designers.

That said, I want to point out a few concerns I have with the messaging around this PEP, because people are already hopelessly confused by the situation with pyproject.toml, and this looks likely to make it worse.

  1. People already think that pyproject.toml is some sort of replacement for setup.py and are constantly asking non-sensical questions like, “Should you use pyproject.toml or setup.py?” or “Can you achieve this with pyproject.toml?”. This proposal will make this 100x worse, because the word for “projects that use PEP 517/518 to specify build-system requirements” is still "pyproject.toml projects", and now you will also have metadata specified in pyproject.toml.

    It was already confusing enough to say, “The standardized pyproject.toml is about specifying information about your build system, and some build systems also keep their configuration there.” Now we have to go even more nuanced with, “There are two standardized tables in pyproject.toml, one is about specifying what build system you are using, and the other is a common format for package metadata that is used by different build systems. Many build tools also use pyproject.toml for their configuration about how to do the build (e.g. what to include in the package).”

  2. The Motivation section will confuse people even more, because it strongly implies that this is the solution to the problem that it’s difficult or impossible to get metadata without actually building the project, but this is not actually a good solution to that problem. In order for this to be a good solution to the problem, it would have to be widely adopted and most people would have to avoid any sort of dynamic metadata.

    An opt-in standard for the input to the core metadata files will make the easy path easy, but it won’t provide a more general solution as would be provided by standardizing the way the output of the “calculate core metadata files” gets included in sdists. People already don’t think terribly deeply about things like this, and I think the fact that the rhetoric in the Motivation section doesn’t match the actual PEP will make people even more confused about what this PEP does.

It is not encouraging that some of the only discussion of this I’m seeing on social media seems way off-the-mark in terms of what this will change: this post highlights that this “allows specifying multiple maintainers” (which implies that it actually changes something about the underlying metadata spec) and this post seems to think that this is a necessary step to allowing the use of pyproject.toml as “a complete alternative to setup.cfg/setup.py”.

Of course, I don’t know how to fix the problem of messaging, since Brett is pretty high-profile within the Python community and he’s been out there giving clear and concise explanations of this on podcasts and in blog posts, and it doesn’t seem to be penetrating (even among people like… tweeting about his clarifying posts). Still, it’s worth thinking about settling on a communication strategy and at least trying to address the ongoing problems we’re having communicating this information to the community.

3 Likes

Since @pganssle has expressed interest in discussing it further (and made an interesting suggestion for how we could tackle the Author/Maintainer situation), I’ve flagged this thread to request the discuss.python.org moderators to split out the discussion on the Author/Metadata fields into a dedicated thread.

@mods please keep this post in this thread, and I’ll edit it later to include the link to the new thread. :slight_smile:

Edit: The author/maintainer distinction problem it is!

I think this is 100% true, and giving separate names to pyproject.toml-based “build processes” as well as “configuration” is the way to go here. I really want us to improve the communication in this area.

I do think we shouldn’t have this discussion in this thread, and would like to suggest Name for pyproject.toml builds and PEPs as a better location for discussion the messaging and naming around pyproject.toml. :slight_smile:

Something I don’t see specified in the PEP, is in how to actually consume metadata, and how this PEP changes that.

So currently we have 3 types of “artifacts” in which we might want to introspect the metadata:

  • Wheels
  • Sdists
  • Arbitrary Directory

For terseness sake I’m going to speak only in a PEP 517 world, but the same ideas map to setuptools before it.

For wheels, I assume this PEP has no real effect on it, you wouldn’t expect to see a pyproject.toml file inside a wheel still, and you’d still be expected to parse the METADATA file as you do today.

However I’m unsure how this changes things for the other two cases… Currently for them you basically call an API which generates a METADATA file/structure and then parse that. With this PEP are we expecting consumers to start parsing pyproject.toml for metadata, and then fall back to the APIs and parsing a different format? Or are we expecting these to basically only be read directly by build tools, and consumers should still expect to only interact with the METADATA structure?

3 Likes

The section in the PEP saying

Finally, this PEP makes core metadata for projects statically defined. By being statically defined, metadata can be read more quickly and easily than if it were dynamically calculated. Tools which read and write metadata can do so without worrying about which build back-end the user specified the metadata for.

is intended to sanction this use¹. However, it explicitly wasn’t a goal to replace the existing mechanisms for extracting metadata, and that ambivalence comes through in the PEP.

Personally, I’d be strongly in favour of explicitly allowing this use (and exercising that right in pip, to extract name, version and dependency data as efficiently as possible). But I can’t speak for the other PEP authors over this.

¹ Although the presence of dynamic makes doing so a little more complex, as tools need to have a fallback mechanism in place.

I’m happy to tweak the motivation section as I personally don’t view it as that critical in how it’s written. Feel free to send a PR to change its wording or add a section that you feel better represents what you’re after.

No explicit expectations are being set, but that’s because I would argue sdists and arbitrary directories have no spec to begin with. :slight_smile: I don’t expect an sdist to have a METADATA file and I’m not aware of any PEP that says so either. If we decided to standardize sdists – which I hope we do someday – then it could be a possibility to say that pyproject.toml is expected to have [build-system] specified, but I personally wouldn’t go passed that requirement for quite some time.

The original reason we didn’t do this was due to the setuptools-scm users wanting their version number to be dynamically calculated. But perhaps pip can strongly encourage folks via benchmarks or something that using this speeds up installs?

@bernatgabor @pganssle and anyone else who has had issue with the Motivation section, I have tried to simplify it via https://github.com/python/peps/pull/1469. It also says more about what the PEP isn’t.

As before I won’t merge it until another co-author approves it.

(Somewhat off-topic, so if this needs further response I suggest we open a new thread) Pip already tries very hard to avoid a build step when getting metadata. I’d see pyproject.toml as being most useful for dependency data (which isn’t available elsewhere until you build) and for “source trees” (which have no filename to parse).

Dynamic versions annoy me as the maintainer of a metadata consumer, but the benefits of getting rid of them are likely far more to do with code complexity than performance (sadly).

2 Likes

I have merged the simplified Motivation section thanks to Dustin and Pradyun’s reviews!

I think that people would simply not adopt this [project] table if it were not possible to specify at least the version dynamically at build time, and probably many other things. It would be a pretty serious regression, even if you ignore everyone who wants their git metadata to be the single source of the version number.

I would be somewhat surprised if this significantly improved the situation with regards to the ability to dynamically process metadata. It will probably be convenient when someone happens to use it and happens to not use any dynamic fields, but I think the real solution to the “static metadata” problem is to standardize metadata in sdists. Once setuptools and other build tools start rolling out changes that standardize metadata, huge swathes of the ecosystem will be opted in automatically, and it should be canonical. I can imagine situations where “parse the metadata from pyproject.toml without a build” might be useful, I don’t think it will be a significant long-term or short-term solution (unless we get this done and then really drop the ball on metadata-in-sdist, or it gets super wide adoption immediately) for things like building a resolver.

I say this not to just try to shoot down the idea that we should be parsing metadata from this, but to try and frame the discussion a bit about what we should be optimizing for. With regards to “static metadata”, I think the ordering of our priorities should be:

  1. Pave the way for a future where sdists can contain useful and canonical static metadata files.
  2. Make it easy for back-ends to adopt this without any loss of functionality.
  3. Design it in such a way that tools can know when it’s safe to parse static metadata directly from the pyproject.toml.

I think the PEP as it stands does a good job of this: fields that are tool-provided are marked as such, no existing backends care about the few fields that are not allowed to be provided dynamically (as far as I know).

3 Likes