PEP 621: round 3

  1. Round 1
  2. Round 2

The latest update to the PEP did three major things. One, I merged in PEP 631 so how to specify dependencies is now covered.

Two, at the behest of the presumptive PEP delegate, @pf_moore, I strengthened the wording around how data is considered canonical when specified. Basically it means data in the [project] table cannot be removed or changed to mean something else (adding or making more specific is fine). This means you can rely on the data in the table being accurate and trustworthy.

Finally, also at Paul’s suggestion, the PEP has been updated to say tools SHOULD update/provide pyproject.toml and the data in the [project] table as a way to provide canonical data in an sdist. The PEP now specifies which fields should always be provided if the table is in an sdist, ones that if provided must be static (i.e. not listed in dynamic), and which fields can still be dynamic. For people and tools that use this PEP it means that sdists will have reliable metadata that is still human-readable and relatable back to the way the developer already provided metadata.

PEP: 621
Title: Storing project metadata in pyproject.toml
Author: Brett Cannon <>,
        Dustin Ingram <>,
        Paul Ganssle <paul at>,
        Pradyun Gedam <>,
        Sébastien Eustace <>,
        Thomas Kluyver <>,
        Tzu-Ping Chung <>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 22-Jun-2020
Post-History: 22-Jun-2020


This PEP specifies how to write a project's `core metadata`_ in a
``pyproject.toml`` file for packaging-related tools to consume.


The key motivators of this PEP are:

- Encourage users to specify core metadata statically for speed,
  ease of specification, and deterministic consumption by build
  back-ends
- Provide a tool-agnostic way of specifying the metadata for ease of
  learning and transitioning between build back-ends
- Allow for more code sharing between build back-ends for the
  "boring parts" of a project's metadata
- Provide a way to specify canonical data both by users and in
  source distributions

This PEP does **not** attempt to standardize all possible metadata
required by a build back-end, only the metadata covered by the
`core metadata`_ specification which are very common across projects
and would stand to benefit from being static and consistently
specified. This means build back-ends are still free and able to
innovate around patterns like how to specify the files to include in a
wheel. There is also an included escape hatch for users and build
back-ends to use when they choose to partially opt-out of this PEP
(compared to opting-out of this PEP entirely, which is also possible).

This PEP is also not trying to change the underlying `core metadata`_
in any way. Such considerations should be done in a separate PEP which
may lead to changes or additions to what this PEP specifies.


The design guidelines the authors of this PEP followed were:

- Define as much of the `core metadata`_ as reasonable
- Define the metadata statically with an escape hatch for those who
  want to define it dynamically later
- Use familiar names where it makes sense, but be willing to use more
  modern terminology
- Try to be ergonomic within a TOML file instead of mirroring how
  tools specify metadata at a low-level when it makes sense
- Learn from other build back-ends in the packaging ecosystem which
  have used TOML for their metadata
- Don't try to standardize things which lack a pre-existing standard
  at a lower-level
- *When* metadata is specified using this PEP, it is considered
  canonical
- Make the specified data useful in a source distribution to
  statically define what metadata is known at the time of source
  distribution creation


When specifying project metadata, tools MUST adhere and honour the
metadata as specified in this PEP. If metadata is improperly specified
then tools MUST raise an error to notify the user about their mistake.

Data specified using this PEP is considered canonical. Tools CANNOT
remove or change data, but they MAY add to it. This allows for tools
to make data more accurate/static when possible by updating the data
specified in the ``pyproject.toml`` file. For example, a version
can become more specific when building a wheel (e.g. adding a local
version), but it cannot become less specific.

Build back-ends creating a source distribution -- aka an "sdist" --
SHOULD provide as much data as possible using this PEP within a source
distribution. The ``name`` and ``version`` fields MUST NOT be omitted
and must be statically specified. Other fields which pertain to data
surfaced on PyPI, and thus are not expected to be determined at wheel
creation time, MUST NOT be listed as ``dynamic`` in a source
distribution. All other fields have no specific requirements placed
upon them in a source distribution.
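
The sdist rules above can be sketched as a small check. This is only an illustration, not part of the PEP: the function name and the error wording are hypothetical, the ``[project]`` table is assumed to be already parsed into a plain ``dict``, and the set of "surfaced on PyPI" fields is taken from the per-field "Source distributions" notes in this document.

```python
# Fields annotated in this document as "required" or "cannot be dynamic"
# for source distributions (an assumption drawn from the field listings).
SDIST_STATIC_ONLY = {
    "name", "version", "description", "readme", "requires-python",
    "license", "authors", "maintainers", "keywords", "classifiers", "urls",
}
SDIST_REQUIRED = ("name", "version")


def check_sdist_project_table(project):
    """Return a list of problems found in an sdist's [project] table."""
    problems = []
    dynamic = set(project.get("dynamic", []))
    # name and version must be present as static values.
    for field in SDIST_REQUIRED:
        if field not in project:
            problems.append(f"{field!r} must be statically specified in an sdist")
    # PyPI-facing fields may be omitted, but not deferred via ``dynamic``.
    for field in sorted(dynamic & SDIST_STATIC_ONLY):
        problems.append(f"{field!r} must not be listed in 'dynamic' in an sdist")
    return problems
```

Fields outside that set, such as ``dependencies``, can stay in ``dynamic`` without this check complaining.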


Table name

Tools MUST specify fields defined by this PEP in a table named
``[project]``. No tools may add fields to this table which are not
defined by this PEP or subsequent PEPs. For tools wishing to store
their own settings in ``pyproject.toml``, they may use the ``[tool]``
table as defined in :pep:`518`. The lack of a ``[project]`` table
implicitly means the build back-end will dynamically provide all
fields.

``name``

- Format: string
- `Core metadata`_: ``Name``
  (`link <>`__)
- Source distributions: required
- Synonyms

  - Flit_: ``module``/``dist-name``
    (`link <>`__)
  - Poetry_: ``name``
    (`link <>`__)
  - Setuptools_: ``name``
    (`link <>`__)

The name of the project.

Tools MUST require users to statically define this field.

Tools SHOULD normalize this name, as specified by :pep:`503`, as soon
as it is read for internal consistency.
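
As a minimal sketch of the normalization mentioned above, the following uses the regular expression given in PEP 503. The function name is only illustrative.

```python
import re


def normalize_name(name):
    """Normalize a project name as described by PEP 503."""
    # Runs of "-", "_" and "." collapse to a single "-", then lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()
```

A tool would typically apply this once, right after reading ``name``, and use the normalized form for all internal comparisons.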

``version``

- Format: string
- `Core metadata`_: ``Version``
  (`link <>`__)
- Source distributions: required
- Synonyms

  - Flit_: N/A (read from a ``__version__`` attribute)
    (`link <>`__)
  - Poetry_: ``version``
    (`link <>`__)
  - Setuptools_: ``version``
    (`link <>`__)

The version of the project as supported by :pep:`440`.

Users SHOULD prefer to specify already-normalized versions.

``description``

- Format: string
- `Core metadata`_: ``Summary``
  (`link <>`__)
- Source distributions: cannot be dynamic
- Synonyms

  - Flit_: N/A
  - Poetry_: ``description``
    (`link <>`__)
  - Setuptools_: ``description``
    (`link <>`__)

The summary description of the project.

``readme``

- Format: String or table
- `Core metadata`_: ``Description``
  (`link <>`__)
- Source distributions: cannot be dynamic
- Synonyms

  - Flit_: ``description-file``
    (`link <>`__)
  - Poetry_: ``readme``
    (`link <>`__)
  - Setuptools_: ``long_description``
    (`link <>`__)

The full description of the project (i.e. the README).

The field accepts either a string or a table. If it is a string then
it is the relative path to a text file containing the full
description. Tools MUST assume the file's encoding is UTF-8. If the
file path ends in a case-insensitive ``.md`` suffix, then tools MUST
assume the content-type is ``text/markdown``. If the file path ends in
a case-insensitive ``.rst``, then tools MUST assume the content-type
is ``text/x-rst``. If a tool recognizes more extensions than this PEP,
they MAY infer the content-type for the user without specifying this
field as ``dynamic``. For all unrecognized suffixes when a
content-type is not provided, tools MUST raise an error.

The ``readme`` field may also take a table. The ``file`` key has a
string value representing a relative path to a file containing the
full description. The ``text`` key has a string value which is the
full description. These keys are mutually-exclusive, thus tools MUST
raise an error if the metadata specifies both keys.

A table specified in the ``readme`` field also has a ``content-type``
field which takes a string specifying the content-type of the full
description. A tool MUST raise an error if the metadata does not
specify this field in the table. If the metadata does not specify the
``charset`` parameter, then it is assumed to be UTF-8. Tools MAY
support other encodings if they choose to. Tools MAY support
alternative content-types which they can transform to a content-type
as supported by the `core metadata`_. Otherwise tools MUST raise an
error for unsupported content-types.
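
A rough sketch of how a tool might resolve the content-type for both forms of ``readme`` follows. It assumes the value has already been parsed from TOML into a ``str`` or ``dict``; the function name and error messages are hypothetical, and it recognizes only the two suffixes this PEP names.

```python
from pathlib import PurePosixPath

# Only the suffixes this PEP requires tools to recognize.
SUFFIX_TO_CONTENT_TYPE = {".md": "text/markdown", ".rst": "text/x-rst"}


def readme_content_type(readme):
    """Return the content-type implied by a ``readme`` value."""
    if isinstance(readme, str):
        # String form: infer from the case-insensitive file suffix.
        suffix = PurePosixPath(readme).suffix.lower()
        try:
            return SUFFIX_TO_CONTENT_TYPE[suffix]
        except KeyError:
            raise ValueError(f"cannot infer content-type from {readme!r}") from None
    # Table form: ``file`` and ``text`` are mutually exclusive and
    # ``content-type`` is mandatory.
    if "file" in readme and "text" in readme:
        raise ValueError("'file' and 'text' are mutually exclusive")
    if "content-type" not in readme:
        raise ValueError("'content-type' is required in the table form")
    return readme["content-type"]
```

Checking that the declared content-type is one the core metadata supports is left out here; a real back-end would need that step as well.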

``requires-python``

- Format: string
- `Core metadata`_: ``Requires-Python``
  (`link <>`__)
- Source distributions: cannot be dynamic
- Synonyms

  - Flit_: ``requires-python``
    (`link <>`__)
  - Poetry_: As a ``python`` dependency in the
    ``[tool.poetry.dependencies]`` table
    (`link <>`__)
  - Setuptools_: ``python_requires``
    (`link <>`__)

The Python version requirements of the project.

Build back-ends MAY try to backfill appropriate
``Programming Language :: Python`` `trove classifiers`_ based on what
the user specified for this field.

``license``

- Format: Table
- `Core metadata`_: ``License``
  (`link <>`__)
- Source distributions: cannot be dynamic
- Synonyms

  - Flit_: ``license``
    (`link <>`__)
  - Poetry_: ``license``
    (`link <>`__)
  - Setuptools_: ``license``, ``license_file``, ``license_files``
    (`link <>`__)

The table may have one of two keys. The ``file`` key has a string
value that is a relative file path to the file which contains the
license for the project. Tools MUST assume the file's encoding is
UTF-8. The ``text`` key has a string value which is the license of the
project.  These keys are mutually exclusive, so a tool MUST raise an
error if the metadata specifies both keys.

A practical string value for the ``license`` key has been purposefully
left out to allow for a future PEP to specify support for SPDX_
expressions (the same logic applies to any sort of "type" field
specifying what license the ``file`` or ``text`` represents). If such
support comes to fruition and a tool can unambiguously identify the
license specified, then the tool MAY fill in the appropriate trove

``authors``/``maintainers``

- Format: Array of inline tables with string keys and values
- `Core metadata`_: ``Author``/``Author-email``/``Maintainer``/``Maintainer-email``
  (`link <>`__)
- Source distributions: cannot be dynamic
- Synonyms

  - Flit_: ``author``/``author-email``/``maintainer``/``maintainer-email``
    (`link <>`__)
  - Poetry_: ``authors``/``maintainers``
    (`link <>`__)
  - Setuptools_: ``author``/``author_email``/``maintainer``/``maintainer_email``
    (`link <>`__)

The people or organizations considered to be the "authors" of the
project. The exact meaning is open to interpretation — it may list the
original or primary authors, current maintainers, or owners of the

The "maintainers" field is similar to "authors" in that its exact
meaning is open to interpretation.

These fields accept an array of tables with 2 keys: ``name`` and
``email``. Both values must be strings. The ``name`` value MUST be a
valid email name (i.e. whatever can be put as a name, before an email,
in `RFC #822`_) and not contain commas. The ``email`` value MUST be a
valid email address. Both keys are optional.

Using the data to fill in `core metadata`_ is as follows:

1. If only ``name`` is provided, the value goes in
   ``Author``/``Maintainer`` as appropriate.
2. If only ``email`` is provided, the value goes in
   ``Author-email``/``Maintainer-email`` as appropriate.
3. If both ``email`` and ``name`` are provided, the value goes in
   ``Author-email``/``Maintainer-email`` as appropriate, with the
   format ``{name} <{email}>``.
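
The three mapping rules above could be implemented roughly as follows. This is a sketch under stated assumptions: the function name is hypothetical, the input is the already-parsed array of tables, and joining multiple entries with ``", "`` is an assumption, since the text above does not say how several values share one core metadata field.

```python
def people_to_core_metadata(people, kind="Author"):
    """Map ``authors``/``maintainers`` entries to core metadata fields.

    ``kind`` is "Author" or "Maintainer".
    """
    names = []
    emails = []
    for person in people:
        name = person.get("name")
        email = person.get("email")
        if name and email:
            # Rule 3: both given, goes in the *-email field as "name <email>".
            emails.append(f"{name} <{email}>")
        elif email:
            # Rule 2: only an email.
            emails.append(email)
        elif name:
            # Rule 1: only a name.
            names.append(name)
    fields = {}
    if names:
        fields[kind] = ", ".join(names)  # joining with ", " is an assumption
    if emails:
        fields[f"{kind}-email"] = ", ".join(emails)
    return fields
```

Validation of the ``name`` and ``email`` values themselves (for example, rejecting commas in names) is not shown.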

``keywords``

- Format: array of strings
- `Core metadata`_: ``Keywords``
  (`link <>`__)
- Source distributions: cannot be dynamic
- Synonyms

  - Flit_: ``keywords``
    (`link <>`__)
  - Poetry_: ``keywords``
    (`link <>`__)
  - Setuptools_: ``keywords``
    (`link <>`__)

The keywords for the project.

``classifiers``

- Format: array of strings
- `Core metadata`_: ``Classifier``
  (`link <>`__)
- Source distributions: cannot be dynamic
- Synonyms

  - Flit_: ``classifiers``
    (`link <>`__)
  - Poetry_: ``classifiers``
    (`link <>`__)
  - Setuptools_: ``classifiers``
    (`link <>`__)

`Trove classifiers`_ which apply to the project.

Build back-ends MAY automatically fill in extra trove classifiers
if the back-end can deduce the classifiers from the provided metadata.

``urls``

- Format: Table, with keys and values of strings
- `Core metadata`_: ``Project-URL``
  (`link <>`__)
- Source distributions: cannot be dynamic
- Synonyms

  - Flit_: ``[tool.flit.metadata.urls]`` table
    (`link <>`__)
  - Poetry_: ``[tool.poetry.urls]`` table
    (`link <>`__)
  - Setuptools_: ``project_urls``
    (`link <>`__)

A table of URLs where the key is the URL label and the value is the
URL itself.

Entry points

- Format: Table (``[project.scripts]``, ``[project.gui-scripts]``, and
  ``[project.entry-points]``)
- `Core metadata`_: N/A;
  `Entry point specification <>`_
- Source distributions: optional
- Synonyms

  - Flit_: ``[tool.flit.scripts]`` table for console scripts,
    ``[tool.flit.entrypoints]`` for the rest
    (`link <>`__)
  - Poetry_: ``[tool.poetry.scripts]`` table for console scripts
    (`link <>`__)
  - Setuptools_: ``entry_points``
    (`link <>`__)

There are three tables related to entry points. The
``[project.scripts]`` table corresponds to the ``console_scripts``
group in the `core metadata`_. The key of the table is the name of the
entry point and the value is the object reference.

The ``[project.gui-scripts]`` table corresponds to the ``gui_scripts``
group in the `core metadata`_. Its format is the same as

The ``[project.entry-points]`` table is a collection of tables. Each
sub-table's name is an entry point group. The key and value semantics
are the same as ``[project.scripts]``. Users MUST NOT create
nested sub-tables but instead keep the entry point groups to only one
level deep.

Build back-ends MUST raise an error if the metadata defines a
``[project.entry-points.console_scripts]`` or
``[project.entry-points.gui_scripts]`` table, as they would
be ambiguous in the face of ``[project.scripts]`` and
``[project.gui-scripts]``, respectively.
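
One way a back-end might fold the three tables into a single mapping of entry point groups is sketched below. The function name is hypothetical and the ``[project]`` table is assumed to be a parsed ``dict``. It rejects the two ambiguous groups and any nested sub-table.

```python
# Groups that must be written via their dedicated tables instead.
RESERVED_GROUPS = {"console_scripts": "scripts", "gui_scripts": "gui-scripts"}


def collect_entry_points(project):
    """Return {group: {name: object reference}} from a parsed [project] table."""
    groups = {}
    for group, entries in project.get("entry-points", {}).items():
        if group in RESERVED_GROUPS:
            raise ValueError(
                f"use [project.{RESERVED_GROUPS[group]}] instead of "
                f"[project.entry-points.{group}]"
            )
        for name, reference in entries.items():
            # A non-string value means the user nested a sub-table.
            if not isinstance(reference, str):
                raise ValueError(f"entry point group {group!r} must be one level deep")
        groups[group] = dict(entries)
    if "scripts" in project:
        groups["console_scripts"] = dict(project["scripts"])
    if "gui-scripts" in project:
        groups["gui_scripts"] = dict(project["gui-scripts"])
    return groups
```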

``dependencies``/``optional-dependencies``

- Format: Array of :pep:`508` strings (``dependencies``) and a table
  with values of arrays of :pep:`508` strings
  (``optional-dependencies``)
- `Core metadata`_: ``Requires-Dist`` and ``Provides-Extra``
  (`link <>`__,
  `link <>`__)
- Source distributions: optional
- Synonyms

  - Flit_: ``requires`` for required dependencies, ``requires-extra``
    for optional dependencies
    (`link <>`__)
  - Poetry_: ``[tool.poetry.dependencies]`` for dependencies (both
    required and for development),
    ``[tool.poetry.extras]`` for optional dependencies
    (`link <>`__)
  - Setuptools_: ``install_requires`` for required dependencies,
    ``extras_require`` for optional dependencies
    (`link <>`__)

The (optional) dependencies of the project.

For ``dependencies``, it is a key whose value is an array of strings.
Each string represents a dependency of the project and MUST be
formatted as a valid :pep:`508` string. Each string maps directly to
a ``Requires-Dist`` entry in the `core metadata`_.

For ``optional-dependencies``, it is a table where each key specifies
an extra and whose value is an array of strings. The strings of the
arrays must be valid :pep:`508` strings. The keys MUST be valid values
for the ``Provides-Extra`` `core metadata`_. Each value in the array
thus becomes a corresponding ``Requires-Dist`` entry for the matching
``Provides-Extra`` metadata.
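
To illustrate the mapping, here is a sketch that turns both fields into core metadata lines. The function name is hypothetical. Attaching an ``extra == "..."`` marker to each optional requirement follows the usual core metadata convention, and splitting on the first ``;`` is a simplifying assumption; a real tool would use a proper PEP 508 parser.

```python
def requires_dist_lines(project):
    """Build Requires-Dist/Provides-Extra lines from a parsed [project] table."""
    lines = [f"Requires-Dist: {req}" for req in project.get("dependencies", [])]
    for extra, reqs in project.get("optional-dependencies", {}).items():
        lines.append(f"Provides-Extra: {extra}")
        for req in reqs:
            # Naive split between the requirement and any existing marker.
            spec, sep, marker = req.partition(";")
            if sep:
                marker = f'({marker.strip()}) and extra == "{extra}"'
            else:
                marker = f'extra == "{extra}"'
            lines.append(f"Requires-Dist: {spec.strip()}; {marker}")
    return lines
```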

``dynamic``

- Format: Array of strings
- `Core metadata`_: N/A
- Source distributions: optional
- No synonyms

Specifies which fields listed by this PEP were intentionally
unspecified so another tool can/will provide such metadata
dynamically. This clearly delineates which metadata is purposefully
unspecified and expected to stay unspecified compared to being
provided via tooling later on.

- A build back-end MUST honour statically-specified metadata (which
  means the metadata did not list the field in ``dynamic``).
- A build back-end MUST raise an error if the metadata specifies the
  ``name`` in ``dynamic``.
- If the `core metadata`_ specification lists a field as "Required",
  then the metadata MUST specify the field statically or list it in
  ``dynamic`` (build back-ends MUST raise an error otherwise, i.e. it
  should not be possible for a required field to not be listed somehow
  in the ``[project]`` table).
- If the `core metadata`_ specification lists a field as "Optional",
  the metadata MAY list it in ``dynamic`` if the expectation is a
  build back-end will provide the data for the field later.
- Build back-ends MUST raise an error if the metadata specifies a
  field statically as well as being listed in ``dynamic``.
- If the metadata does not list a field in ``dynamic``, then a build
  back-end CANNOT fill in the requisite metadata on behalf of the user
  (i.e. ``dynamic`` is the only way to allow a tool to fill in
  metadata and the user must opt into the filling in).
- Build back-ends MUST raise an error if the metadata specifies a
  field in ``dynamic`` but is still unspecified in the final artifact
  (i.e. the build back-end was unable to provide the data for a field
  listed in ``dynamic``).
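
The input-side rules in the list above can be sketched as a single validation pass. The function name and messages are hypothetical, and the ``[project]`` table is assumed to be a parsed ``dict``. The final rule, about a ``dynamic`` field still being unset in the built artifact, can only be checked after the build and is not covered.

```python
# Fields defined by this PEP whose core metadata counterpart is "Required".
REQUIRED_FIELDS = ("name", "version")


def validate_dynamic(project):
    """Raise ValueError if ``dynamic`` is used inconsistently."""
    dynamic = project.get("dynamic", [])
    if "name" in dynamic:
        raise ValueError("'name' must not be listed in 'dynamic'")
    for field in dynamic:
        # A field cannot be both static and dynamic.
        if field in project:
            raise ValueError(
                f"{field!r} is specified statically and also listed in 'dynamic'"
            )
    for field in REQUIRED_FIELDS:
        # Required fields must appear somewhere in the table.
        if field not in project and field not in dynamic:
            raise ValueError(f"{field!r} must be specified or listed in 'dynamic'")
```

For example, ``{"name": "spam", "dynamic": ["version"]}`` passes, which is the case a tool deriving the version from version control would rely on.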


Example

::

  [project]
  name = "spam"
  version = "2020.0.0"
  description = "Lovely Spam! Wonderful Spam!"
  readme = "README.rst"
  requires-python = ">=3.8"
  license = {file = "LICENSE.txt"}
  keywords = ["egg", "bacon", "sausage", "tomatoes", "Lobster Thermidor"]
  authors = [
    {email = ""},
    {name = "Tzu-Ping Chung"}
  ]
  maintainers = [
    {name = "Brett Cannon", email = ""}
  ]
  classifiers = [
    "Development Status :: 4 - Beta",
    "Programming Language :: Python"
  ]

  dependencies = [
    "django>2.1; os_name != 'nt'",
    "django>2.0; os_name == 'nt'"
  ]

  [project.optional-dependencies]
  test = [
    "pytest < 5.0.0",
  ]

  [project.urls]
  homepage = ""
  documentation = ""
  repository = ""
  changelog = ""

  [project.scripts]
  spam-cli = "spam:main_cli"

  [project.gui-scripts]
  spam-gui = "spam:main_gui"

  [project.entry-points."spam.magical"]
  tomatoes = "spam:main_tomatoes"

Backwards Compatibility

As this provides a new way to specify a project's `core metadata`_ and
is using a new table name which falls under the reserved namespace as
outlined in :pep:`518`, there are no backwards-compatibility concerns.

Security Implications

There are no direct security concerns as this PEP covers how to
statically define project metadata. Any security issues would stem
from how tools consume the metadata and choose to act upon it.

How to Teach This

[How to teach users, new and experienced, how to apply the PEP to their work.]

Reference Implementation

There are currently no proofs-of-concept from any build tools
implementing this PEP.

Rejected Ideas

Other table names

Anything under ``[build-system]``
There was worry that using this table name would exacerbate confusion
between build metadata and project metadata, e.g. by using
``[build-system.metadata]`` as a table.

Garnered no strong support.

The strongest contender after ``[project]``, but in the end it was
agreed that ``[project]`` read better for certain sub-tables.
Support for a metadata provider
Initially there was a proposal to add a middle layer between the
static metadata specified by this PEP and
``prepare_metadata_for_build_wheel()`` as specified by :pep:`517`. The
idea was that if a project wanted to insert itself between a build
back-end and the metadata there would be a hook to do so.

In the end the authors considered this idea unnecessarily complicated
and would move the PEP away from its design goal to push people to
define core metadata statically as much as possible.

Require a normalized project name
While it would make things easier for tools to only work with the
normalized name as specified in :pep:`503`, the idea was ultimately
rejected as it would hurt projects transitioning to using this PEP.

Specify files to include when building
The authors decided fairly quickly during design discussions that
this PEP should focus exclusively on project metadata and not build
metadata. As such, specifying what files should end up in a source
distribution or wheel file is out of scope for this PEP.

Name the ``[project.urls]`` table ``[project.project-urls]``
This suggestion came thanks to the corresponding `core metadata`_
being ``Project-Url``. But once the overall table name of ``[project]``
was chosen, the redundant use of the word "project" suggested the
current, shorter name was a better fit.

Have a separate ``url``/``home-page`` field
While the `core metadata`_ supports it, having a single field for a
project's URL while also supporting a full table seemed redundant.
Recommend that tools put development-related dependencies into a "dev" extra
As various tools have grown the concept of required dependencies
versus development dependencies, the idea of suggesting to tools that
they put such development tool into a "dev" grouping came up. In the
end, though, the authors deemed it out-of-scope for this specification
to suggest such a workflow.

Have the ``dynamic`` field only require specifying missing required fields
The authors considered the idea that the ``dynamic`` field would only
require the listing of missing required fields and make listing
optional fields optional. In the end, though, this went against the
design goal of promoting specifying as much information statically as

Different structures for the ``readme`` field
The ``readme`` field had a proposed ``readme_content_type`` field, but
the authors considered the string/table hybrid more practical for the
common case while still accommodating the more complex case. Same goes
for using ``long_description`` and a corresponding
``long_description_content_type`` field.

The ``file`` key in the table format was originally proposed as
``path``, but ``file`` corresponds to setuptools' ``file`` key and
there is no strong reason otherwise to choose one over the other.

Allowing the ``readme`` field to imply ``text/plain``
The authors considered allowing for unspecified content-types which
would default to ``text/plain``, but decided that it would be best to
be explicit in this case to prevent accidental incorrect renderings on
PyPI and to force users to be clear in their intent.

Other names for ``dependencies``/``optional-dependencies``
The authors originally proposed ``requires``/``extra-requires`` as
names, but decided to go with the current names after a survey of
other packaging ecosystems showed Python was an outlier:

1. `npm <>`__
2. `Rust <>`__
3. `Dart <>`__
4. `Swift <>`__
5. `Ruby <>`__

Normalizing on the current names helps minimize confusion for people coming from
other ecosystems without using terminology that is necessarily foreign to new
programmers. It also prevents potential confusion with ``requires`` in the
``[build-system]`` table as specified in :pep:`518`.

Support ``Maintainers``/``Maintainers-email``
When discussing how to support ``Authors``/``Authors-email``, the question was
brought up as to how exactly authors differed from maintainers. As this was
never clearly defined and no one could come up with a good definition, the
decision was made to drop the concept of maintainers.

Drop ``maintainers`` to unify with ``authors``
As the difference between ``Authors`` and ``Maintainers`` fields in
the `core metadata`_ is unspecified and ambiguous, this PEP originally
proposed unifying them as a single ``authors`` field. Other ecosystems
have selected "author" as the term to use, so the thinking was to
standardize on ``Author`` in the core metadata as the place to list
people maintaining a project.

In the end, though, the decision to adhere to the core metadata was
deemed more important to help with the acceptance of this PEP,
rather than trying to introduce a new interpretation for some of the
core metadata.

Support an arbitrary depth of tables for ``project.entry-points``
There was a worry that keeping ``project.entry-points`` to a depth of 1 for sub-tables
would cause confusion to users if they use a dotted name and are not used to table
names using quotation marks (e.g. ``project.entry-points."spam.magical"``). But
supporting an arbitrary depth -- e.g. ``project.entry-points.spam.magical`` -- would
preclude any form of an exploded table format in the future. It would also complicate
things for build back-ends as they would have to make sure to traverse the full
table structure rather than a single level and raising errors as appropriate on
value types.

Backfilling trove classifiers SHOULD occur instead of MAY happen
Originally this PEP said that tools SHOULD backfill appropriate trove classifiers.
This was changed to say it MAY occur to emphasize it was entirely optional for
build back-ends to implement.

Using structured TOML dictionaries to specify dependencies
The format for specifying the dependencies of a project was the most
hotly contested topic in terms of data format. It led to the creation
of both :pep:`631` and :pep:`633` which represent what is in this PEP
and using TOML dictionaries more extensively, respectively. The
decision on those PEPs can be found at

The authors briefly considered supporting both formats, but decided
that it would lead to confusion as people would need to be familiar
with two formats instead of just one.

Open Issues
None at the moment.


This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.

.. _PyPI:
.. _core metadata:
.. _flit:
.. _poetry:
.. _setuptools:
.. _setuptools metadata:
.. _survey of tools:
.. _trove classifiers:
.. _SPDX:
.. _RFC #822:

I quite like the updates made since the last iteration, especially the ones making the guarantees around the [project] table stronger, allowing for reliable usage of the information there.

I reckon this is ready to move forward toward acceptance and all the good stuff, assuming no one is opposed to those changes. :slight_smile:


The latest version of the PEP feels like it has morphed into a backdoor "standardizing sdist metadata in pyproject.toml", which I think is a bad idea. It is not mentioned outright, but the way I’m reading it, it seems to require that build backends modify the pyproject.toml file at build time before their inclusion in the source distribution. For example:

Wasn’t version the motivating field for the dynamic field, so that setuptools-scm users could specify "pull it from git"?

Why are build-backends “providing” anything in an sdist of this sort, if not by modifying pyproject.toml? Surely they include the pyproject.toml in the source distribution since it is necessary to do a modern build and any backend other than setuptools will be required to include it in order for the build to work at all. In my opinion, the proper role of the backend is validating the non-dynamic metadata and then translating it into the core metadata for PKG-INFO or METADATA or whatever.

This again seems like the wrong thing, because it would require build backends to make at least some dynamic metadata static as part of the sdist creation process.

I think that this bullet point (and all the other text that seems to indicate that we’re looking to create a standardized store of metadata) should be removed and replaced with something that indicates that any data specified statically in the pyproject.toml is canonical and can be relied upon, and leave anything about “known at the time of source distribution creation” to a later PEP standardizing source distributions.

It’s still useful for people to know anything specified in PEP 621 fields must be canonical even in a world where source distributions have standardized metadata — it would allow things that operate directly on repos and not on source distributions to work confidently in the “easy path” situation where metadata is specified statically.


A separate reply because it’s at least partially unrelated to my other criticism:

Assuming we indeed want to specify that some fields must be static in the source distribution (whether through the idea I criticized in my earlier post where backends modify pyproject.toml or by requiring them to be static at input), we should list those fields. I don’t know which fields PyPI exposes offhand, and if it ever changes in the future that would create uncertainty about whether the rule applies to a potentially evolving set of fields, or just the ones that were exposed in PyPI at the time that the PEP was passed.

I don’t think it’s a “backdoor” thing as the PEP now says it outright. Plus @pf_moore specifically likes/wants it. :grin:

Potentially, yes. If that’s not clear I’m happy to try and clarify the wording if you have suggestions.

But we don’t have that in sdists right now either. This is one of those things where I think getting consensus may be hard since everyone is going to have a personal preference with no outright winner, especially as there isn’t clear precedent for metadata in sdists.

Why is that wrong specifically? If you look at the fields listed for being static you will see they are either already static in order to make the sdist file name or are static because they are necessary to communicate up to PyPI for uploading:

  1. name is necessary for the file name
  2. version is necessary for the file name
  3. description goes to PyPI
  4. readme goes to PyPI
  5. requires-python goes to PyPI
  6. license being dynamic seems rather odd for an sdist since you wouldn’t know what the license is otherwise
  7. authors and maintainers go to PyPI
  8. keywords goes to PyPI
  9. classifiers goes to PyPI
  10. urls goes to PyPI

Those are the only fields with guidance on what to place in sdists and tools already have this information anyway when making an sdist for one of two reasons as mentioned above.

So is the concern more about having sdist build tools write or update pyproject.toml, or am I missing something more subtle about having these specific fields be static in an sdist?

If other people object we can entertain doing that, but I’m still going to push for the same thing and we have not exactly made headway on the sdist conversations, so I’m not sure what pushing that off does. To me, an sdist is a post-source artifact, not a pre-wheel artifact, and that suggests to me keeping the metadata in the more human-readable format while also letting wheel-building tools potentially do a little less by having more data provided upfront that tools already calculated once before.

Just the ones that are in the PEP. Uploading any more details would mean new core metadata fields anyway which is a whole standards process on its own.

Note in particular that even if we do ultimately standardise a different file (say METADATA) as the sdist metadata file, pyproject.toml and that file will still have to contain the same data, as there’s no way it makes sense to have two files with different data. So what’s the harm in fixing the data now rather than later? As @brettcannon says, it’s all data that has to be known by the time we write a sdist anyway, so we’re not blocking any sort of actual flexibility, we’re just recording data that is otherwise currently getting lost in a non-standardised location.

And to go back to the point about this being a “backdoor”, I certainly don’t intend it to be unclear that we expect pyproject.toml in sdists to contain all of the data that is fixed, not just whatever the user specified in the original source tree. On the contrary, with its previous wording, which basically said that the metadata shouldn’t be considered canonical, I consider that the PEP didn’t offer sufficient benefits to be worth standardising, so I would have rejected it in that form anyway.

I don’t strongly object to the changes, but agree with Paul that this was extremely unexpected.

My concerns:

  1. We should try to immediately get TOML support in the standard library. Until then, we should recommend a library so installs don’t pull in more than one of toml, tomlkit, etc.

  2. We should be much more explicit regarding the new sdist requirements. For example, backends would have to remove the dynamic field after each entry is made static, right?

  3. That is contrary to what most people say here: Purpose of an sdist

    Question: does pip treat an sdist as something that is installed directly or does it build a wheel from it first? It has been a while since I looked at the code.

All the more reason to not go from a PEP where we have consensus (how to specify metadata for build backends) to a PEP where we are very far from consensus (how to standardize sdists).

Well, for one thing, I really don’t like that we’re requiring backends to not only read but also emit TOML files now. Must we preserve comments and whitespace? Is there a facility for doing such a thing?

We’re also conflating inputs (pyproject.toml) with outputs (METADATA), and creating two different ways to specify the static metadata.

And this is and must be opt-in, so it’s not even a standardized place to look for metadata!

If we want standardized sdists, we should work on the PEP for standardized sdists. Many of our discussions to this point would have gone dramatically differently, in my opinion, if we had been designing a standardized place for build backends to store metadata.

Fine, we should reject the PEP then. It’s a bad design for a standardized mechanism for storing static metadata, and if the benefits as an input system aren’t good enough for it to be worth standardizing, we should just reject it.



To reply to your points:

  1. That’s a separate issue. TOML has been used in packaging since PEP 518, so there’s nothing new here.
  2. The rules don’t change. Removing a field from dynamic is how you make it static. I don’t understand what you think is implicit here?
  3. I don’t really understand “post-source” vs “pre-wheel”. As far as I am concerned, sdists are built artifacts (created by backends from source trees) that are used to build wheels. The standards offer no way to install a sdist directly, so the only standards-compliant way of installing from a sdist is via a wheel. Pip does have legacy code that uses setuptools-specific mechanisms to install sdists directly, but (a) that’s backend-specific for setuptools, and (b) we are phasing that out.
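The rule in point 2 can be illustrated with a minimal sketch. It assumes the `[project]` table has already been parsed into a dict; `make_static` and its arguments are hypothetical names used only for illustration, not anything the PEP defines.

```python
def make_static(project, resolved):
    """Return a copy of a parsed [project] table with resolved fields made static.

    `project` is the table from the source tree; `resolved` maps field names
    to values the backend computed while building the sdist.
    """
    result = dict(project)
    still_dynamic = []
    for field in project.get("dynamic", []):
        if field in project:
            # A field cannot be both given a value and listed as dynamic.
            raise ValueError(f"{field!r} is both specified and dynamic")
        if field in resolved:
            result[field] = resolved[field]  # now static: value is recorded
        else:
            still_dynamic.append(field)      # the backend left it dynamic
    if still_dynamic:
        result["dynamic"] = still_dynamic
    else:
        result.pop("dynamic", None)
    return result


source_table = {"name": "spam", "dynamic": ["version", "classifiers"]}
sdist_table = make_static(source_table, {"version": "1.0.0"})
```

Here `version` moves out of `dynamic` because it now has a value, while `classifiers` stays listed because nothing was resolved for it.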

That’s a fair comment - this is a new process. But I’m not sure it’s as big an issue as you’re suggesting. I’m happy to debate the implications, but I don’t see this as an immediate showstopper.

If the consensus is that the changes @brettcannon has made are unacceptable, and we go back to the previous version, then yes, the PEP will get rejected. But I don’t think we have consensus yet.

I agree with @pganssle that it feels wrong to require the backend to modify pyproject.toml. I’m not against the backend needing to fill out metadata (I think it is a good idea for the reasons mentioned above, e.g. avoid inconsistencies between METADATA and pyproject.toml), but TOML is not a particularly good format to modify in-place for later user consumption, at least with tools currently available in Python. Existing libraries tend to discard user formatting (the best available is TOMLKit, but it has many other problems). This would be problematic since pyproject.toml is ultimately a user-facing file, and users would be unhappy if they crack open an sdist for development and find pyproject.toml garbled.

Would it be better if, instead of back-filling the information in-place, we make backends fill it into a separate table? This can be easier to do without needing to rewrite hand-written data (e.g. serialise the table separately and append the string at the end of the file).
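A rough sketch of the "serialise separately and append" idea follows. The table name `resolved-project` is made up for the example (no name was proposed in the thread), and the serialiser is deliberately tiny: it assumes bare keys and handles only strings and lists of strings, so it is not a complete TOML writer.

```python
import json


def render_table(table_name, values):
    """Render a flat table of strings / lists of strings as TOML text."""
    lines = [f"[{table_name}]"]
    for key, value in values.items():
        if isinstance(value, str):
            # JSON string escaping is close enough to TOML basic strings
            # for the simple values used in this sketch.
            rendered = json.dumps(value, ensure_ascii=False)
        elif isinstance(value, list) and all(isinstance(v, str) for v in value):
            items = (json.dumps(v, ensure_ascii=False) for v in value)
            rendered = "[" + ", ".join(items) + "]"
        else:
            raise TypeError(f"unsupported value for {key!r}")
        lines.append(f"{key} = {rendered}")
    return "\n".join(lines) + "\n"


def append_table(original_text, table_name, values):
    """Append a generated table, leaving the hand-written text untouched."""
    separator = "" if original_text.endswith("\n") else "\n"
    return original_text + separator + "\n" + render_table(table_name, values)


original = (
    "# hand-written comment\n"
    "[project]\n"
    'name = "spam"   # odd spacing kept as-is\n'
    'dynamic = ["version"]\n'
)
updated = append_table(original, "resolved-project", {"version": "1.0.0"})
```

Because the original text is only ever used as a prefix, comments and whitespace survive byte for byte; the cost is that consumers would have to look in two tables.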

If we were to split out the "backends must update pyproject.toml" question for a moment and just focus on the question of what the proposal states must be canonical and fixed in a sdist, then is there any problem with the list? Because we’re going to get blocked on the “sdist standardisation” debate again if we’re not careful here, and I’d like to avoid that. On the other hand, one of my biggest complaints with the original PEP 621 is that it specified so little as required that it was essentially of no use to general consumers.

Another option would be to make all of the fields (with the exception of version) that round 3 states are required post-sdist not allowed to be dynamic in all situations. Version is a special case, and sdist consumers get that from the filename anyway, so I’m OK with treating that differently.

Attempting to standardise sdists should trigger that debate :slight_smile:

If this proposal needs sdists to contain certain metadata beyond what is provided by the original source repository, then it’s trying to standardise sdists.


This is exactly what has happened with the expansion of the PEP into a mechanism for standardizing the metadata in an sdist. We in fact had basically this same discussion here.

I think this idea is too much in that it changes the nature of pyproject.toml in an undesirable way and it is not enough, in that it’s not going to be a very effective way of speeding anything up or improving the way tools can scrape metadata from the ecosystem. Even if we pass this as is, today, I don’t think a significant fraction of the ecosystem will have adopted it within 2 years. I imagine adoption will be particularly slow among big and fundamental projects that everyone depends on — which are likely to be more conservative with their build systems for various reasons. If we were to go with a “standardize the status quo” approach, we could upgrade huge swathes of the ecosystem overnight by implementing standardized sdists in setuptools.

Also, to circle back to the rejection reason:

To be clear, my suggestion was that PEP 621 data is canonical, when specified. The difference would be that pyproject.toml would remain purely an input file, and tools that need to access dynamic fields from pyproject.toml would still need to resolve them by a call to prepare_metadata_for_build_wheel.
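From a consumer's point of view, that suggestion might look something like the sketch below. It works on an already-parsed `[project]` table; `NeedsBackend` and `read_field` are hypothetical names, and the sketch does not itself invoke any PEP 517 hook.

```python
class NeedsBackend(Exception):
    """Raised when a field can only be obtained by asking the build backend."""


def read_field(project, field):
    """Return a field from a parsed [project] table if it is static.

    A value that is present is treated as canonical. A field listed in
    `dynamic` cannot be read from the file, so the caller would have to
    fall back to the backend's prepare_metadata_for_build_wheel hook.
    """
    if field in project:
        return project[field]
    if field in project.get("dynamic", []):
        raise NeedsBackend(field)
    return None  # neither specified nor dynamic: the project does not set it


table = {"name": "spam", "dependencies": ["requests>=2"], "dynamic": ["version"]}
name = read_field(table, "name")
```

With this table, `name` and `dependencies` can be trusted without running any build code, while asking for `version` raises `NeedsBackend`.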

Is this something other backend authors agree on? I think the main benefit of something like this is that you can have a unified tutorial for how to specify a lot of what goes into your package, but I suppose it’s fair to say that since this doesn’t specify enough to actually build a package (doesn’t specify the contents), then it wouldn’t generate terribly useful tutorials (other than possibly a nice “how to specify dependencies” tutorial, which could have a decent amount of meat in it). The other benefit is that it allows people to write a standards-compliant PEP 621 → METADATA library for re-use by backend authors (though of course someone can just write one anyway and then “whatever that library does” becomes the standard).

If we’re pivoting, I say we pivot to an informational PEP intended to give backend authors a ready-made description of how to design a way to specify metadata in pyproject.toml.

I’m not sure what you’re referring to here? That the previous version of the PEP didn’t offer sufficient benefits¹? When I was talking about benefits I was referring to consumers other than backends. Obviously such consumers can’t assume PEP 621, but if it is there, I’d rather it contained as much information as possible, otherwise what’s the point? If you’re focusing on backend authors, then that’s a different perspective - what are you seeing as benefits? @sdispater seemed fairly clear that (now we have PEP 508 strings for dependencies) Poetry won’t be adopting PEP 621 in the near term, so I doubt Poetry is relevant. Flit already has a syntax quite similar to PEP 621, and nowhere except pyproject.toml to get data from, so the only real change there is the section name. So who apart from setuptools are you looking at for agreement?

Maybe I’m making the wrong assumptions about how setuptools would adopt this. I’m presuming that it will be an optional alternative to setup.cfg/ for a long time yet, simply for compatibility reasons. So there’s not going to be much to push users to switch. So I expect a significant majority of user code won’t have PEP 621 metadata for a long time yet. If setuptools just copies the existing files over to the sdist, then that’s true for sdists too. If setuptools writes PEP 621 data into sdists, we get a significant, and much faster, migration of sdists to have reliable PEP 621 metadata right from the start. To me, that’s a massive benefit for people writing tools to introspect sdists.

What am I missing?

An informational PEP would require each backend to use its own [] namespace, and would just define the format. That has even less benefit for non-backend consumers. I’m inclined to simply say let @brettcannon post the proposal somewhere, then setuptools and flit can adopt it and no need for any sort of PEP. We can save the PEP process for when/if we want to use the reserved namespace and make it a formal standard rather than an informational one.

¹ Edit: On re-reading, maybe you’re asking whether the new wording in PEP 621 is something other backend authors agree on. In which case the answer is we’re still discussing it, so I don’t know yet. But in that case I’m certainly not just looking for opinions from backend authors, I’m asking all interested parties, including people wanting to introspect sdists.


Is there a downside to accepting the PEP as it was previously and drafting a new PEP for sdist 2.0? We need to agree on the name too anyway: PEP 625: File name of a Source Distribution

As someone closely following all of these threads for my Hatch rewrite, it’s super surprising that this PEP went from what appeared to be consensus to blocked.


I believe this answers my question — if the other backend authors who participated in the process don’t find PEP 621 useful, then it should be dead in the water.

Yes, that is the benefit I am talking about — if we standardize where metadata goes in an sdist, setuptools can do it quietly, which is why I was excited about standardizing sdist metadata. We should definitely have a conversation about the best way to standardize sdist metadata and come up with an approach that will work well so that we can realize this benefit.

I think I’m relatively convinced that we should withdraw PEP 621. IMO it will still be useful because it’s a pretty good design for a static metadata spec, and people can adopt it whether or not it is accepted.

I find it very useful at least :slightly_frowning_face: . I’m sure setuptools and flit would adopt it shortly too.

As-is or under [tool.<NAME>]?

It’s the backend authors who have to adapt, but other consumers (particularly PyPI) who get the benefit.

As a (part-time) backend author, I’d prefer the sdist standard to be based on the wheel metadata rather than the user-written metadata. That way I don’t need to deal with two different output formats (but I also deliberately designed my backend to treat sdist as just a partially in-place compiled source directory, including rewriting the pyproject.toml completely, so there’s not a lot that happens in the sdist->wheel step).

But provided I’m using a library to read/validate/write the pyproject.toml file, it’s no big deal. I’d rather not have to encode all of the transform logic between PEP 621 and METADATA though.


What is it you find useful about it, though? I was always dubious about the prospect that this is solving a major problem. Because this doesn’t specify how a package is built, it’s not like this makes your pyproject.toml file interoperable between different backends.

If you just adopt PEP 621 under tool.hatch or tool.hatch:project (to answer your question) you get basically all the benefits. We can even write a library that parses these things directly into some sort of intermediate object that is capable of writing METADATA files (you tell it what the root table is and it just works).
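As a very rough sketch of what such a library could look like: the function names are hypothetical, the input is an already-parsed pyproject.toml dict, and only a handful of fields are mapped (name, version, description, requires-python, dependencies), all assumed to be static. The nested `tool` / `hatch` / `project` layout is just one way to read the suggestion above.

```python
def find_table(pyproject, root):
    """Walk a parsed pyproject.toml dict down to the metadata table."""
    table = pyproject
    for key in root:
        table = table[key]
    return table


def to_metadata(pyproject, root=("project",)):
    """Render a small subset of PEP 621 fields as core metadata text."""
    table = find_table(pyproject, root)
    lines = [
        "Metadata-Version: 2.1",
        f"Name: {table['name']}",
        f"Version: {table['version']}",  # assumes version is not dynamic
    ]
    if "description" in table:
        lines.append(f"Summary: {table['description']}")
    if "requires-python" in table:
        lines.append(f"Requires-Python: {table['requires-python']}")
    for requirement in table.get("dependencies", []):
        lines.append(f"Requires-Dist: {requirement}")
    return "\n".join(lines) + "\n"


pyproject = {
    "tool": {
        "hatch": {
            "project": {
                "name": "spam",
                "version": "1.0.0",
                "dependencies": ["requests>=2"],
            }
        }
    }
}
metadata = to_metadata(pyproject, root=("tool", "hatch", "project"))
```

The only thing that changes between a tool-specific table and a standard `[project]` table is the `root` argument, which is the point being made here.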

If we assume that, because it only covers metadata, the benefits for documentation, project templating and backend switching are marginal (fair), it seems that the only thing backends would be getting out of this would be the ability to put their metadata in the [project] table rather than a tool-specific table, which is not such a big deal.

There is one place where I could see the idea of standardizing metadata being useful, though, which is for tools that seek to scrape metadata directly from repositories rather than source distributions. E.g. dependabot or or whatever. Even in a world where source distributions are standardized, that would allow tools like that to avoid unnecessary sdist builds when the project they are analyzing uses PEP 621. Without PEP 621, such tools would either need to always execute builds or just special-case tool.setuptools (and maybe add a parser for poetry and maybe flit), and not be compatible with more marginal backends.

I don’t know that we ever got any input from someone building tools like this, and I don’t know how much of an important use case they are.

I mostly like:

  1. the familiarity of fields granted by the future network effect of widespread use, allowing for a mostly interoperable pyproject.toml file. similar to [requests|httpx].[get|post|...]
  2. quite simply, I like using standards. it reduces uncertainty and avoids having to re-invent the wheel, whether that be naming, implementation, etc.
  3. as you mention, a default place to look for dependency scanners