PEP 621: Storing project metadata in pyproject.toml

Some of us have been working behind the scenes to try and come up with a standard on how to specify project metadata. For me, more static metadata is good (although there is an escape hatch for those that want to dynamically calculate something like a version number), and having to learn how to specify the same semantic thing for each build tool is unnecessarily redundant.

Do note that there is an open issue on how to specify dependencies as we couldn’t reach consensus on which format to go with. Otherwise we managed to reach agreement on everything else that you would want to typically specify for a wheel.

The rendered version of the PEP can be found at https://www.python.org/dev/peps/pep-0621/.


PEP: 621
Title: Storing project metadata in pyproject.toml
Author: Brett Cannon <brett@python.org>,
        Dustin Ingram <di@python.org>,
        Paul Ganssle <paul at ganssle.io>,
        Paul Moore <p.f.moore@gmail.com>,
        Pradyun Gedam <pradyunsg@gmail.com>,
        Sébastien Eustace <sebastien@eustace.io>,
        Thomas Kluyver <thomas@kluyver.me.uk>,
        Tzu-Ping Chung <uranusjr@gmail.com>
Discussions-To: https://discuss.python.org/t/pep-621-storing-project-metadata-in-pyproject-toml/4513
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 22-Jun-2020
Post-History: 22-Jun-2020


Abstract
========

This PEP specifies how to write a project's `core metadata`_ in a
``pyproject.toml`` file for packaging-related tools to consume.


Motivation
==========

The key motivators of this PEP are:

- Encourage users to specify core metadata statically for speed,
  ease of specification, deterministic consumption by build back-ends,
  and ease of analysis of source checkouts
- Provide a tool-agnostic way of specifying the metadata for ease of
  learning and transitioning between build back-ends
- Allow for more code sharing between build back-ends for the
  "boring parts" of a project's metadata

This PEP does **not** attempt to standardize all possible metadata
required by a build back-end, only the metadata covered by the
`core metadata`_ specification which are very common across projects
and would stand to benefit from being static and consistently
specified. This means build back-ends are still free and able to
innovate around patterns like how to specify the files to include in a
wheel. There is also an included escape hatch for users and build
back-ends to use when they choose to partially opt-out of this PEP
(compared to opting-out of this PEP entirely, which is also possible).

This PEP is also not trying to change the underlying `core metadata`_
in any way. Such considerations should be done in a separate PEP which
may lead to changes or additions to what this PEP specifies.

Finally, this PEP is meant for users to specify metadata for build
back-ends or those doing analysis on a source checkout. Once a build
back-end has produced an artifact, then the metadata
contained in the artifact that the build back-end produced should be
considered canonical and overriding what this PEP specifies. In the
eyes of this PEP, a source distribution is considered a build
artifact, thus people should not read the metadata specified in this
PEP as the canonical metadata in a source distribution.


Rationale
=========

The design guidelines the authors of this PEP followed were:

- Define as much of the `core metadata`_ as reasonable
- Define the metadata statically with an escape hatch for those who
  want to define it dynamically
- Use familiar names where it makes sense, but be willing to use more
  modern terminology
- Try to be ergonomic within a TOML file instead of mirroring how
  tools specify metadata at a low-level
- Learn from other build back-ends in the packaging ecosystem which
  have used TOML for their metadata
- Don't try to standardize things which lack a pre-existing standard
  at a lower-level
- *When* metadata is specified using this PEP, it is considered
  canonical, but any and all metadata can be considered
  *optional* (`core metadata`_ has its own requirements of what
  data must be provided *somehow*)


Specification
=============

When specifying project metadata, tools MUST adhere to and honour the
metadata as specified in this PEP. If metadata is improperly specified
then tools MUST raise an error to notify the user about their mistake.

Details
-------

Table name
''''''''''

Tools MUST specify fields defined by this PEP in a table named
``[project]``. No tools may add fields to this table which are not
defined by this PEP. For tools wishing to store their own settings in
``pyproject.toml``, they may use the ``[tool]`` table as defined in
:pep:`518`. The lack of a ``[project]`` table implicitly means the
build back-end will dynamically provide all fields.

``name``
''''''''
- Format: string
- `Core metadata`_: ``Name``
  (`link <https://packaging.python.org/specifications/core-metadata/#name>`__)
- Synonyms

  - Flit_: ``module``/``dist-name``
    (`link <https://flit.readthedocs.io/en/latest/pyproject_toml.html#metadata-section>`__)
  - Poetry_: ``name``
    (`link <https://python-poetry.org/docs/pyproject/#name>`__)
  - Setuptools_: ``name``
    (`link <https://setuptools.readthedocs.io/en/latest/setuptools.html#metadata>`__)

The name of the project.

Tools MUST require users to statically define this field.

Tools SHOULD normalize this name, as specified by :pep:`503`, as soon
as it is read for internal consistency.
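
As a minimal sketch of that normalization, following the rule given in
:pep:`503` (the function name ``normalize_name`` is illustrative and not
defined by this PEP):

```python
import re


def normalize_name(name: str) -> str:
    """Normalize a project name as described in PEP 503."""
    # Runs of '-', '_' and '.' collapse to a single '-', then lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()
```

A tool could apply this once when reading ``name`` and use the result for
all internal comparisons, while still keeping the name as written for
display.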

``version``
'''''''''''
- Format: string
- `Core metadata`_: ``Version``
  (`link <https://packaging.python.org/specifications/core-metadata/#version>`__)
- Synonyms

  - Flit_: N/A (read from a ``__version__`` attribute)
    (`link <https://flit.readthedocs.io/en/latest/index.html#usage>`__)
  - Poetry_: ``version``
    (`link <https://python-poetry.org/docs/pyproject/#version>`__)
  - Setuptools_: ``version``
    (`link <https://setuptools.readthedocs.io/en/latest/setuptools.html#metadata>`__)

The version of the project as supported by :pep:`440`.

Users SHOULD prefer to specify already-normalized versions.

``description``
'''''''''''''''
- Format: string
- `Core metadata`_: ``Summary``
  (`link <https://packaging.python.org/specifications/core-metadata/#summary>`__)
- Synonyms

  - Flit_: N/A
  - Poetry_: ``description``
    (`link <https://python-poetry.org/docs/pyproject/#description>`__)
  - Setuptools_: ``description``
    (`link <https://setuptools.readthedocs.io/en/latest/setuptools.html#metadata>`__)

The summary description of the project.

``readme``
''''''''''
- Format: String or table
- `Core metadata`_: ``Description``
  (`link <https://packaging.python.org/specifications/core-metadata/#description>`__)
- Synonyms

  - Flit_: ``description-file``
    (`link <https://flit.readthedocs.io/en/latest/pyproject_toml.html#metadata-section>`__)
  - Poetry_: ``readme``
    (`link <https://python-poetry.org/docs/pyproject/#readme>`__)
  - Setuptools_: ``long_description``
    (`link <https://setuptools.readthedocs.io/en/latest/setuptools.html#metadata>`__)

The full description of the project (i.e. the README).

The field accepts either a string or a table. If it is a string then
it is the relative path to a text file containing the full
description. Tools MUST assume the file's encoding is UTF-8. If the
file path ends in a case-insensitive ``.md`` suffix, then tools MUST
assume the content-type is ``text/markdown``. If the file path ends in
a case-insensitive ``.rst``, then tools MUST assume the content-type
is ``text/x-rst``. If a tool recognizes more extensions than this PEP,
it MAY infer the content-type for the user without the user specifying
this field as ``dynamic``. For all unrecognized suffixes when a
content-type is not provided, tools MUST raise an error.
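
The suffix rule for the string form could be sketched as follows. This
covers only the two suffixes this PEP names; the helper name
``readme_content_type`` and the choice of ``ValueError`` are assumptions
made for illustration, since the PEP only says tools MUST raise an error.

```python
from pathlib import PurePosixPath

# The only suffixes this PEP requires tools to recognize.
_KNOWN_SUFFIXES = {".md": "text/markdown", ".rst": "text/x-rst"}


def readme_content_type(path: str) -> str:
    """Infer the content-type of a ``readme`` given as a file path."""
    # Suffix matching is case-insensitive, so compare in lowercase.
    suffix = PurePosixPath(path).suffix.lower()
    try:
        return _KNOWN_SUFFIXES[suffix]
    except KeyError:
        raise ValueError(
            f"cannot infer content-type for readme {path!r}"
        ) from None
```

A tool that recognizes further suffixes would extend the mapping rather
than change the error path.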

The ``readme`` field may also take a table. The ``file`` key has a
string value representing a relative path to a file containing the
full description. The ``text`` key has a string value which is the
full description. These keys are mutually-exclusive, thus tools MUST
raise an error if the metadata specifies both keys.

The table also has a ``content-type`` field which takes a string
specifying the content-type of the full description. A tool MUST raise
an error if the metadata does not specify this field in the table. If
the metadata does not specify the ``charset`` parameter, then it is
assumed to be UTF-8. Tools MAY support other encodings if they choose
to. Tools MAY support alternative content-types which they can
transform to a content-type as supported by the `core metadata`_.
Otherwise tools MUST raise an error for unsupported content-types.
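
A sketch of how a tool might check the table form, assuming the table
has already been parsed into a Python ``dict``. The function name
``check_readme_table`` and the use of ``ValueError`` are illustrative
assumptions; the standard library's ``email.message.Message`` is used
here only as a convenient way to split a content-type from its
``charset`` parameter.

```python
from email.message import Message


def check_readme_table(readme):
    """Return (content_type, charset) for a ``readme`` table, or raise."""
    if "file" in readme and "text" in readme:
        raise ValueError("readme: 'file' and 'text' are mutually exclusive")
    if "content-type" not in readme:
        raise ValueError("readme: 'content-type' is required in a table")
    msg = Message()
    msg["Content-Type"] = readme["content-type"]
    # A missing charset parameter is assumed to mean UTF-8.
    return msg.get_content_type(), msg.get_content_charset("utf-8")
```

The sketch does not decide which content-types are supported; a real
tool would follow this with its own check and raise an error for any it
cannot handle.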

``requires-python``
'''''''''''''''''''
- Format: string
- `Core metadata`_: ``Requires-Python``
  (`link <https://packaging.python.org/specifications/core-metadata/#requires-python>`__)
- Synonyms

  - Flit_: ``requires-python``
    (`link <https://flit.readthedocs.io/en/latest/pyproject_toml.html#metadata-section>`__)
  - Poetry_: As a ``python`` dependency in the
    ``[tool.poetry.dependencies]`` table
    (`link <https://python-poetry.org/docs/pyproject/#dependencies-and-dev-dependencies>`__)
  - Setuptools_: ``python_requires``
    (`link <https://setuptools.readthedocs.io/en/latest/setuptools.html#metadata>`__)

The Python version requirements of the project.

Build back-ends MAY try to backfill appropriate
``Programming Language :: Python`` `trove classifiers`_ based on what
the user specified for this field.

``license``
'''''''''''
- Format: Table
- `Core metadata`_: ``License``
  (`link <https://packaging.python.org/specifications/core-metadata/#license>`__)
- Synonyms

  - Flit_: ``license``
    (`link <https://flit.readthedocs.io/en/latest/pyproject_toml.html#metadata-section>`__)
  - Poetry_: ``license``
    (`link <https://python-poetry.org/docs/pyproject/#license>`__)
  - Setuptools_: ``license``, ``license_file``, ``license_files``
    (`link <https://setuptools.readthedocs.io/en/latest/setuptools.html#metadata>`__)

The table may have one of two keys. The ``file`` key has a string
value that is a relative file path to the file which contains the
license for the project. Tools MUST assume the file's encoding is
UTF-8. The ``text`` key has a string value which is the license of the
project.  These keys are mutually exclusive, so a tool MUST raise an
error if the metadata specifies both keys.

A practical string value for the ``license`` key has been purposefully
left out to allow for a future PEP to specify support for SPDX_
expressions. If such support comes to fruition and a tool can
unambiguously identify the license specified, then the tool MAY
fill in the appropriate trove classifier.

``authors``/``maintainers``
'''''''''''''''''''''''''''
- Format: Array of inline tables with string keys and values
- `Core metadata`_: ``Author``/``Author-email``/``Maintainer``/``Maintainer-email``
  (`link <https://packaging.python.org/specifications/core-metadata/#author>`__)
- Synonyms

  - Flit_: ``author``/``author-email``/``maintainer``/``maintainer-email``
    (`link <https://flit.readthedocs.io/en/latest/pyproject_toml.html#metadata-section>`__)
  - Poetry_: ``authors``/``maintainers``
    (`link <https://python-poetry.org/docs/pyproject/#authors>`__)
  - Setuptools_: ``author``/``author_email``/``maintainer``/``maintainer_email``
    (`link <https://setuptools.readthedocs.io/en/latest/setuptools.html#metadata>`__)

The people or organizations considered to be the "authors" of the
project. The exact meaning is open to interpretation — it may list the
original or primary authors, current maintainers, or owners of the
package.

The "maintainers" field is similar to "authors" in that its exact
meaning is open to interpretation.

These fields accept an array of tables with 2 keys: ``name`` and
``email``. Both values must be strings. The ``name`` value MUST be a
valid email name (i.e. whatever can be put as a name, before an email,
in `RFC #822`_) and not contain commas. The ``email`` value MUST be a
valid email address. Both keys are optional.

Using the data to fill in `core metadata`_ is as follows:

1. If only ``name`` is provided, the value goes in
   ``Author``/``Maintainer`` as appropriate.
2. If only ``email`` is provided, the value goes in
   ``Author-email``/``Maintainer-email`` as appropriate.
3. If both ``email`` and ``name`` are provided, the value goes in
   ``Author-email``/``Maintainer-email`` as appropriate, with the
   format ``{name} <{email}>``.
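
The three rules above can be sketched as a small function. It assumes
the array has been parsed into a list of ``dict`` objects and returns
one ``(header, value)`` pair per entry; how several entries are combined
into a single header value is left open here, as this PEP does not spell
it out. The name ``author_headers`` is illustrative only.

```python
def author_headers(people, kind="Author"):
    """Map ``authors``/``maintainers`` entries to core metadata headers.

    ``kind`` is "Author" or "Maintainer".
    """
    headers = []
    for person in people:
        name = person.get("name")
        email = person.get("email")
        if name and email:
            # Rule 3: both go into the -email header as "name <email>".
            headers.append((f"{kind}-email", f"{name} <{email}>"))
        elif email:
            # Rule 2: email only.
            headers.append((f"{kind}-email", email))
        elif name:
            # Rule 1: name only.
            headers.append((kind, name))
    return headers
```

For instance, ``{name = "Brett Cannon", email = "brett@python.org"}``
under ``maintainers`` yields a ``Maintainer-email`` value of
``Brett Cannon <brett@python.org>``.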


``keywords``
''''''''''''
- Format: array of strings
- `Core metadata`_: ``Keywords``
  (`link <https://packaging.python.org/specifications/core-metadata/#keywords>`__)
- Synonyms

  - Flit_: ``keywords``
    (`link <https://flit.readthedocs.io/en/latest/pyproject_toml.html#metadata-section>`__)
  - Poetry_: ``keywords``
    (`link <https://python-poetry.org/docs/pyproject/#keywords>`__)
  - Setuptools_: ``keywords``
    (`link <https://setuptools.readthedocs.io/en/latest/setuptools.html#metadata>`__)

The keywords for the project.

``classifiers``
'''''''''''''''
- Format: array of strings
- `Core metadata`_: ``Classifier``
  (`link <https://packaging.python.org/specifications/core-metadata/#classifier-multiple-use>`__)
- Synonyms

  - Flit_: ``classifiers``
    (`link <https://flit.readthedocs.io/en/latest/pyproject_toml.html#metadata-section>`__)
  - Poetry_: ``classifiers``
    (`link <https://python-poetry.org/docs/pyproject/#classifiers>`__)
  - Setuptools_: ``classifiers``
    (`link <https://setuptools.readthedocs.io/en/latest/setuptools.html#metadata>`__)

`Trove classifiers`_ which apply to the project.

Build back-ends MAY automatically fill in extra trove classifiers
if the back-end can deduce the classifiers from the provided metadata.

``urls``
''''''''
- Format: Table, with keys and values of strings
- `Core metadata`_: ``Project-URL``
  (`link <https://packaging.python.org/specifications/core-metadata/#project-url-multiple-use>`__)
- Synonyms

  - Flit_: ``[tool.flit.metadata.urls]`` table
    (`link <https://flit.readthedocs.io/en/latest/pyproject_toml.html#metadata-section>`__)
  - Poetry_: ``[tool.poetry.urls]`` table
    (`link <https://python-poetry.org/docs/pyproject/#urls>`__)
  - Setuptools_: ``project_urls``
    (`link <https://setuptools.readthedocs.io/en/latest/setuptools.html#metadata>`__)

A table of URLs where the key is the URL label and the value is the
URL itself.

Entry points
''''''''''''
- Format: Table (``[project.scripts]``, ``[project.gui-scripts]``, and
  ``[project.entry-points]``)
- `Core metadata`_: N/A;
  `Entry point specification <https://packaging.python.org/specifications/entry-points/>`_
- Synonyms

  - Flit_: ``[tool.flit.scripts]`` table for console scripts,
    ``[tool.flit.entrypoints]`` for the rest
    (`link <https://flit.readthedocs.io/en/latest/pyproject_toml.html#scripts-section>`__)
  - Poetry_: ``[tool.poetry.scripts]`` table for console scripts
    (`link <https://python-poetry.org/docs/pyproject/#scripts>`__)
  - Setuptools_: ``entry_points``
    (`link <https://setuptools.readthedocs.io/en/latest/setuptools.html#metadata>`__)

There are three tables related to entry points. The
``[project.scripts]`` table corresponds to the ``console_scripts`` group.
The key of the table is the name of the entry point and the value is
the object reference.

The ``[project.gui-scripts]`` table corresponds to the ``gui_scripts``
group. Its format is the same as ``[project.scripts]``.

The ``[project.entry-points]`` table is a collection of tables. Each
sub-table's name is an entry point group. The key and value semantics
are the same as ``[project.scripts]``. Users MUST NOT create
nested sub-tables but instead keep the entry point groups to only one
level deep.

Build back-ends MUST raise an error if the metadata defines a
``[project.entry-points.console_scripts]`` or
``[project.entry-points.gui_scripts]`` table, as they would
be ambiguous in the face of ``[project.scripts]`` and
``[project.gui-scripts]``, respectively.
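
As a rough sketch, a build back-end might merge the three tables into a
single mapping of entry point groups like this. It assumes the
``[project]`` table is already parsed into a ``dict``; the function name
``collect_entry_points`` and the use of ``ValueError`` are assumptions
for illustration.

```python
def collect_entry_points(project):
    """Return {group: {name: object reference}} for a ``[project]`` table."""
    extra = project.get("entry-points", {})
    for reserved in ("console_scripts", "gui_scripts"):
        if reserved in extra:
            # Ambiguous with [project.scripts] / [project.gui-scripts].
            raise ValueError(f"entry-points.{reserved} is not allowed")

    groups = {}
    for group, entries in extra.items():
        if not isinstance(entries, dict):
            raise ValueError(f"entry point group {group!r} must be a table")
        for name, ref in entries.items():
            # Groups are one level deep: every value is an object reference.
            if not isinstance(ref, str):
                raise ValueError(f"nested sub-table under {group!r}: {name!r}")
        groups[group] = dict(entries)

    if "scripts" in project:
        groups["console_scripts"] = dict(project["scripts"])
    if "gui-scripts" in project:
        groups["gui_scripts"] = dict(project["gui-scripts"])
    return groups
```

Note that an unquoted ``[project.entry-points.spam.magical]`` header
parses as a nested table, which this sketch rejects; the group name has
to be quoted, as in ``[project.entry-points."spam.magical"]``.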

``dependencies``/``optional-dependencies``
''''''''''''''''''''''''''''''''''''''''''
- Format: TBD
- `Core metadata`_: ``Requires-Dist``
  (`link <https://packaging.python.org/specifications/core-metadata/#requires-dist-multiple-use>`__)
- Synonyms

  - Flit_: ``requires`` for required dependencies, ``requires-extra``
    for optional dependencies
    (`link <https://flit.readthedocs.io/en/latest/pyproject_toml.html#metadata-section>`__)
  - Poetry_: ``[tool.poetry.dependencies]`` for dependencies (both
    required and for development),
    ``[tool.poetry.extras]`` for optional dependencies
    (`link <https://python-poetry.org/docs/pyproject/#dependencies-and-dev-dependencies>`__)
  - Setuptools_: ``install_requires`` for required dependencies,
    ``extras_require`` for optional dependencies
    (`link <https://setuptools.readthedocs.io/en/latest/setuptools.html#metadata>`__)

See the open issue on `How to specify dependencies?`_ for a
discussion of the options of how to specify a project's dependencies.

``dynamic``
'''''''''''
- Format: Array of strings
- `Core metadata`_: N/A
- No synonyms

Specifies which fields listed by this PEP were intentionally
unspecified so another tool can/will provide such metadata
dynamically. This clearly delineates which metadata is purposefully
unspecified and expected to stay unspecified compared to being
provided via tooling later on.

- A build back-end MUST honour statically-specified metadata (which
  means the metadata did not list the field in ``dynamic``).
- A build back-end MUST raise an error if the metadata specifies the
  ``name`` in ``dynamic``.
- If the `core metadata`_ specification lists a field as "Required",
  then the metadata MUST specify the field statically or list it in
  ``dynamic`` (build back-ends MUST raise an error otherwise, i.e. a
  required field is in no way listed in a ``pyproject.toml`` file).
- If the `core metadata`_ specification lists a field as "Optional",
  the metadata MAY list it in ``dynamic`` if the expectation is a
  build back-end will provide the data for the field later.
- Build back-ends MUST raise an error if the metadata specifies a
  field statically as well as being listed in ``dynamic``.
- If the metadata does not list a field in ``dynamic``, then a build
  back-end CANNOT fill in the requisite metadata on behalf of the user
  (i.e. ``dynamic`` is the only way to allow a tool to fill in
  metadata and the user must opt into the filling in).
- Build back-ends MUST raise an error if the metadata specifies a
  field in ``dynamic`` but the field is still unspecified in the final
  artifact (i.e. the build back-end was unable to provide the data for
  a field
  listed in ``dynamic``).
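
The checks that can be made before a build starts could be sketched as
below. The sketch assumes the ``[project]`` table is parsed into a
``dict`` and treats ``name`` and ``version`` as the "Required" core
metadata fields covered by this PEP; the names ``REQUIRED_FIELDS`` and
``check_dynamic`` are illustrative, as is the use of ``ValueError``. The
final rule (a ``dynamic`` field left unfilled in the artifact) can only
be checked after the build and is not covered.

```python
# Fields of this PEP whose core metadata counterpart is "Required".
REQUIRED_FIELDS = {"name", "version"}


def check_dynamic(project):
    """Validate ``dynamic``; return the set of fields a back-end may fill in."""
    dynamic = set(project.get("dynamic", []))
    static = set(project) - {"dynamic"}

    if "name" in dynamic:
        raise ValueError("'name' must be specified statically")

    both = sorted(static & dynamic)
    if both:
        raise ValueError(f"fields both static and dynamic: {both}")

    missing = sorted(REQUIRED_FIELDS - static - dynamic)
    if missing:
        raise ValueError(f"required fields neither static nor dynamic: {missing}")

    # Anything not returned here must not be filled in by the back-end.
    return dynamic
```

A back-end would then restrict itself to computing only the returned
fields and take everything else verbatim from the table.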

Example
-------
::

  [project]
  name = "spam"
  version = "2020.0.0"
  description = "Lovely Spam! Wonderful Spam!"
  readme = "README.rst"
  requires-python = ">=3.8"
  license = {file = "LICENSE.txt"}
  keywords = ["egg", "bacon", "sausage", "tomatoes", "Lobster Thermidor"]
  authors = [
    {email = "hi@pradyunsg.me"},
    {name = "Tzu-Ping Chung"}
  ]
  maintainers = [
    {name = "Brett Cannon", email = "brett@python.org"}
  ]
  classifiers = [
    "Development Status :: 4 - Beta",
    "Programming Language :: Python"
  ]

  # Using 'dependencies' and 'optional-dependencies' as an example
  # as those fields' format are an Open Issue.
  dynamic = ["dependencies", "optional-dependencies"]

  [project.urls]
  homepage = "example.com"
  documentation = "readthedocs.org"
  repository = "github.com"
  changelog = "github.com/me/spam/blob/master/CHANGELOG.md"

  [project.scripts]
  spam-cli = "spam:main_cli"

  [project.gui-scripts]
  spam-gui = "spam:main_gui"

  [project.entry-points."spam.magical"]
  tomatoes = "spam:main_tomatoes"


Backwards Compatibility
=======================

As this provides a new way to specify a project's `core metadata`_ and
is using a new table name which falls under the reserved namespace as
outlined in :pep:`518`, there are no backwards-compatibility concerns.


Security Implications
=====================

There are no direct security concerns as this PEP covers how to statically
define project metadata. Any security issues would stem from how tools
consume the metadata and choose to act upon it.


How to Teach This
=================

[How to teach users, new and experienced, how to apply the PEP to their work.]


Reference Implementation
========================

There are currently no proofs-of-concept from any build tools implementing this PEP.


Rejected Ideas
==============

Other table names
-----------------

Anything under ``[build-system]``
'''''''''''''''''''''''''''''''''
There was worry that using this table name would exacerbate confusion
between build metadata and project metadata, e.g. by using
``[build-system.metadata]`` as a table.

``[package]``
'''''''''''''
Garnered no strong support.

``[metadata]``
''''''''''''''
The strongest contender after ``[project]``, but in the end it was
agreed that ``[project]`` read better for certain sub-tables, e.g.
``[project.urls]``.

Support for a metadata provider
-------------------------------
Initially there was a proposal to add a middle layer between the
static metadata specified by this PEP and
``prepare_metadata_for_build_wheel()`` as specified by :pep:`517`. The
idea was that if a project wanted to insert itself between a build
back-end and the metadata there would be a hook to do so.

In the end the authors considered this idea unnecessarily complicated
and would move the PEP away from its design goal to push people to
define core metadata statically as much as possible.

Require a normalized project name
---------------------------------
While it would make things easier for tools to only work with the
normalized name as specified in :pep:`503`, the idea was ultimately
rejected as it would hurt projects transitioning to using this PEP.

Specify files to include when building
--------------------------------------
The authors decided fairly quickly during design discussions that
this PEP should focus exclusively on project metadata and not build
metadata. As such, specifying what files should end up in a source
distribution or wheel file is out of scope for this PEP.

Name the ``[project.urls]`` table ``[project.project-urls]``
------------------------------------------------------------
This suggestion came thanks to the corresponding `core metadata`_
being ``Project-Url``. But once the overall table name of ``[project]``
was chosen, the redundant use of the word "project" suggested the
current, shorter name was a better fit.

Have a separate ``url``/``home-page`` field
-------------------------------------------
While the `core metadata`_ supports it, having a single field for a
project's URL while also supporting a full table seemed redundant and
confusing.

Recommend that tools put development-related dependencies into a "dev" extra
----------------------------------------------------------------------------
As various tools have grown the concept of required dependencies
versus development dependencies, the idea of suggesting to tools that
they put such development tools into a "dev" grouping came up. In the
end, though, the authors deemed it out-of-scope for this specification
to suggest such a workflow.

Have the ``dynamic`` field only require specifying missing required fields
--------------------------------------------------------------------------
The authors considered the idea that the ``dynamic`` field would only
require the listing of missing required fields and make listing
optional fields optional. In the end, though, this went against the
design goal of promoting specifying as much information statically as
possible.

Different structures for the ``readme`` field
---------------------------------------------
The ``readme`` field had a proposed ``readme_content_type`` field, but
the authors considered the string/table hybrid more practical for the
common case while still accommodating the more complex case. Same goes
for using ``long_description`` and a corresponding
``long_description_content_type`` field.

The ``file`` key in the table format was originally proposed as
``path``, but ``file`` corresponds to setuptools' ``file`` key and
there is no strong reason otherwise to choose one over the other.

Allowing the ``readme`` field to imply ``text/plain``
-----------------------------------------------------
The authors considered allowing for unspecified content-types which
would default to ``text/plain``, but decided that it would be best to
be explicit in this case to prevent accidental incorrect renderings on
PyPI and to force users to be clear in their intent.

Other names for ``dependencies``/``optional-dependencies``
----------------------------------------------------------
The authors originally proposed ``requires``/``extra-requires`` as
names, but decided to go with the current names after a survey of
other packaging ecosystems showed Python was an outlier:

1. `npm <https://docs.npmjs.com/files/package.json#optionaldependencies>`__
2. `Rust <https://doc.rust-lang.org/cargo/guide/dependencies.html>`__
3. `Dart <https://dart.dev/guides/packages>`__
4. `Swift <https://swift.org/package-manager/>`__
5. `Ruby <https://guides.rubygems.org/specification-reference/#add_runtime_dependency>`__

Normalizing on the current names helps minimize confusion for people coming from
other ecosystems without using terminology that is necessarily foreign to new
programmers. It also prevents potential confusion with ``requires`` in the
``[build-system]`` table as specified in :pep:`518`.

Support ``Maintainers``/``Maintainers-email``
---------------------------------------------
When discussing how to support ``Authors``/``Authors-email``, the question was
brought up as to how exactly authors differed from maintainers. As this was
never clearly defined and no one could come up with a good definition, the
decision was made to drop the concept of maintainers.

Drop ``maintainers`` to unify with ``authors``
----------------------------------------------
As the difference between ``Authors`` and ``Maintainers`` fields in
the `core metadata`_ is unspecified and ambiguous, this PEP originally
proposed unifying them as a single ``authors`` field. Other ecosystems
have selected "author" as the term to use, so the thinking was to
standardize on ``Author`` in the core metadata as the place to list
people maintaining a project.

In the end, though, the decision to adhere to the core metadata was
deemed more important to help with the acceptance of this PEP,
rather than trying to introduce a new interpretation for some of the
core metadata.

Support an arbitrary depth of tables for ``project.entry-points``
-----------------------------------------------------------------
There was a worry that keeping ``project.entry-points`` to a depth of 1 for sub-tables
would cause confusion to users if they use a dotted name and are not used to table
names using quotation marks (e.g. ``project.entry-points."spam.magical"``). But
supporting an arbitrary depth -- e.g. ``project.entry-points.spam.magical`` -- would
preclude any form of an exploded table format in the future. It would also complicate
things for build back-ends as they would have to make sure to traverse the full
table structure rather than a single level and raise errors as appropriate on
value types.

Backfilling trove classifiers SHOULD occur instead of MAY happen
----------------------------------------------------------------
Originally this PEP said that tools SHOULD backfill appropriate trove classifiers.
This was changed to say it MAY occur to emphasize it was entirely optional for
build back-ends to implement.

Open Issues
===========

How to specify dependencies?
----------------------------
People seem to fall into two camps on how to specify dependencies:
using :pep:`508` strings or TOML tables (sometimes referred to as the
"exploded table" format due to it being the equivalent of translating
a :pep:`508` string into a table format). There is no question as to
whether one format or another can fully represent what the other can.
This very much comes down to a question of familiarity and (perceived)
ease of use.

Supporters of :pep:`508` strings believe familiarity is important as
the format has been in use for 5 years and in some variant for 15
years (since the introduction of :pep:`345`). This would facilitate
transitioning people to using this PEP as there would be one less new
concept to learn. Supporters also think the format is reasonably
ergonomic and understandable upon first glance, so using a DSL for it
is not a major drawback.

Supporters of the exploded table format believe it has better
ergonomics. Tooling which can validate TOML formats could also help
detect errors in a ``pyproject.toml`` file while editing instead of
waiting until the user has run a tool in the case of :pep:`508`'s DSL.
Supporters also believe it is easier to read and reason about (both in
general and for first-time users). They also point out that other
programming languages have adopted a format more like an exploded
table thanks to their use of standardized configuration formats (e.g.
`Rust <https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html>`__,
and `Dart <https://dart.dev/tools/pub/dependencies>`__). The thinking
is that an exploded table format would be more familiar to people
coming to Python from another programming language.

The authors briefly considered supporting both formats, but decided
that it would lead to confusion as people would need to be familiar
with two formats instead of just one.

Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.


.. _PyPI: https://pypi.org
.. _core metadata: https://packaging.python.org/specifications/core-metadata/
.. _flit: https://flit.readthedocs.io/
.. _poetry: https://python-poetry.org/
.. _setuptools: https://setuptools.readthedocs.io/
.. _setuptools metadata: https://setuptools.readthedocs.io/en/latest/setuptools.html#metadata
.. _survey of tools: https://github.com/uranusjr/packaging-metadata-comparisons
.. _trove classifiers: https://pypi.org/classifiers/
.. _SPDX: https://spdx.dev/
.. _RFC #822: https://tools.ietf.org/html/rfc822

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:


This is great! A few thoughts:

  • license - As many projects are dual-licensed, I think file should become files or add a 3rd key.
  • dependencies - Is it forbidden to use this until the format is official?

Is the requires-python entry part of the package’s metadata or a consequence of the language used in the package (if so, then shouldn’t it be determined at package build time and not provided by the user)?

Also, considering the goal of having the metadata storable outside of the package so package indexes could index the package metadata, what are your thoughts on the provides entry (Core metadata spec)? After a bit of thinking, maybe this is worth discussing in a future metadata feature version. (My thoughts came up when I was thinking of the multitude of TOML parsers in PyPI, and wondered if they provide the same API, then which package is imported could be up to the user)

It’s part of the package metadata. The author specifies it. I’m not aware of any tool that determines it automatically by code introspection, although I suppose it’s theoretically possible to write one (in which case, you could use dynamic to allow the field to be added at build time).

Even if such a tool existed, I’d personally strongly recommend not specifying this field dynamically, as it is needed in dependency resolution and not being able to see it in static metadata makes a resolver’s life harder.

License is single-use in the core metadata spec. Changing it to multi-use would be a change to that spec (and this PEP would follow that change).

I didn’t see it in the list of discussed/rejected ideas, but what do people think about making the scripts tables tool-specific?

I’m certainly hoping to get there with my backend one day, but will have so much more input information required that it will end up being specified somewhere else anyway. And script entry points feel very much like a backend/distribution specific option anyway, rather than something standardised by Python (while other entry points are standardised by Python).

If you want to just view the metadata as information (that some backends happen to translate into installable files), then I request an equivalent table for all available -m options included with the package :slight_smile:


@brettcannon The embedded text reads horribly, please attach the rendered link in future: https://www.python.org/dev/peps/pep-0621/

Is there an open PR where we can raise our issues directly in-line rather than as a reply here? Can you add that to the post?


How would one start that discussion?

I think in this case it would make the most sense for the project to put both licenses in a single file.

It was always my understanding that the “License” metadata field was for specifying the name of the license (e.g., “MIT” or “GNU GPL v3”), not the text of the license (which should be stored as files in a wheel’s *.dist-info directory instead) — see, for example, a majority of the “License” fields in PyPI packages indexed by https://pydigger.com/licenses. This PEP seems to be treating the “License” field as a place to store the license text — unless build tools are supposed to put project.license.file in *.dist-info? (Another point of confusion: The synonyms for “license” given for Setuptools are license, license_file, and license_files, but only the first of those sets the “License” field in the resulting metadata; the other two are for copying license files into the sdist and *.dist-info.)

Good initiative!

The mesonpep517 backend uses similar fields in pyproject.toml.

Maybe it can be added to the synonyms as well, although of course not all backends need to be included.

With mesonpep517 the meson.build files are primarily used for the build. Because Meson already requires setting a name and optionally a version and license in the top-level meson.build, those are not duplicated in the pyproject.toml. Duplicating name (I think it’s the only mandatory one) is, however, an acceptable compromise.

Inlined version actually reads worse; reST does not translate to Markdown well.

It wasn’t available when I posted because I’m faster than the build process on the PEPs. :slight_smile: But yes, I should have pasted it in with anticipation of it being generated.

There isn’t one. I would rather keep replies in a single spot rather than split across here and some PR somewhere. Highlighting in the literal version above still brings up the “Quote” option so you can still reply here with proper context included.

The whole thing shouldn’t be used as-is until it’s official according to PEP 518. If someone wanted to take the ideas and use it in their own tools section then they can do whatever they want.

Can you give an example of what you’re after here?

https://packaging.python.org/specifications/entry-points/ is the spec for entry points and https://www.python.org/dev/peps/pep-0621/#entry-points covers that, so I’m not sure what you mean by the entry points part not being standardized.
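
For reference, a minimal sketch of how a back-end might turn the PEP's entry point tables into the entry_points.txt file that spec describes. The table names (scripts, gui-scripts, entry-points) are as I read them in the current draft, and the function name and sample data are hypothetical.

```python
import configparser
import io


def render_entry_points(project):
    """Render the entry point tables of a parsed [project] dict as INI text."""
    # Only "=" is a delimiter, so names may safely contain ":".
    parser = configparser.ConfigParser(delimiters=("=",))
    parser.optionxform = str  # keep entry point names case-sensitive

    groups = {}
    if "scripts" in project:
        groups["console_scripts"] = project["scripts"]
    if "gui-scripts" in project:
        groups["gui_scripts"] = project["gui-scripts"]
    groups.update(project.get("entry-points", {}))

    for group, entries in groups.items():
        parser[group] = entries

    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# Hypothetical project table, written as an already-parsed dict.
example = {
    "scripts": {"spam-cli": "spam:main_cli"},
    "entry-points": {"spam.magical": {"tomatoes": "spam:main_tomatoes"}},
}
print(render_entry_points(example))
```

The scripts tables are just the two well-known groups from the spec spelled out as their own tables; everything else goes through entry-points unchanged.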

What does that map to in https://packaging.python.org/specifications/core-metadata/? I’m not directly aware of something in there where this ties in and so that would be out of scope for this PEP.


https://packaging.python.org/specifications/core-metadata/#license

You’re supposed to use the trove classifiers to specify standard licenses and only specify a license by name if a trove classifier can’t express it. Otherwise you specify the license text using License.

Also note the PEP explicitly leaves open the possibility to specify a SPDX license expression in the future once a PEP actually outlining SPDX license expressions comes about. But since that sort of thing isn’t covered in the core metadata, this PEP takes no stance on it; we just know people are interested in seeing that happen.

I like specifying dependencies as the empty string extra name “” for requires-dist and with a non-empty string extra name for optional dependencies, but this has not caught on.

For the same reason, having fewer, more general concepts, I’m surprised scripts are special cased. But there’s a strong argument that they are in fact special compared to every other use of entry points.

Does this format intend to support directly executable .py or .sh scripts?

The format is intended to map directly to core metadata, so given that there’s no special case for executable .py or .sh scripts in the core metadata, then equally there’s no special provision for it in this PEP.

(That argument can be generalised to any case where people want to know “does this PEP support X” - show us where X is supported in the core metadata specs, and that’s essentially your answer :slightly_smiling_face:)
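
As a rough illustration of that direct mapping, here is a partial sketch covering only a handful of fields. The function name and sample table are hypothetical, and a real back-end would also need to emit Metadata-Version and the remaining fields.

```python
# Single-valued [project] keys and the core metadata field each one maps to.
SIMPLE_FIELDS = {
    "name": "Name",
    "version": "Version",
    "description": "Summary",
    "requires-python": "Requires-Python",
}


def to_core_metadata(project):
    """Translate a parsed [project] dict into core metadata header lines."""
    lines = []
    for key, field in SIMPLE_FIELDS.items():
        if key in project:
            lines.append(f"{field}: {project[key]}")
    if "keywords" in project:
        # Keywords is a single comma-separated field in the core metadata.
        lines.append("Keywords: " + ",".join(project["keywords"]))
    for classifier in project.get("classifiers", []):
        # Classifier is a multiple-use field: one line per entry.
        lines.append(f"Classifier: {classifier}")
    return lines


# Hypothetical project table, written as an already-parsed dict.
example = {
    "name": "spam",
    "version": "1.0",
    "description": "Lovely spam",
    "requires-python": ">=3.6",
    "keywords": ["egg", "bacon"],
    "classifiers": ["Programming Language :: Python :: 3"],
}
print("\n".join(to_core_metadata(example)))
```

Anything with no core metadata field to land in has no row in a table like this, which is the point of the argument above.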


I’m really excited to see this PEP written up!

Under “Entry points”, “MUST not” should probably be “MUST NOT”; this is the only PEP that uses mixed capitalisation for that phrase (9 use all-caps, 44 use all-lowercase).

If a tool recognizes more extensions than this PEP, they MAY infer the content-type for the user.

Does this depend on "readme.content-type" being included in dynamic? (More generally: is listing non-top-level fields in dynamic allowed?)

Tools MAY support other encodings if they choose to.

Per the core metadata specification, “The only legal value [for charset] is UTF-8.” Would it be worth removing this sentence and altering the following two to something like:

Tools MAY support alternative content-types or charsets which they can transform to a format supported by the core metadata [1]. Otherwise tools MUST raise an error for unsupported content-types or charsets.
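
A minimal sketch of what that wording could look like in a tool, assuming the content-type is inferred from the file suffix when it is not given. The suffix table, function name and error messages are my own illustration, not taken from the PEP; the three content types are the ones the core metadata accepts for Description-Content-Type.

```python
from pathlib import PurePosixPath

# Content types the core metadata accepts for Description-Content-Type.
SUPPORTED_TYPES = {"text/plain", "text/x-rst", "text/markdown"}

# Assumed suffix table, for illustration only.
TYPE_BY_SUFFIX = {".md": "text/markdown", ".rst": "text/x-rst"}


def readme_content_type(path, content_type=None, charset="UTF-8"):
    """Return the Description-Content-Type value for a readme, or raise."""
    if charset.upper() != "UTF-8":
        raise ValueError(f"unsupported charset: {charset}")
    if content_type is None:
        # Fall back to the (case-insensitive) file suffix.
        suffix = PurePosixPath(path).suffix.lower()
        content_type = TYPE_BY_SUFFIX.get(suffix)
    if content_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported content-type for {path}: {content_type}")
    return f"{content_type}; charset=UTF-8"


print(readme_content_type("README.md"))
print(readme_content_type("README", content_type="text/plain"))
```

A tool that can convert some other format would do the conversion before this check rather than loosen it.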

Interesting! I do think that there are packages that do both – indicate their license as full text as well as just-the-name.

There are several packages where the license field includes the text of the whole license instead of just a few letters indicating the given license.

FWIW, solving the “Python package licenses are a mess” with SPDX-based license declarations is out of scope for this PEP – that’s what “Improving license clarity with better package metadata” is about. That said, as noted in the PEP (and reiterated by @brettcannon here), that is something that we’ve accommodated – by keeping the string value of the license attribute “open”, so that it can be mapped to the SPDX-based license metadata field, if/when that gets formalized.

On the scripts, I agree that those are standardised in the output metadata, but I don’t think that necessarily implies they have to be standard in the input metadata. They’re saying far more about “how to install” than “what is being installed”. But at the same time, 99% of the “how” is “what pip does”, so perhaps that’s good enough to canonicalise for all packagers (rather than just backend developers)?

My understanding of the license field is that it’s meant as a fallback for when a classifier is not sufficient. So it ought to remain free text. Though I’d prefer to reference a file/URL rather than have it be the full text - particularly for licenses that include third party acknowledgements (such as Python itself). Those get very long.