We’ve been working on a proposal for standardized plugins providing dynamic metadata for build backends. Initial work was done in https://github.com/scikit-build/scikit-build-core/issues/230, and we are now moving it here. This includes two changes to the metadata specification that comes from PEP 621 - adding support for a table in project.dynamic, and loosening the requirements a bit on combining dynamic and static metadata (I see https://discuss.python.org/t/relaxing-or-clarifying-pep-621-requirements-regarding-dynamic-dependencies discussing this and stating a new PEP would be needed!). I’m putting the initial draft inline.
Note that “rejected ideas” aren’t technically rejected yet, and if one of those is deemed better, the proposal can be modified to swap the proposed and rejected ideas.
Also, it’s not quite in the right form for a PEP just yet; it needs an abstract / motivation / rationale / specification, which are all there, but not quite under the proper headings or in the proper order. Of course I think of that right after posting… But I’m assuming there will be discussion, suggestions, etc. leading to changes.
Dynamic metadata plugins
Need
In the core metadata specification originally set out in PEP 621 there is the possibility of marking fields as “dynamic”, allowing their values to be determined at build time rather than statically included in pyproject.toml. There are several popular packages which make use of this system, most notably setuptools_scm, which dynamically calculates a version string based on various properties from a project’s source control system, but also e.g. hatch-fancy-pypi-readme, which builds a readme out of user-defined fragments (like the latest version’s CHANGELOG). Most backends, including setuptools, PDM-backend, hatchling, and flit-core, also have built-in support for providing dynamic metadata from sources like reading files.
With the recent profusion of build-backends in the wake of PEPs 517 and 518, it is much more difficult for a user to keep using these kinds of tools across their different projects because of the lack of a common interface. Each tool has been written to work with a particular backend, and can only be used with other backends by adding some kind of adapter layer. For example, setuptools_scm has already been wrapped into a hatchling plugin (hatch-vcs), and into scikit-build-core. Poetry also has a custom VCS versioning plugin (poetry-dynamic-versioning), and PDM has a built-in tool for it. However, these adapter layers are inconvenient to maintain (often being dependent on internal functions, for example), confusing to use, and result in a lot of duplication of both code and documentation.
We are proposing a unified interface that would allow metadata-providing tools to implement a single function that build backends can call, and a standard format in which to return their metadata. Once a backend chooses to adopt this proposed mechanism, it will gain support for all plugins implementing it.
We are also proposing a modification to the project specification that has been requested by backend and plugin authors to loosen the requirements slightly on mixing dynamic and static metadata, enabling metadata plugins to be more easily adopted for some use cases.
Proposal
Implementing a metadata provider
Our suggestion is that metadata providers include a module (which could be the top level of the package, but need not be) which provides a function dynamic_metadata(fields, settings=None). The first argument is the list of fields requested of the plugin, and the second is the extra settings passed to the plugin configuration, possibly empty. This function will run in the same directory that build_wheel() runs in, the project root (to allow for finding other relevant files/folders like .git).
The function should return a dictionary matching the pyproject.toml structure, but only containing the metadata keys that have been requested. dynamic, of course, is not permitted in the result. Updating the pyproject_dict with this return value (and removing the corresponding keys from the original dynamic entry) should result in a valid pyproject_dict. The backend should only update the key corresponding to the one requested by the user. A backend is allowed (and recommended) to combine identical calls for multiple keys - for example, if a user sets “readme” and “license” with the same provider and arguments, the backend is only required to call the plugin once, and use the readme and license fields.
An optional hook[1], get_requires_for_dynamic_metadata, allows providers to determine their requirements dynamically (depending on what is already available on the path, or unique to providing this plugin).
Here’s an example implementation:
import tomllib
from collections.abc import Mapping, Sequence
from pathlib import Path
from typing import Any


def dynamic_metadata(
    fields: Sequence[str],
    settings: Mapping[str, Any] | None = None,
) -> dict[str, dict[str, str | None]]:
    if settings:
        raise RuntimeError("Inline settings are not supported by this plugin")
    if list(fields) != ["readme"]:
        raise RuntimeError("This plugin only supports dynamic 'readme'")

    # Imported here so the module can be loaded before the requirement is installed
    from hatch_fancy_pypi_readme._builder import build_text
    from hatch_fancy_pypi_readme._config import load_and_validate_config

    with Path("pyproject.toml").open("rb") as f:
        pyproject_dict = tomllib.load(f)

    config = load_and_validate_config(
        pyproject_dict["tool"]["hatch"]["metadata"]["hooks"]["fancy-pypi-readme"]
    )

    return {
        "readme": {
            "content-type": config.content_type,
            "text": build_text(config.fragments, config.substitutions),
        }
    }


def get_requires_for_dynamic_metadata(
    settings: Mapping[str, Any] | None = None,
) -> list[str]:
    return ["hatch-fancy-pypi-readme"]
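For the other side of the interface, here is a rough sketch of how a backend might consume providers, based only on the rules described above: group identical requests into one call, copy only the requested keys, and drop resolved fields from dynamic. The names load_provider and resolve_dynamic are hypothetical, and the sys.path handling for local plugins and the JSON-based grouping key are assumptions rather than part of the proposal.

```python
from __future__ import annotations

import importlib
import json
import sys
from typing import Any

RESERVED = ("provider", "provider-path")


def load_provider(name: str, provider_path: str | None = None):
    """Import a provider module, optionally from a local directory."""
    if provider_path is None:
        return importlib.import_module(name)
    # Assumption: local plugins are found by temporarily extending sys.path.
    sys.path.insert(0, provider_path)
    try:
        return importlib.import_module(name)
    finally:
        sys.path.remove(provider_path)


def resolve_dynamic(project: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the project table with provider-backed fields filled in."""
    dynamic = project.get("dynamic", {})
    if isinstance(dynamic, list):  # classic list form: no providers involved
        dynamic = {name: {} for name in dynamic}

    result = {key: value for key, value in project.items() if key != "dynamic"}
    remaining: dict[str, Any] = {}
    # Identical (provider, path, settings) requests are grouped into one call.
    # Assumption: inline settings are JSON-serializable.
    calls: dict[tuple[str, str | None, str], list[str]] = {}
    for name, table in dynamic.items():
        if "provider" not in table:
            remaining[name] = table  # left for backend-defined handling
            continue
        settings = {k: v for k, v in table.items() if k not in RESERVED}
        key = (
            table["provider"],
            table.get("provider-path"),
            json.dumps(settings, sort_keys=True),
        )
        calls.setdefault(key, []).append(name)

    for (provider, path, settings_json), names in calls.items():
        module = load_provider(provider, path)
        metadata = module.dynamic_metadata(names, json.loads(settings_json))
        for name in names:
            result[name] = metadata[name]  # copy only the requested keys

    if remaining:
        result["dynamic"] = remaining
    return result
```

Fields without a provider are passed through untouched in this sketch, since their handling is up to the backend.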
Using a metadata provider
For maximum flexibility, we propose specifying a 1:1 mapping between the dynamic metadata fields and the providers (specifically the module implementing the interface) which will supply them.
The existing dynamic specification will be expanded to support a table as well:
[project.dynamic]
version = {provider = "plugin.submodule"} # Plugin
readme = {provider = "local_module", provider-path = "scripts/meta"} # Local plugin
classifiers = {provider = "plugin.submodule", max="3.11"} # Plugin with options
requires-python = {min = "3.8"} # Build-backend specific
dependencies = {} # Identical to dynamic = ["dependencies"]
optional-dependencies.provider = "some_plugin" # Shortcut for provider =
If project.dynamic is a table, a new provider="..." key will pull from a matching plugin with the hook outlined above. If provider-path="..." is present as well, then the module is a local plugin in the provided local path (just like PEP 517’s local backend path). All other keys are passed through to the hook; it is suggested that a hook validate for unrecognized keys. If no keys are present, the backend should fall back on the same behavior a string entry would provide.
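To make the table rules above concrete, here is a small sketch of how a backend could split each entry into the provider, the provider path, and the pass-through settings. DynamicEntry and parse_dynamic are hypothetical names, and rejecting provider-path without provider is an assumption, not something the proposal states.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class DynamicEntry:
    provider: str | None = None
    provider_path: str | None = None
    settings: dict[str, Any] = field(default_factory=dict)


def parse_dynamic(dynamic: list[str] | dict[str, dict[str, Any]]) -> dict[str, DynamicEntry]:
    """Normalize the list form and the table form to one mapping."""
    if isinstance(dynamic, list):
        # A string entry carries no provider and no settings.
        return {name: DynamicEntry() for name in dynamic}

    entries: dict[str, DynamicEntry] = {}
    for name, table in dynamic.items():
        settings = dict(table)
        provider = settings.pop("provider", None)
        provider_path = settings.pop("provider-path", None)
        if provider_path is not None and provider is None:
            # Assumption: a path is meaningless without a provider.
            raise ValueError(f"{name}: provider-path requires provider")
        # Everything left over is passed through to the hook (or the backend).
        entries[name] = DynamicEntry(provider, provider_path, settings)
    return entries
```

An empty table such as dependencies = {} ends up identical to the plain string entry, which matches the fallback behavior described above. Note that a dotted key like optional-dependencies.provider already parses to a nested table in TOML, so it needs no special handling here.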
Many backends already have some dynamic metadata handling. If keys are present without provider=, then the behavior is backend defined. It is highly recommended that a backend produce an error if keys that it doesn’t expect are present when provider= is not given. Setuptools could simplify its current tool.setuptools.dynamic support with this approach, taking advantage of the ability to pass custom options through the field:
# Current
[project]
dynamic = ["version", "dependencies", "optional-dependencies"]
[tool.setuptools.dynamic]
version = {attr="mymod.__version__"}
dependencies = {file="requirements.in"}
optional-dependencies.dev = {file="dev-requirements.in"}
optional-dependencies.test = {file="test-requirements.in"}
# After
[project.dynamic]
version = {attr="mymod.__version__"}
dependencies = {file="requirements.in"}
optional-dependencies.dev = {file="dev-requirements.in"}
optional-dependencies.test = {file="test-requirements.in"}
# provider = "setuptools.dynamic.version", etc. could be set but would be verbose
Another idea is a hypothetical regex based version discovery, which could look something like this if it was integrated into the backend:
[project.dynamic]
version = {location="src/package/version.txt", regex='Version\s*([\d.]+)'}
Or like this if it was a plugin:
[project.dynamic.version]
provider = "regex.searcher.version"
location = "src/package/version.txt"
regex = 'Version\s*([\d.]+)'
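As an illustration of the plugin form, a provider module for this hypothetical regex search might look roughly like the following. The module name regex.searcher.version comes from the example above and does not refer to a real package; the error handling and the use of the first capture group are assumptions.

```python
from __future__ import annotations

import re
from collections.abc import Mapping, Sequence
from pathlib import Path
from typing import Any


def dynamic_metadata(
    fields: Sequence[str],
    settings: Mapping[str, Any] | None = None,
) -> dict[str, str]:
    if list(fields) != ["version"]:
        raise RuntimeError("This plugin only supports dynamic 'version'")

    options = dict(settings or {})
    try:
        location = options.pop("location")
        pattern = options.pop("regex")
    except KeyError as exc:
        raise RuntimeError(f"Missing required setting: {exc.args[0]}") from None
    if options:
        # Validate for unrecognized keys, as the proposal suggests hooks do.
        raise RuntimeError(f"Unrecognized settings: {sorted(options)}")

    # The hook runs in the project root, so relative paths resolve from there.
    text = Path(location).read_text(encoding="utf-8")
    match = re.search(pattern, text)
    if match is None:
        raise RuntimeError(f"No version matching {pattern!r} found in {location}")
    # Assumption: the first capture group holds the version string.
    return {"version": match.group(1)}
```

The backend-integrated form would use the same logic, just without the provider lookup.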
Using project.dynamic as a table keeps the specification succinct without adding extra fields, it avoids duplication, and it is handled by third party libraries that inspect the pyproject.toml exactly the same way (at least if they are written in Python). The downside is that it changes the existing specification, probably mostly breaking validation - however, this is most often done by the backend; a backend must already opt into this proposal, so that is an acceptable change. pip and cibuildwheel, two non-backend tools that read pyproject.toml, are unaffected by this change.
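The point about Python tools handling both forms the same way can be shown in a couple of lines. The helper name is_dynamic is hypothetical; the only thing it relies on is that the in operator works on both lists and dict keys.

```python
from typing import Any, Dict


def is_dynamic(project: Dict[str, Any], field_name: str) -> bool:
    """Check whether a field is dynamic, for either the list or the table form."""
    return field_name in project.get("dynamic", ())


# Existing list form
old_style = {"name": "pkg", "dynamic": ["version"]}
# Proposed table form
new_style = {"name": "pkg", "dynamic": {"version": {"provider": "plugin.submodule"}}}
```

Both is_dynamic(old_style, "version") and is_dynamic(new_style, "version") evaluate to True without any branching on the type.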
Supporting metadata providers:
An implementation of this proposal already exists for the scikit-build-core backend and uses only standard library functions. Implementations could be left up to individual build backends to provide, but if the proposal were to be adopted they would probably coalesce into a single common implementation. pyproject-metadata could hold such a helper implementation.
Proposed changes in the semantics of project.dynamic
PEP 621 explicitly forbids a field to be “partially” specified in a static way (i.e. by associating a value to project.<field> in pyproject.toml) and later listed in dynamic.
This complicates the mechanism for dynamically defining fields with complex/compound data structures, such as keywords, classifiers and optional-dependencies, and requires backends to implement “workarounds”. Examples of practices that were impacted by this restriction include:
- whey’s re-implementation of classifiers in a tool subtable (dependencies too!)
- the removal of the classifiers augmentation feature in pdm-backend.
- setuptools restrictions on dynamic optional-dependencies
In this PEP, we propose to lift this restriction and change the semantics associated with project.dynamic in the following manner:
- When a metadata field is simultaneously assigned a value and included in project.dynamic, tools should assume that its value is partially defined. The given static value corresponds to a subset of the value expected after the build process is complete. Backends and dynamic providers are allowed to augment the metadata field during the build process.
The fields that are arrays or tables with arbitrary entries are urls, authors, maintainers, keywords, classifiers, dependencies, scripts, entry-points, gui-scripts, and optional-dependencies.
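One way a backend or validator could check the “static value is a subset of the final value” rule is sketched below. The function name contains_static is hypothetical, and treating nested tables (such as optional-dependencies) recursively is an assumption; the proposal does not spell out how nested values are compared.

```python
from typing import Any


def contains_static(static: Any, final: Any) -> bool:
    """Return True if the final value keeps everything the static value declared."""
    if isinstance(static, list):
        # Arrays: every statically listed item must still be present.
        return isinstance(final, list) and all(item in final for item in static)
    if isinstance(static, dict):
        # Tables: every static key must survive, compared recursively.
        return isinstance(final, dict) and all(
            key in final and contains_static(value, final[key])
            for key, value in static.items()
        )
    # Scalars cannot be partially defined, so they must match exactly.
    return static == final
```

For example, a static classifiers list may gain entries during the build, but a build that drops one of the statically declared classifiers would fail this check.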
Examples & ideas:
- Version computation from VCS
- Building description out of parts of a readme & other files
- Pulling metadata from another build system (like CMake, Meson, Cargo) - these can be “integrated” plugins; setuptools’s dynamic table could use this instead.
- Automatic classifier computation
- Local metadata scripts
- Dependency addition (extras, or adding specific dependencies for specific wheels)
Current PEP 621 backends & dynamic metadata
Backend | Dynamic? | Config? | Plugins? |
---|---|---|---|
setuptools | | | |
hatchling | | | |
flit-core | | | |
pdm-backend | | | |
scikit-build-core | [2] | | |
meson-python | | | |
maturin | | | |
enscons | | | |
whey | | | |
trampolim | | | |
“Dynamic” indicates the tool supports at least one dynamic config option. “Config” indicates the tool has some tool-specific way to configure this option. “Plugins” refers to having a custom plugin ecosystem for these tools. Poetry has not yet adopted PEP 621, so is not listed above, but it does have dynamic metadata with custom configuration and plugins. This proposal will still help tools not using PEP 621, as they can still use the plugin API, just with custom configuration (but they are already using custom configuration for everything else, so that’s fine).
Rejected ideas
Notes on extra file generation
Some metadata plugins generate extra files (like a static version file). No special requirements are made on such plugins or backends handling them in this proposal; this is in line with PEP 517’s focus on metadata and lack of specification for file handling.
Config-settings
The config-settings dict could be passed to the plugin, but because there’s no standard configuration design for config-settings, you can’t generally handle a specific config-settings item and be sure that no backend will also try to read it or reject it. There was also a design worry about adding this in setuptools, so it was removed (still present in the reference implementation, though).
Passing the pyproject.toml as a dict
This would add a little bit of complexity to the signature of the plugin, but would avoid reparsing the pyproject.toml for plugins that need to read it. It would also avoid an extra dependency on tomli for older Python versions. Custom inline settings alleviated the need for almost every plugin to read the pyproject.toml, so this was removed to keep backend implementations & signatures simpler.
Shortcut for just selecting a provider
To keep the most common use case simple, passing a string is equivalent to passing the provider; version = "..." is treated like version = { provider = "..." }. This makes the backend implementation a bit more complex, but provides a simpler user experience for the most common expected usage. This is similar to how keys like project.readme = and project.license = are treated today. This was rejected since adding .provider works in TOML and keeps it explicit.
New section
Instead of changing the dynamic metadata field to accept a table, there could be a new section:
dynamic = ["version"]
[dynamic-metadata]
version = {provider = "plugin_package.submodule"}
This is the current state of the reference implementation, using [tool.scikit-build.metadata] instead of [dynamic-metadata]. In this version, listing an item in dynamic-metadata should be treated as implicitly listing it in dynamic, though listing in both places can be done (primarily for backward compatibility).
dynamic vs. dynamic-metadata could be confusing, as they do the same thing, and it actually makes parsing this harder for third-party tools, as now both project.dynamic and dynamic-metadata have to be combined to see what fields could be dynamic. The fact that dict keys and lists are handled the same way in Python provides a nice method to avoid this complication.
Alternative proposal: new array section
A completely different approach to specification could be taken using a new section and an array syntax[3]:
dynamic = ["version"]
[[dynamic-metadata]]
provider = "plugin_package.submodule"
provider-path = "src"
provides = ["version"]
This has the benefit of not repeating the plugin if you are pulling multiple metadata items from it, and indicates that this is only going to be called once. It also has the benefit of allowing empty dynamic plugins, which has an interesting non-metadata use case, but is probably out of scope for the proposal. The main downside is that it’s harder to parse for the dynamic values by third party projects, as they have to loop over dynamic-metadata and join all provides lists to see what is dynamic. It’s also a lot more verbose, especially for the built-in plugin use case for tools like setuptools. (The current version of this suggestion listed above is much better than the original version we proposed, though!). This also would allow multiple plugins to provide the same metadata field, for better (maybe this could be used to allow combining lists or tables from multiple plugins) or worse (this has to be defined and properly handled).
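To show the extra parsing work this array syntax puts on third-party tools, here is a minimal sketch of collecting every dynamic field. The function name dynamic_fields is hypothetical, and placing dynamic-metadata at the top level of the parsed file is an assumption based on the snippet above.

```python
from __future__ import annotations

from typing import Any


def dynamic_fields(pyproject: dict[str, Any]) -> set[str]:
    """Collect dynamic field names from project.dynamic and every provides list."""
    fields = set(pyproject.get("project", {}).get("dynamic", []))
    # Each [[dynamic-metadata]] entry contributes its own provides list.
    for entry in pyproject.get("dynamic-metadata", []):
        fields.update(entry.get("provides", []))
    return fields
```

Compared with the table form, where a single membership test on project.dynamic is enough, a tool has to walk every array entry before it knows whether a field is dynamic.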
This version could enable a couple of possible additions that were not possible in the current proposal. However, most users would not need these, and some of them are a bit out of scope - the current version is simpler for pyproject.toml authors and would address 95% of the plugin use cases.
Multiple plugins per field
The current proposal requires a metadata field be computed by one plugin; there’s no way to use multiple plugins for a single field (like classifiers). This is expected to be rare in practice, and can easily be worked around in the current proposal form by adding a local plugin that itself calls the plugins it wants to combine following the standard API proposed. “Merging” the metadata then would be arbitrary, since it’s implemented by this local plugin, rather than having to be pre-defined here.
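The local-plugin workaround could look something like this sketch. The providers setting is a hypothetical inline option naming the plugin modules to combine, and the merge rule (keep first occurrence, preserve order) is just one arbitrary choice of the kind the paragraph above describes.

```python
from __future__ import annotations

import importlib
from collections.abc import Mapping, Sequence
from typing import Any


def dynamic_metadata(
    fields: Sequence[str],
    settings: Mapping[str, Any] | None = None,
) -> dict[str, list[str]]:
    if list(fields) != ["classifiers"]:
        raise RuntimeError("This plugin only supports dynamic 'classifiers'")

    # Hypothetical setting: the provider modules whose results are merged.
    providers = (settings or {}).get("providers", [])
    merged: list[str] = []
    for name in providers:
        module = importlib.import_module(name)
        # Each wrapped plugin is called through the same standard hook.
        result = module.dynamic_metadata(["classifiers"], {})
        for classifier in result["classifiers"]:
            if classifier not in merged:  # the merge rule is this plugin's choice
                merged.append(classifier)
    return {"classifiers": merged}
```

A project would then point classifiers at this local module with provider and provider-path, and list the wrapped plugins in the inline settings.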
Empty plugins (for side effects)
A closely related but separate problem could be solved by this paradigm as well with some modifications. Several build tools (like cmake, ninja, patchelf, and swig) are actually system CLI tools that have optional pre-compiled binaries in the PyPI ecosystem. When compiling on systems that do not support binary wheels (a very common reason to compile!), such as WebAssembly, Android, FreeBSD, or ClearLinux, it is invalid to add these as dependencies. However, if the system versions of these dependencies are of a sufficient version, there’s no need to add them either. A PEP 517 backend has the ability to declare dynamic dependencies, so this can be (and currently is) handled by tools like scikit-build-core and meson-python in this way. However, it might also be useful to allow this logic to be delegated to a metadata provider; this would potentially allow greater sharing of core functionality in this area.
For example, if you specified “auto_cmake” as a provider, it could provide get_requires_for_dynamic_metadata to supply this functionality to any backend. This will likely best be covered by the “extensionlib” idea, rather than plugins, so this is not worth trying to address unless this array based syntax becomes the proposed syntax - then it would be worth evaluating to see if it’s worth trying to include.
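Purely as an illustration of the idea, such an “auto_cmake” provider might implement the hook along these lines. This is a simplified sketch: it only checks whether cmake is on the PATH, while a real plugin would presumably also compare the installed version against a minimum, which is omitted here.

```python
from __future__ import annotations

import shutil
from collections.abc import Mapping
from typing import Any


def get_requires_for_dynamic_metadata(
    settings: Mapping[str, Any] | None = None,
) -> list[str]:
    # Assumption: any cmake found on PATH is good enough; the version
    # check described in the text is left out of this sketch.
    if shutil.which("cmake") is not None:
        return []
    # Otherwise fall back to requesting the PyPI distribution.
    return ["cmake"]
```

On a platform without binary wheels, a system cmake would then satisfy the build without the PyPI package ever being requested.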
[1] Most plugins will likely not need to implement this hook, so it could be removed. But it is symmetric with PEP 517, fairly simple to implement, and “wrapper” plugins, like the first two example plugins, need it. It is expected that backends that want to provide similar wrapper plugins will find this useful to implement.
[2] In development, based on a version of this proposal.
[3] Note that, unlike the proposed syntax, this probably should not repurpose project.metadata, since this would be much more likely to break existing parsing of this field by static tooling. (Static tooling often may not parse this field anyway, since it’s easier to check for a missing field - you only need to check the dynamic today if you care about “missing” version “specified elsewhere”.)