I’m spinning this out of *PEP 751: now with graphs!* to avoid derailing that topic too much.
Today the Python packaging ecosystem supports dynamic metadata. In theory this can be used with almost any key in `pyproject.toml`, but it most commonly causes problems with `version`, where it is also very commonly used. The reasons people like to use dynamic metadata are not surprising, and on the surface it sounds really convenient!

Unfortunately, dynamic metadata causes challenges that are a significant tax on the ecosystem. So in some sense it would be nice to be able to do the same thing, but without actually being dynamic.

So I was thinking we could have a discussion here about why dynamic metadata is used and what the alternatives would be that are not actually dynamic, or at least not quite as dynamic.
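For concreteness, this is roughly how a field is declared dynamic in `pyproject.toml` (a sketch using `setuptools-scm` as the version provider; the project name is a placeholder):

```toml
[build-system]
requires = ["setuptools>=64", "setuptools-scm>=8"]
build-backend = "setuptools.build_meta"

[project]
name = "my-dist"
# version is not written down here; it is computed at build time,
# typically from git tags and the working-tree state
dynamic = ["version"]

[tool.setuptools_scm]
# presence of this table enables setuptools-scm's version calculation
```

The point is that the `version` value only exists after running backend code, which is exactly what makes it hard for installers to cache or predict.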
Challenges:

- Caching: caching dynamic metadata is hard, because it is not known what can invalidate it.
- Build dependencies needed: in order to retrieve dynamic metadata you first need to install the dependencies necessary to run the code that emits that metadata. This can be a rather expensive step which can dramatically slow down package installation. In particular it punishes systems like `uv` that want to automatically re-sync the virtualenv constantly. The time spent resolving metadata dynamically for a single package can be slower than all other operations.[1]
- Unstable metadata: dynamic metadata often takes external state such as git commits or general git status into account, which results in metadata changing under the hood without the installer or resolver knowing about it. Yet that metadata is in fact frozen upon installation. This is particularly odd for editable installs, where you might have installed the package at version `0.0.1+deadbeef` but already be multiple commits past that. In the past that often meant that entrypoints no longer found the package, and you had to run `pip install --editable` again.
Uses of Dynamic Metadata:

- `version`: pretty commonly people try to make the version match the latest git tag, the current git revision hash, or similar. In part the motivation is also to sync up `__version__` in a package with the installed metadata.
- `dependencies` (and other keys with similar functionality): a motivating example I have seen for this is making the key match a provided `requirements.txt` file. Sometimes it is also set as dynamic because, for legacy reasons, the requirements are still set in `setup.py`.
- `readme`: this surprised me, but there is a package with some adoption, `hatch-fancy-pypi-readme`, that apparently re-composes the readme from other inputs.
- `scripts`, `gui-scripts`, `entry-points`: these are typically set dynamically when used with setuptools (e.g. when `setup.py` is still in use).
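As an illustration of the `dependencies` case: setuptools supports sourcing the requirement specifiers from a file, so the metadata only materializes once the backend runs (a sketch; the project name and file name are placeholders):

```toml
[project]
name = "my-dist"
dynamic = ["dependencies"]

[tool.setuptools.dynamic]
# setuptools reads the requirement specifiers from this file at build time
dependencies = { file = ["requirements.txt"] }
```

This is less volatile than git-derived versions, since the file is checked in, but a resolver still cannot see the dependencies without invoking the build backend.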
Complicating factors:
- non-installed packages: I don’t know how much of a problem this still is today, but one of the reasons people cannot use the common pattern `__version__ = importlib.metadata.version("MyDist")` is that they support importing a package without it being installed (editable or otherwise).
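That pattern, together with the fallback people typically reach for when the package might be imported from a plain source checkout, looks roughly like this (a sketch; `MyDist` and the fallback version string are placeholders):

```python
import importlib.metadata

try:
    # Works only when the distribution "MyDist" is actually installed
    # (regular or editable install).
    __version__ = importlib.metadata.version("MyDist")
except importlib.metadata.PackageNotFoundError:
    # Importing from an uninstalled source tree: no metadata to read,
    # so fall back to a placeholder version.
    __version__ = "0.0.0+unknown"
```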
When dynamic metadata is not an issue:

- As more and more packages are published as wheels, dynamic metadata becomes less of an issue, because the metadata is frozen within the wheel. It is thus mostly a problem for editable installs and sdist installations.
I intentionally don’t want to prime this topic with proposed solutions or alternatives. Instead, let’s see if we can do some basic brainstorming here about what a world would look like that does not have dynamic metadata, or that greatly cuts down on how much dynamic metadata is permitted.
Motivating example: with the setuptools backend you can point the version at an attribute like `package.__version__`. Importing that package can take seconds in some large applications, if the root package itself decides to import parts of the system. Sometimes this involves shelling out to `git` or other tools to read version information. ↩︎