tl;dr: I’d like to record, in wheel and dist-info metadata, where a custom-built wheel came from, so that backported security fixes can be tracked.
In Fedora Linux (and in RHEL) we build our own pip and setuptools wheels for the purposes of the ensurepip module, virtualenv, and Python’s test suite.
We package this in RPM packages, and the file is available e.g. at /usr/share/python-wheels/setuptools-69.2.0-py3-none-any.whl. When a new virtual environment is created, pip (and potentially setuptools) is installed from there. This is supported by CPython’s --with-wheel-pkg-dir= configure option.
When there is a security vulnerability in pip or setuptools, we usually backport the fix, like we did recently with CVE-2025-47273 for setuptools. The wheel at /usr/share/python-wheels/setuptools-69.2.0-py3-none-any.whl on my system is from the python-setuptools-wheel-69.2.0-10.fc41.noarch RPM package, which contains the backported fix.
When I create a new virtual environment with setuptools, the setuptools in that virtual environment is not vulnerable to CVE-2025-47273. However, a security scanner might say it is vulnerable because all it sees is that I have setuptools version 69.2.0. The information that it originated from the python-setuptools-wheel-69.2.0-10.fc41.noarch RPM package is lost. The version of that package I have installed on my system is not a reliable source of that information because I could have created that virtual environment with an older version of that RPM, or I could have installed setuptools from PyPI.
Therefore, I’d like to record the origin somehow so that security vulnerability scanners can figure it out.
I thought of:
1. Using the wheel Build tag. Unfortunately, that is only an integer. We could put there the number 10 from 69.2.0-10.fc41.noarch but if more distributors do that, the numbers might clash with each other.
2. Using a local version label. I.e. version the wheel as 69.2.0+10.fc41. I have not tested this approach yet. Again, unless the local version is very specific, it could clash with various distributions. For Fedora and RHEL we would use the full RPM Release field here, which is quite specific (and I’d say unique across our ecosystem).
3. Adding a new core metadata field (such as Origin) which could say e.g. python-setuptools-wheel-69.2.0-10.fc41.noarch.rpm or some other standardized string that identifies where the wheel was built from (e.g. PURL similarly to PEP 725).
4. Adding a new custom dist-info file (such as ORIGIN) with the above.
What would you think is the best approach? Would adding an additional core metadata field or file for this be a generally acceptable use case?
(Side note: Happy to discuss this more at EuroPython as well.)
I think PEP 770 is applicable here: you can create an SBOM that has a single component (which is also the primary component), namely the project contained within the wheel, with software identifiers that distinguish the component as not coming from the Python Package Index:
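A minimal sketch of such a document, generated with Python's standard `json` module. This is an illustration under assumptions, not a definitive SBOM: the helper name `build_origin_sbom`, the `specVersion` value, and the exact PURL string are all hypothetical choices made for this example. The primary component is expressed through CycloneDX's `metadata.component`, and per PEP 770 the resulting file would live under the `.dist-info/sboms/` directory.

```python
import json


def build_origin_sbom(name, version, purl):
    """Return a minimal CycloneDX-style document with one primary component.

    Hypothetical helper for illustration; real SBOMs usually carry more
    fields (serial number, timestamps, tool information, and so on).
    """
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.6",  # assumed spec version for this sketch
        "version": 1,
        "metadata": {
            # metadata.component is the primary component of the SBOM
            "component": {
                "type": "library",
                "name": name,
                "version": version,
                # The PURL marks the component as coming from an RPM,
                # not from PyPI (which would be pkg:pypi/...).
                "purl": purl,
            }
        },
    }


# Assumed PURL for the Fedora package mentioned in the original post.
sbom = build_origin_sbom(
    "setuptools",
    "69.2.0",
    "pkg:rpm/fedora/python-setuptools-wheel@69.2.0-10.fc41?arch=noarch",
)
print(json.dumps(sbom, indent=2))
```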
Example above using a CycloneDX component. Scanners that support PEP 770 can then “know” the software identifier when scanning that environment, instead of inferring that the package is from PyPI.
Out of curiosity, is there a reason Fedora (and RHEL) aren’t using local version identifiers when carrying backports?
If I’m understanding the motivation correctly, this would be a pre-existing solution to your vulnerability scanner problem: PEP 440 defines the local identifier component explicitly so that integrators/downstreams can perform backports, and vulnerability scanners should be checking that component and skipping the package if there’s an indication that it isn’t the “true” version.
For example, that’s what pip-audit does – if the user presents a dependency like foopkg-1.0.0+ubuntu.0 then we assume that the local version has diverged from any vulnerability information we could retrieve for foopkg-1.0.0 from PyPI.
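The check described above can be sketched roughly as follows. This is not pip-audit's actual code, just a simplified illustration using only the standard library; the function names `split_local` and `should_skip_lookup` are made up for this example, and a real tool would parse versions with a proper PEP 440 parser instead of splitting on `+`.

```python
def split_local(version):
    """Split a version string into (public, local) parts.

    In PEP 440 the local version label follows a single "+".
    Returns None for the local part when there is no label.
    """
    public, _, local = version.partition("+")
    return public, (local or None)


def should_skip_lookup(version):
    """Return True when the version carries a local label.

    A local label signals that a downstream has modified the package,
    so upstream vulnerability data for the public version may not apply.
    """
    return split_local(version)[1] is not None


print(split_local("1.0.0+ubuntu.0"))
print(should_skip_lookup("1.0.0+ubuntu.0"))
print(should_skip_lookup("1.0.0"))
```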
There is no reason except that we have never done that before. It’s something that comes up now and then and was one of the solutions to the problem I was considering (it’s actually listed as the second option).
EDIT: I guess that one thing that bothered me with this approach is the need to adjust the version when we build the wheel for it. If we wanted all Fedora packages to carry the +10.fc41 local version identifier, wouldn’t we need a build-backend-agnostic way of doing it?
Whoops, I completely missed that in the original post. My apologies!
For my 0.02c, the local version identifier is what I’d recommend: it’s what other distros (Debian and Ubuntu, and probably others?) are already doing to communicate that a version has been modified with backports. And with my pip-audit maintainer hat on, it’s a lot easier (and faster) for me to handle than having to poke into each distribution’s wheel or dist-info metadata, e.g. in lockfile-only settings.
If my understanding of the Release field in RPM is right, I think that would be more than sufficient for uniqueness.
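To make the idea concrete, here is a rough sketch of deriving a local version label from an RPM Release string. The helper `with_local_version` is hypothetical, and the normalization rule (collapse anything that is not an ASCII letter or digit into a period, then lowercase) is an assumption made for this example based on PEP 440 allowing only letters, digits, and periods in normalized local labels. It is not something Fedora does today.

```python
import re


def with_local_version(public_version, rpm_release):
    """Append an RPM Release string as a PEP 440 local version label.

    Hypothetical helper: characters outside [A-Za-z0-9] are collapsed
    into single periods so the label stays within what PEP 440 permits.
    """
    if "+" in public_version:
        raise ValueError("version already has a local label")
    label = re.sub(r"[^A-Za-z0-9]+", ".", rpm_release).strip(".").lower()
    if not label:
        raise ValueError("release string yields an empty local label")
    return f"{public_version}+{label}"


print(with_local_version("69.2.0", "10.fc41"))
```

Any such rewriting would still need to happen at wheel build time, independently of the build backend in use.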
I also think that SBOM sounds like the best solution for your specific issue.
That said, to answer a few other points:
For the record, if you were to pursue this route, then the *.dist-info/WHEEL may be a better choice, since it’s information specific to the wheel.
Do you happen to know how well that works for them? I think I’ve seen packages fail when a local version identifier is used, mostly when they try to parse their own version and don’t expect the +.
I’m not involved in Ubuntu/Debian packaging at all (besides as a user) so I can’t say anything assertively there, but it definitely appears to be working from my downstream vantage point – pip-audit receives occasional reports about “missed” findings on local version identifiers, which indicate at least some degree of successful adoption.
We (Debian) use a scheme like local versions to communicate our own revisions. But that’s for our native package format, not the Python packaging version. Almost universally, those are left untouched.
Historically, in the cases where the Debian version was exposed as the Python package version, this broke newer Python tools once PEP 440 got enforced.
I guess a local version identifier is indeed too opaque.
If we have setuptools version 69.2.0+10.fc41, then any code (we don’t control) that tries to parse that with some naïve split-on-dot code or that compares versions as strings for equality will break.
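A small illustration of that kind of breakage. The `naive_parse` function below is a made-up example of the split-on-dot pattern, not code from any particular project.

```python
def naive_parse(version):
    """Naive version parsing: split on dots and convert each part to int."""
    return tuple(int(part) for part in version.split("."))


def try_parse(version):
    """Return the parsed tuple, or None if the naive parser rejects it."""
    try:
        return naive_parse(version)
    except ValueError:
        return None


print(try_parse("69.2.0"))          # parses fine
print(try_parse("69.2.0+10.fc41"))  # "0+10" and "fc41" are not integers

# String equality against the upstream version also stops matching.
print("69.2.0+10.fc41" == "69.2.0")
```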
That’s why I would prefer less opaque metadata for this. The SBOM looks like a nice solution.