Why isn't source distribution metadata trustworthy? Can we make it so?

PEP 517 backends and setuptools (as used via setup.py) generate source distributions containing a PKG-INFO file, which is supposed to contain the metadata associated with the package. Currently pip does not use this information; it opts to get the metadata from the build system instead. This involves either of the following (a rough sketch of the first path follows the list):

  1. For PEP 517: set up the backend execution environment (possibly installing multiple packages in the process) and execute prepare_metadata_for_build_wheel in a subprocess - which may involve building a whole wheel if the backend does not provide that hook
  2. For legacy setup.py packages: run setup.py egg_info in a subprocess
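
For illustration, here is a heavily simplified sketch of what the first path amounts to, ignoring the build-environment setup and subprocess isolation that pip actually performs. It assumes the sdist is already unpacked and the backend is importable; the function name and output directory are invented for the example.

    # Rough sketch of PEP 517 metadata preparation (no isolation, no subprocess).
    import importlib
    import os

    try:
        import tomllib  # Python 3.11+
    except ImportError:
        import tomli as tomllib  # assumption: tomli is installed on older Pythons

    def prepare_metadata(output_dir="metadata-out"):
        with open("pyproject.toml", "rb") as f:
            backend_spec = tomllib.load(f)["build-system"]["build-backend"]
        module_name, _, obj_path = backend_spec.partition(":")
        backend = importlib.import_module(module_name)
        for attr in filter(None, obj_path.split(".")):
            backend = getattr(backend, attr)

        os.makedirs(output_dir, exist_ok=True)
        hook = getattr(backend, "prepare_metadata_for_build_wheel", None)
        if hook is not None:
            # Cheap path: the backend writes only the .dist-info directory.
            return os.path.join(output_dir, hook(output_dir))
        # Expensive fallback: build an entire wheel just to read its metadata.
        return backend.build_wheel(output_dir)  # then read *.dist-info/METADATA from it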

The reason we do this is that there is currently no guarantee that PKG-INFO is complete. This is trivially confirmed by inspecting e.g. requests-2.22.0/PKG-INFO from requests-2.22.0.tar.gz vs requests-2.22.0.dist-info/METADATA in requests-2.22.0-py2.py3-none-any.whl. The former is missing several important fields, like Requires-Dist, which are present in the latter.
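
If you want to check for yourself, something like the following works, assuming both files from the example above have been downloaded locally:

    # Compare Requires-Dist in an sdist's PKG-INFO vs a wheel's METADATA.
    # Both are RFC 822-style documents, so email.parser can read them.
    import tarfile
    import zipfile
    from email.parser import BytesParser

    with tarfile.open("requests-2.22.0.tar.gz") as sdist:
        pkg_info = BytesParser().parsebytes(
            sdist.extractfile("requests-2.22.0/PKG-INFO").read()
        )

    with zipfile.ZipFile("requests-2.22.0-py2.py3-none-any.whl") as wheel:
        metadata = BytesParser().parsebytes(
            wheel.read("requests-2.22.0.dist-info/METADATA")
        )

    print("sdist Requires-Dist:", pkg_info.get_all("Requires-Dist"))  # None
    print("wheel Requires-Dist:", metadata.get_all("Requires-Dist"))  # populated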

I would like to add a field to the allowed package metadata of source distributions that would signal to metadata processors that the backend does not need to be consulted. Example spelling: Metadata-Covers: all or Metadata-Covers: Name, Version, Requires-Dist, Requires-Python.
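
Purely as an illustration, an sdist's PKG-INFO carrying such a field might look like the following (project name, version and requirements are invented, and the field name is only the suggested spelling above):

    Metadata-Version: 2.1
    Name: example-project
    Version: 1.0.0
    Requires-Python: >=3.6
    Requires-Dist: requests (>=2.20)
    Metadata-Covers: Name, Version, Requires-Dist, Requires-Python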

This would enable tools like pip to avoid the overhead of creating and tearing down build environments and doing subprocess invocations. The possible benefits increase when considering the upcoming dependency resolver, which may need to download and query metadata for multiple versions of each project.

I checked PEP 566 and the Core metadata specification but didn’t see any such field listed. I’m sure this would have been discussed as part of the development of PEP 517/518, but searching here and on distutils-sig didn’t turn up anything specifically mentioning this case (threads 1, 2, 3, 4).

Is obtaining metadata from requirements expressed as VCS references (git+https://…) considered a related question?

I would assume not, as there’s no already-completed “build” process for such requirements. So we have to assume such data is untrusted, as there’s been no chance to validate it. (Yes, we can say that projects should create a file xxx that contains metadata in this format, but without knowing that the file has been validated, tools can’t rely on it).

So the issue is actually that the metadata generated for wheels and sdists is not the same in the case of setuptools. Without having looked into it in much depth, I would argue this is a bug in the setuptools build system. Indeed, it is a known issue: https://github.com/pypa/setuptools/issues/1716.

+1 on treating this as “just” a backend bug. The problem is that once lost, trust is hard to regain - how can pip (or any other front end) detect that the backend is trustworthy in this respect?

Requiring backends to add a metadata field that (in effect) says “I don’t have a bug” seems a bit silly (and worse still, it would have to be added to the metadata standard as a required field, to be of any use!) but I can’t think of a better alternative.

There will also be cases where the sdist simply doesn’t know all the metadata for the final wheel, because it varies depending on what happens during the build. So we could think of this proposed field as “I can promise that my wheel metadata is not dynamic and will match the sdist”, rather than just “I don’t have a bug”.

Also if we did this, I think the trick would have to be that if you set this flag, then pip and other build tools need to actually enforce it, by comparing the sdist and wheel metadata and erroring out if they don’t match. That’s the only way to make it actually trustworthy.
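
Something along these lines is what that enforcement could look like, assuming the Metadata-Covers spelling from the first post (the helper function is hypothetical; real enforcement would live inside pip’s build machinery):

    # Hypothetical check: if the sdist claims certain fields are covered,
    # fail when the freshly built wheel's metadata disagrees.
    from email.message import Message

    def check_covered_fields(pkg_info: Message, wheel_metadata: Message) -> None:
        covers = pkg_info.get("Metadata-Covers")
        if covers is None:
            return  # nothing was promised, so nothing to enforce
        if covers.strip().lower() == "all":
            fields = set(pkg_info.keys()) | set(wheel_metadata.keys())
        else:
            fields = {f.strip() for f in covers.split(",")}
        for field in fields - {"Metadata-Covers"}:
            sdist_values = sorted(pkg_info.get_all(field) or [])
            wheel_values = sorted(wheel_metadata.get_all(field) or [])
            if sdist_values != wheel_values:
                raise RuntimeError(
                    f"sdist promised {field!r} was static, but the built wheel disagrees"
                )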

But setuptools will never be able to set this flag automatically, because setuptools has no idea whether any given setup.py has tricky dynamicity in it. Which means that this flag would have to be something that individual projects have to opt in to. Which is fine for projects that have active and diligent maintainers. … But those projects mostly distribute wheels already, so this flag is unnecessary. The projects that need it are the ones that only distribute sdists. Some of those projects do have active maintainers that could potentially be convinced to add this flag. But I think to make a real dent in the missing-metadata problem, you’ll need to find something that works for the inactive-but-still-used projects, and an opt-in flag won’t help with those.

Yes, exactly.

I’m also on board for enforcing metadata consistency. That is similar to an issue I filed here.

There are other use cases that would make this worthwhile, specifically:

  • Users that pass --no-binary :all:, where we won’t consider remote wheels
  • Users on platforms that don’t have pre-built wheels

The primary focus for me is coming to a conclusion on whether we can make anything about this PKG-INFO useful, or rule it out entirely. An opt-in flag is the only way I see forward on that front. I agree it will not help inactive-but-used projects.

This is not entirely true. There is an open issue to populate Requires-Dist for sdists if-and-only-if install_requires is specified in setup.cfg and not in setup.py.
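
For reference, the declarative form in question looks like this in setup.cfg (project name and requirements are invented for the example):

    [metadata]
    name = example-project
    version = 1.0.0

    [options]
    install_requires =
        requests >= 2.20
        idna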

Of course, one can make the argument that a setup.py could do this:

    ...

    if some_condition:
        kwargs['install_requires'] = ["something"]

    setup(**kwargs)

Even if install_requires is specified in setup.cfg. And even if we’re forced to consider this a possibility and not auto-set the flag, we have other options of decreasing value, e.g. only set the flag if the setup.py was generated by setuptools itself, or use the ast module to parse setup.py and only set the flag if setup() is called with enumerated literal options and without install_requires (a rough sketch of the latter follows).
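
Here is what that ast-based heuristic could look like, deliberately erring on the side of false negatives: report “static” only when a single setup() call is made directly with literal keyword arguments, and give up on anything else (the function name is invented, and real-world setup.py files will defeat this in plenty of ways):

    # Heuristic: treat setup.py as "static" only if its single setup() call
    # uses nothing but literal keyword arguments. Aliased imports, computed
    # values, positional args and **kwargs are all treated as dynamic.
    import ast

    def looks_static(setup_py_source: str) -> bool:
        tree = ast.parse(setup_py_source)
        setup_calls = [
            node for node in ast.walk(tree)
            if isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "setup"
        ]
        if len(setup_calls) != 1:
            return False
        call = setup_calls[0]
        if call.args or any(kw.arg is None for kw in call.keywords):
            return False  # positional arguments or **kwargs: give up
        for kw in call.keywords:
            try:
                ast.literal_eval(kw.value)  # only literal values count as static
            except ValueError:
                return False
        return True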

In any case, it’s definitely true that it’s somewhat tricky, but if we combine some zero-false-positive heuristics with education and documentation about the use of declarative metadata, we may get to a world where for the most part, even source distributions have reliable dependency metadata in setuptools.

One thing that I have no feel at all for is how significant a proportion of projects fall into this category. Are we talking about 50% of downloads from PyPI? Or 10%? Or 1%? Are download counts important, or would some other measure better capture “importance” here?

For me, this is a classic 80-20 style of problem: if we could benefit 80% of the cases, I’d be happy with that. The added wrinkle, though, is that we have no real idea where to draw the line between the 80 and the 20. So we too often end up paralysed, unable to make progress because we can’t judge the importance of the use cases we’re considering.

Not entirely true: there are people who install with --no-binary, for example, as well as people on platforms where wheels aren’t available (am I right that Docker images that use musl don’t have wheels, for example?). Again, a better understanding of use cases would help here.

If we’re going this far, we might as well start doing “static evaluation” of setup.py to check whether there’s anything dynamic happening in it – @techalchemy had something for this if I remember correctly.

I do indeed. It’s very poorly implemented, though. It’s probably imperfect, but it is definitely possible to traverse the AST for this information. It gets tricky because sometimes people import setup under an alias, e.g. from setuptools import setup as do_stuff (I’ve seen approximations of this), and on one occasion I even saw someone rely on a directory-local import of their own code which in turn imported setuptools.setup (from .mymodule import my_version_of_setup, which then called setuptools.setup). I do not believe my code handles that case 🙂

This is an interesting conversation, and one possibly relevant bit of information I would like to add is that I was recently at a packaging summit hosted by Microsoft with folks from npm, Go, Java (Maven/Gradle), OCI, NuGet and a few others, and the overarching theme seemed to be enforcement – putting tools in front of the upload process, whether they strictly enforce or simply encourage the desired behaviors (someone from GitHub suggested that they could fail a check if a project’s wheels lacked metadata after a build). Rather than trusting that the user supplied good metadata, there was a lot of interest in actually validating, or where possible generating, the metadata at the index.

This is obviously super nuanced and I’m hand-waving over tons of complexity, but I think we are relatively smart and can probably get a basic solution working. It’s in keeping with what we discussed at PyCon last year, and it ultimately all comes down to metadata.

As of last month I accepted a partly sponsored role with Canonical, and I’ll be spending a chunk of my time on packaging-related work, so I’ll be glad to catch up on these. I believe @pradyunsg and I were supposed to draft a PEP related to extras based on some of the work @njs had done as an outcome of PyCon last year, but due to a lot of factors I hadn’t had any time to do anything open-source related. Now that I have time, I’d be glad to pick that back up (I’m sure it’s discussed on Discourse somewhere).

To the original question, I’d suggest caution around making adjustments to metadata PEPs ahead of the resolver work in pip unless we are prepared to tackle the full extent of the issue surrounding our current metadata representations (see the previous paragraph about extras). If that’s something we are willing to tackle head on, I think it does make sense to do that first, however.

Sorry for the many words but hopefully that was mostly on-topic and clear.

I’m going to reiterate a point I’ve made elsewhere on this, though. How far should we go to support such usages? What requirements drive the use of such unusual approaches for the projects using them, and are those requirements sufficiently compelling to justify the significant amount of extra work required of the packaging tool community to cater for those usages?

I strongly believe we should avoid getting trapped in a mindset that says that we have to support absolutely every usage of setuptools imaginable, across all packaging tools. If the requirement for a particular project is strong enough, “use an older version of pip” is an option - and if the cost to the project of doing that is too high, then maybe the cost of supporting that usage in the packaging tools should also be considered too high.

We have to be cautious here - breaking backward compatibility should never be something we do lightly - but we should have the option available when it’s needed.

Yeah. I’m going to say we should go for a system that covers the basic case: a literal defined in setup.py should be considered canonical.

Anything else we can tackle if and when we see the need.

Completely agreed

Without getting too far into the weeds, we are basically on the same page. Ultimately (and I realize this position may still be controversial) I think we need to move as far away from executable package manifests as possible, i.e. define metadata in one place and, if needed, build extensions in another. As long as we are stuck asking the question “do we need to write an AST parser to read install_requires information, or should we run python setup.py egg_info and parse the resulting metadata?”, we are going to keep building these overly complicated workarounds just to get basic metadata.

So what would we need to do in order to make that happen? What are the major barriers?

I’m saying setuptools should do these shenanigans to determine whether the metadata from setup.py is “stable”. A field added to the metadata specification for declaring that an sdist’s dependency data is “stable” would be good to have too.
