Supporting sdists and source trees in PEP 665

No, because a consumer that only supports wheels (i.e., it supports what the PEP defines and nothing more) should be able to reject lockfiles that it can’t handle before doing a bunch of work trying to install stuff. Such a consumer can’t check the [tool] section for that, because the PEP explicitly assigns no semantics to that section.
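
As a rough illustration of that "reject before doing any work" check, here is a minimal sketch. It assumes the consumer has already pulled the list of artifact URLs out of the lockfile. The function names and the `example.invalid` URLs are hypothetical; nothing here is defined by the PEP.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def non_wheel_sources(urls):
    """Return the URLs whose final path component is not a .whl file."""
    rejected = []
    for url in urls:
        # urlsplit drops any query string or fragment (e.g. "#sha256=...").
        filename = PurePosixPath(urlsplit(url).path).name
        if not filename.endswith(".whl"):
            rejected.append(url)
    return rejected


def check_lockfile_sources(urls):
    """Fail fast, before installing anything, if a source is not a wheel."""
    bad = non_wheel_sources(urls)
    if bad:
        raise ValueError(f"unsupported non-wheel sources: {bad}")


# Hypothetical input: one wheel, one sdist.
sources = [
    "https://example.invalid/demo-1.0-py3-none-any.whl",
    "https://example.invalid/demo-1.0.tar.gz",
]
```

The point is only that the check has to be possible from standardised data in the lockfile, not from anything in [tool].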

To be honest, I don’t see why you’re trying so hard to avoid just saying “the PEP requires all sources to be wheels”. Doing so doesn’t make it any harder to write a follow-up PEP adding sdist support, and people wanting to experiment before developing a sdist support PEP can do whatever they want; they just can’t claim that the lockfiles they use follow the standard.

I think this is the most reasonable position to take at this point. If no-one offers a proposal by tomorrow (25th), then I assume you’ll declare sdist support out of scope for PEP 665, and update the PEP accordingly, and this discussion can move back to the main PEP 665 topic (where I’ll be happy to make the case that the PEP should explicitly require sources to be wheels, on the grounds that the PEP has deliberately rejected supporting non-wheel sources :wink:).

I’ve asked @pradyunsg and @uranusjr privately if they are okay with this, but I have not heard back from them. But yes, what you outlined is my assumption of what will happen if no one steps forward.


Oh hai.

I’m back to being subscribed on all things in the Packaging category now. :sweat_smile:

TL;DR: I think it’s a good idea to exclude sdists from this PEP. There are complicated tradeoffs here, and I’d prefer that those get discussed in a separate standardisation effort.

Adding sdists to a lockfile exposes us to a fairly large set of complications that I don’t think we should even try to solve, certainly not in the first iteration of this format.

Not only does it require wrangling with build environments on a per-sdist basis (which is somewhat tractable but not really, as discussed earlier in this thread and later in this post), it also quickly stops being merely a Python environment management concern: the build system configuration and possibly even system configuration matter for package builds (e.g. packages do compile differently based on what’s available on the system, or based on the platform, or can even be doing random.choice(["foo", "bar"]) for the built artifact’s contents). I don’t think there’s any way we’re solving this in general with the existing structure of our ecosystem, not without adding additional constraints or making additional assumptions here.

As far as I can tell, the only way to solve this in general would be to have some mechanism to validate that a wheel built from an sdist has “the right contents” [1], plus some way to communicate that validation information between the locker and the installer in a platform-agnostic manner. I consider this to be an intractable problem.

At the level that our existing tools and standards operate on, the only thing that we can reliably control for determinism in a cross-platform manner is the incoming sdist artifact (i.e. URL + hash; same for VCS). As for everything else:

  • We can’t reliably pin the build-time dependencies of a package (thanks to get_requires_for_build_wheel).
  • We can’t reliably check that the build system behaved/will behave the exact same way, especially across platforms.
  • We can’t reliably check the generated wheel matches what the locker “intended”.
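
To make the first bullet concrete, here is a sketch of what a build backend is allowed to do. `get_requires_for_build_wheel` is the optional PEP 517 hook; the `use-cython` config key and the `hypothetical-windows-helper` package are made up for illustration and don’t refer to any real backend.

```python
import sys


def get_requires_for_build_wheel(config_settings=None):
    """PEP 517 optional hook: extra build requirements, computed at build time."""
    requires = ["wheel"]
    # The answer may depend on the platform doing the build...
    if sys.platform == "win32":
        requires.append("hypothetical-windows-helper")  # invented name
    # ...or on settings passed in by whoever invokes the build.
    if (config_settings or {}).get("use-cython") == "true":  # invented key
        requires.append("cython")
    return requires
```

A locker running on one machine only ever sees one of these answers, so it has no reliable way to pin what an installer on another machine will be asked to provide.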

That said, there are relaxations/assumptions that we can make here, which expand the set of sdists that can be considered acceptable.

  • Assuming that there won’t be any dynamic behaviours in get_requires_for_build_wheel enables pinning the build-time dependencies.
  • Making the lockfile limited to a certain platform makes it feasible to require a specific wheel filename from an sdist.
  • Requiring the build systems to respect reproducible builds enables adding in hashes for the generated wheels.
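
Under the last two assumptions, the installer-side check could be as small as the sketch below. The function name and the idea that the lockfile records an expected wheel filename and SHA-256 are assumptions for illustration; no such fields exist in the PEP.

```python
import hashlib


def wheel_matches_lock(filename, data, expected_filename, expected_sha256):
    """Check a freshly built wheel against what the locker recorded.

    filename/data describe the wheel the installer just built;
    expected_* are hypothetical values stored in the lockfile.
    """
    if filename != expected_filename:
        return False
    return hashlib.sha256(data).hexdigest() == expected_sha256
```

This only means anything if the build is reproducible; otherwise the hash comparison fails even for a perfectly good wheel.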

FWIW, I think we definitely have projects that are available as sdists and that fit all of these assumptions [2]. Those would effectively be usable without diluting any of the installation determinism and reproducibility guarantees of this lockfile (you still lose the no-Python-code-executed semantics, but… that’s a lost cause once you’re using sdists anyway).

The thing is, I don’t think these are safe assumptions to make in general. There are lots of tradeoffs to be considered here, since we’re exchanging {security, determinism, reproducibility} for {workflow flexibility, compatibility with more packages}. I don’t think this PEP locks out the potential for exploring these later, especially since we all agree that any sdist-consumption semantics should be opt-in anyway.

Figuring out the security and usability implications of such assumptions, and making an opinionated choice of which set of tradeoffs to go with here… that is the problem I’d like to punt over to a follow up effort. :slight_smile:

FWIW, I think it’s also self-evident that we have users who are already happy with workflows where the only thing pinned for an sdist is the file’s hash/VCS hash, and figuring out the entire build story is deferred to the installer. This is provided today by “locked requirements.txt” [3] files and is also what Poetry does in its lockfile format.

So… the answer here might just be “we pin only what we can pin for every sdist (URL + hash), and all other bets are off”. If that’s really what we want, that PEP might have fewer words than this post. :stuck_out_tongue:

I’m happy that we’re all on the same page here. :grin:

  1. Ideally, this is something that can’t be “spoofed” easily. Expecting reproducible builds would allow using the hash of the final artifact for this. ↩︎

  2. For example: an sdist that has no dynamic build dependencies, respects reproducible builds and generates a single platform-agnostic wheel is basically something that we can “lock” to generate the exact same wheel each time. This sdist will basically always generate the same wheel with reproducible builds, which is true for all flit-based projects and all (pure-Python) setuptools-based projects without custom build steps. ↩︎

  3. As generated by pip-compile – pinned with hashes. ↩︎


I just updated the PEP with sdist support as a rejected idea. I’m going to lock this topic and have it point back to the “take 2” topic so that any future discussion can be done from the perspective of PEP 665 accepted/rejected.

Thanks everyone for discussing this!

sdists in PEP 665 have been rejected. See PEP 665, take 2 -- A file format to list Python dependencies for reproducibility of an application - #93 by brettcannon.