Oh hai.
I’m back to being subscribed on all things in the Packaging category now. ![:sweat_smile: :sweat_smile:](https://emoji.discourse-cdn.com/apple/sweat_smile.png?v=12)
TL;DR: I think it’s a good idea to exclude sdists from this PEP. There are complicated tradeoffs here, and I’d prefer that those get discussed in a separate standardisation effort.
Adding sdists to a lockfile exposes us to a fairly large set of complications that I don’t think we should even try to solve, certainly not in the first iteration of this format.
Not only does it require wrangling with build environments on a per-sdist basis (which is somewhat tractable but not really, as discussed earlier in this thread and later in this post), it also quickly stops being merely a Python environment management concern: the build system configuration and possibly even system configuration matter for package builds (e.g. packages do compile differently based on what’s available on the system, or based on the platform, or can even be doing `random.choice(["foo", "bar"])` for the built artifact’s contents). I don’t think there’s any way we’re solving this in general with the existing structure of our ecosystem, not without adding additional constraints or making additional assumptions here.
As far as I can tell, the only way to solve this in general would be to have some mechanism to validate that a wheel built from an sdist has “the right contents”, plus some way to communicate that validation information between the locker and the installer in a platform-agnostic manner. I consider this to be a non-tractable problem.
At the level that our existing tools and standards operate on, the only thing that we can reliably control for determinism in a cross-platform manner is the incoming sdist artifact (i.e. URL + hash; same for VCS). As for everything else:
- We can’t reliably pin the build-time dependencies of a package (thanks to `get_requires_for_build_wheel`).
- We can’t reliably check that the build system behaved/will behave the exact same way, especially across platforms.
- We can’t reliably check the generated wheel matches what the locker “intended”.
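To make the first point concrete, here’s a rough sketch of what a PEP 517 backend is allowed to do in `get_requires_for_build_wheel`. The hook name and signature come from PEP 517; the package names and the `with-cython` setting are made up for illustration.

```python
import sys


def get_requires_for_build_wheel(config_settings=None):
    """PEP 517 hook: extra build requirements, computed when the build runs."""
    # Hypothetical requirement names; a real backend returns its own.
    requires = ["example-build-helper"]

    # The result may depend on the platform the build happens on...
    if sys.platform == "win32":
        requires.append("example-windows-shim")

    # ...or on settings passed in by whoever invokes the build.
    if config_settings and config_settings.get("with-cython"):
        requires.append("cython")

    return requires
```

A locker running on Linux would never see `example-windows-shim`, so it has nothing to pin for an installer running on Windows.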
That said, there are relaxations/assumptions that we can make here, which expand what sdists can be considered acceptable.
- Assuming that there won’t be any dynamic behaviours in `get_requires_for_build_wheel` enables pinning the build-time dependencies.
- Making the lockfile limited to a certain platform makes it feasible to require a specific wheel filename from an sdist.
- Requiring the build systems to respect reproducible builds enables adding in hashes for the generated wheels.
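If the last two assumptions hold, an installer could check a locally built wheel against what the locker recorded. A minimal sketch, assuming the lockfile stores an expected wheel filename and SHA-256 digest (the function names and the idea of storing these fields are my assumptions, not something this PEP defines):

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def wheel_matches_lock(wheel_path, expected_filename, expected_sha256):
    """Check a built wheel against the filename and hash a locker recorded."""
    wheel_path = Path(wheel_path)
    # A different filename means different tags (platform, ABI, ...).
    if wheel_path.name != expected_filename:
        return False
    # The hash only has a chance of matching if the build is reproducible.
    return sha256_of(wheel_path) == expected_sha256
```

Note that the hash comparison is only meaningful for build systems that actually produce byte-for-byte reproducible wheels; for everything else it would just fail.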
FWIW, I think we definitely have projects that are available as sdists that fit all of these assumptions. Those would effectively be usable without diluting any of the installation determinism and reproducibility guarantees of this lockfile (you still lose the no-Python-code-executed semantics, but… that’s a lost cause once you’re using sdists anyway).
The thing is, I don’t think these are safe assumptions to make in general. There are lots of tradeoffs to be considered here, since we’re exchanging {security, determinism, reproducibility} for {workflow flexibility, compatibility with more packages}. I don’t think this PEP locks out the potential for exploring these later, especially since we all agree that any sdist-consumption semantics should be opt-in anyway.
Figuring out the security and usability implications of such assumptions, and making an opinionated choice of which set of tradeoffs to go with here… that is the problem I’d like to punt over to a follow up effort. ![:slight_smile: :slight_smile:](https://emoji.discourse-cdn.com/apple/slight_smile.png?v=12)
FWIW, I think it’s also self-evident that we have users who are already happy with workflows where the only pinning done for sdists is the file’s hash/VCS hash, and figuring out the entire build story is deferred to the installer. This is provided today by “locked requirements.txt” files and is also what Poetry does in its lockfile format.
So… the answer here might just be “we pin only what we can pin for every sdist (URL + hash), and all other bets are off”. If that’s really what we want, that PEP might have fewer words than this post. ![:stuck_out_tongue: :stuck_out_tongue:](https://emoji.discourse-cdn.com/apple/stuck_out_tongue.png?v=12)
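For a sense of how small that is, here’s roughly all the state such a pin would carry. This is a sketch only: the `PinnedSdist` name and its fields are hypothetical and not a proposed format.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PinnedSdist:
    """Hypothetical lock entry: where the sdist lives and what it hashes to."""

    url: str
    sha256: str

    def verify(self, data: bytes) -> bool:
        # Only the incoming artifact is checked; the build itself is not.
        return hashlib.sha256(data).hexdigest() == self.sha256
```

Everything after `verify()` returns true (build dependencies, build behaviour, the resulting wheel) is left to the installer.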
I’m happy that we’re all on the same page here. ![:grin: :grin:](https://emoji.discourse-cdn.com/apple/grin.png?v=12)