PEP 751: lock files (again)

brettcannon · August 8, 2024, 9:50pm

Yes.

That’s doable as a per-file locking scenario since you have a finite set of environments to support.

Correct, hence the support for sdists and locking the build requirements.

Correct, and that’s supported.

…

Correct. Some people want maximum information to understand why something was included or what influence something had on a lock, while others like yourself don’t. I made non-critical info optional for that reason.

And my PoC can lock for multiple platforms simultaneously, so it’s doable, and the design is meant to allow it.

Yes, because there was an ask to see if they could be closer together to alleviate differences.

That seems right.

I’m still thinking through the best way to represent that situation. It’s probably keying on one and then listing what applies to a file for the second dimension.

The problem I realized last night is the envs ✕ groups matrix: if a file (or package version) makes sense for one environment and dependency group pair, you can’t guarantee that it holds for all groups under that environment (and vice-versa), because metadata could vary between files. So it probably requires two levels to specify the env and group. (There’s a reason the PEP currently doesn’t try to tackle dependency groups and expects people to create different lock files per dependency group in that case.)
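To make the matrix problem concrete, here is a rough Python sketch. It is not the PEP's format; the file names, environment names, and group names are all made up for illustration. It shows a file whose applicable (env, group) pairs do not form a full matrix, and one possible two-level layout that keys on the environment first and the group second.

```python
from itertools import product

# Hypothetical data: the (environment, dependency group) pairs each file applies to.
applicability = {
    "a-1.0.whl": {
        ("linux", "docs"), ("linux", "test"),
        ("windows", "docs"), ("windows", "test"),
    },
    # Applies to only two of the four possible pairs.
    "b-2.0.whl": {
        ("linux", "test"),
        ("windows", "docs"),
    },
}


def is_full_matrix(pairs):
    """Return True if the pairs cover every env/group combination they mention."""
    envs = {env for env, _ in pairs}
    groups = {group for _, group in pairs}
    return pairs == set(product(envs, groups))


def by_env_then_group(applicability):
    """Re-key the data as env -> group -> list of file names."""
    nested = {}
    for filename, pairs in applicability.items():
        for env, group in sorted(pairs):
            nested.setdefault(env, {}).setdefault(group, []).append(filename)
    return nested


nested = by_env_then_group(applicability)
```

Here `is_full_matrix` is true for `a-1.0.whl` but false for `b-2.0.whl`, so listing "envs" and "groups" separately for that file would overstate where it applies. The nested mapping avoids that by recording each pair explicitly.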

One thing I want to try and clarify is why there’s this distinction between per-file and per-package locking, and it comes down to why you’re locking stuff to begin with.

I think there are two reasons why people typically want reproducible environments: consistency and security. Now, if you just want consistency (e.g., everyone has the same package versions when they build the docs), then you just need a file format that can encode that information. Whether it’s especially readable or not isn’t a major concern as long as the results of using that lock file lead to a consistent outcome. This also lends itself to trying to be open-ended about which environments you support. This is very much a “get stuff done” side of things.

But for security, you need to be able to audit what’s going on in the file. Now you could use tools to help with that and thus continue to not care about how the file format looks, but we all know it’s easy to just say, “eh, the diff looks good” and approve/merge a pull request w/o running some tool. But if you lower the barrier of understanding what’s going on in a lock file, I hope it at least increases how much people would pay attention to what’s going on. This is the “every detail matters” side.

And those two goals of consistency and security line up with per-package and per-file locking, respectively. Since I’m coming at this from a desire for security, that’s why the PEP talks about readable diffs, keeping information together so you don’t have to scroll around a file to understand things (keeping the cognitive load down so you don’t miss key details or get lazy about checking something), simple install semantics, requiring hash verification, including details to create SBOMs, etc.
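As a minimal sketch of what hash verification means at install time, assuming a lock entry records a SHA-256 hex digest per file (the function name and the choice of SHA-256 are assumptions for illustration, not something the PEP text above specifies):

```python
import hashlib
import hmac


def verify_file(data: bytes, expected_sha256: str) -> bool:
    """Return True only if the file contents match the locked digest."""
    actual = hashlib.sha256(data).hexdigest()
    # Normalize case, then compare in constant time.
    return hmac.compare_digest(actual, expected_sha256.lower())


# Hypothetical usage: the digest would normally come from the lock file.
contents = b"example wheel bytes"
locked_digest = hashlib.sha256(contents).hexdigest()
ok = verify_file(contents, locked_digest)
tampered = verify_file(b"something else", locked_digest)
```

An installer following this approach would refuse to install any file for which the check fails.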

My hope is we can somehow come up w/ a format that serves both reasons people want reproducible environments. I’m giving it this week to see if that’s possible. But if it’s not, then I am willing to strip my PEP down to just security-focused, per-file locking like I originally proposed before my parental leave, and have a separate PEP for just consistency, open-ended environment support, and per-package locking (whether I’m involved in that PEP is an open question). I will probably poll this group next week if there isn’t obvious consensus as to what direction to take by then.