What should a lockfile PEP (665 successor) look like?

flying-sheep · July 26, 2022, 8:54am

Continuing the discussion from PEP 665, take 2 – A file format to list Python dependencies for reproducibility of an application:

I wonder how we can get unstuck here. It seems like the main rejection reason was that it rejected the idea “Allowing Source Distributions and Source Trees to be an Opt-In, Supported File Format”

Due to [implementation complexity and lack of reproducibility], it was decided it was best to have a separate discussion about what supporting sdists and source trees after this PEP is accepted/rejected. As the proposed file format is versioned, introducing sdists and source tree support in a later PEP is doable.

Since the PEP was rejected, it seems like that discussion should happen now, right?

pradyunsg · July 26, 2022, 9:19am

I was looking at this issue yesterday, to see what we should do about it. See:

FWIW, I’m starting to feel like it would make sense to do a one-shot approach for designing the lockfile format, since the standardisation process for this is going to be really slow.

Two areas where a follow-up could improve on PEP 665 are support for source distributions and a clearer story for multi-platform locking. Beyond that, I think we need to (a) clarify the user workflows and how the lockfile fits within them and (b) establish the gradient of support across tooling and design to ensure that the user experience is still good.

This is basically a complicated design problem and a whole bunch of talking to people to establish where things should end up, before we even get to the point of being a publicly posted PEP draft.

Picking this back up is on my TODO list, but there’s a single-digit number of items before it and some of them have a similarly large scope. If someone wants to start chipping away at this before I can, I welcome that and would be happy to help (assuming that help is welcome, of course).

ofek · July 26, 2022, 2:05pm

This is my next focus for Hatch. I’ve been thinking about it every night for 2 weeks and I can’t find a great way to achieve cross-platform support, which affects me personally because I’m a full time Windows user.

The problem isn’t intractable, but it certainly feels that way so far.

brettcannon · July 26, 2022, 9:00pm

I’m still planning to do this in mousebender. Getting packaging.metadata parsing METADATA files is the next step in this work.

I still don’t want to touch that in a v1 as it is an inherent compromise in environment reproducibility.

Got an idea (What information is needed to choose the right dependency (file) for a platform? is part of it).

I have a design in my head that would allow for different lock files for different platforms. The steps necessary to get there are:

1. packaging.metadata to read METADATA files
2. Resolver using resolvelib that only considers wheels
3. Tool to generate the lock file(s)
4. Tool to install from the lock file(s)

Steps 2 - 3 I’m assuming will be in mousebender since they are somewhat specific, although I do plan to have this all accessible via APIs.

It’s only intractable if you try to please all the people all the time. If you are willing to bound the use-case and how people will interact with the tool you can get a solution, but you have to choose the target use-case first (my design addresses app/cloud/deployment developers).

EpicWink · July 26, 2022, 9:44pm

Hey, that’s me!

Regarding sdist locking, in my case I think I can produce identical wheels from sdists because I lock and deploy using the same Docker image. The primary barrier is file modification times, which should be solvable. Is there anything I’ve forgotten?

ofek · July 27, 2022, 2:30am

I do too (like wheel names) but I consider that so suboptimal that I’m self-blocking.

brettcannon · July 27, 2022, 7:21pm

Why do you think that? It’s a concept that’s widely understood (e.g. people know that manylinux means the wheel installs on Linux). Now I’m not suggesting there shouldn’t be some smarts in calculating what the actual supported wheel tag triple is (I have an algorithm in my head to calculate the least specific, but still fully accurate tag as well as how to optimize for specificity in either direction), but I don’t think there’s anything wrong with leaning on that mechanism. You’re already using those tags to do installs without the lock files anyway.
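To make the "wheel tag triple" concrete, here is a minimal sketch of reading the tags off a wheel filename and checking them against what a target platform accepts. It assumes the standard wheel filename layout (name-version[-build]-python-abi-platform.whl with dot-separated compressed tag sets). The names `WheelTags`, `parse_wheel_tags`, and `is_compatible` are hypothetical and not taken from mousebender or any other tool mentioned in this thread.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    interpreters: frozenset
    abis: frozenset
    platforms: frozenset


def parse_wheel_tags(filename: str) -> WheelTags:
    """Split the compressed tag triple off the end of a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # name-version[-build]-python-abi-platform: 5 or 6 dash-separated fields.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    python, abi, platform = parts[-3:]
    # Each field may be a dot-separated "compressed tag set", e.g. py2.py3.
    return WheelTags(
        frozenset(python.split(".")),
        frozenset(abi.split(".")),
        frozenset(platform.split(".")),
    )


def is_compatible(filename: str, supported) -> bool:
    """Return True if any (interpreter, abi, platform) triple in `supported`
    is covered by the tags in the wheel filename."""
    tags = parse_wheel_tags(filename)
    return any(
        py in tags.interpreters and abi in tags.abis and plat in tags.platforms
        for py, abi, plat in supported
    )
```

A per-platform lock file could, under this assumption, record the `supported` triples it was generated for, and an installer could refuse any wheel for which `is_compatible` returns False. Calculating the least specific tag that is still accurate, as described above, is a separate problem that this sketch does not attempt.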

brettcannon · July 27, 2022, 7:22pm

Assuming the wheels have any file paths stripped out, I don’t think there’s more to it than setting the modification time. @kushaldas might know of something that we are all forgetting.
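As a rough illustration of the modification-time point, the sketch below rewrites a wheel (a zip archive) so that entry order and timestamps no longer depend on when or how it was built. It assumes the same Python and zlib versions on both sides, as in the same-Docker-image setup described above, because compressed bytes can differ between zlib builds. The function names are hypothetical.

```python
import hashlib
import io
import zipfile

# Earliest timestamp the zip format can store; any fixed value would do.
FIXED_DATE_TIME = (1980, 1, 1, 0, 0, 0)


def normalize_archive(data: bytes) -> bytes:
    """Rewrite a zip/wheel with sorted entries and a fixed modification time."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as src, zipfile.ZipFile(out, "w") as dst:
        for name in sorted(src.namelist()):
            info = zipfile.ZipInfo(name, date_time=FIXED_DATE_TIME)
            info.compress_type = zipfile.ZIP_DEFLATED
            # Carry over permission bits so executables stay executable.
            info.external_attr = src.getinfo(name).external_attr
            dst.writestr(info, src.read(name))
    return out.getvalue()


def normalized_sha256(data: bytes) -> str:
    """Hash of the normalized archive, suitable for comparing two builds."""
    return hashlib.sha256(normalize_archive(data)).hexdigest()
```

Two builds of the same sdist would then count as identical when `normalized_sha256` agrees. This only covers timestamps and entry order; absolute file paths embedded in compiled extensions or generated files would still need to be stripped separately.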

FRidh · July 28, 2022, 11:22am

The way I interpret the PEP, it provided for reproducible installations from existing built artifacts, but it did not offer anything for building from source and thus for reproducible builds. A reproducible installation is fine for those that can use pre-built binaries, but there are many distributors/organizations that cannot or do not want to use pre-built binaries. PEP 665 lock files would be unusable for them.

When taking an approach as done by poetry, locking versions and recording all available artifacts, it becomes possible for third parties to choose which artifacts they want to use, based on their requirements. For example, poetry2nix uses source distributions by default, but there is an option to flip a bit and use wheels. That does not provide entirely for a reproducible installation (actually as poetry2nix is Nix it does, because we record the option we set), however, that could still be added as an extension to the lock file, e.g. a simple listing of files used when locking.

Thus, what I would prefer to see is a format similar to that used by poetry. And for those that want reproducible installations, add the list of artifacts used. It is up to tooling and the user to choose what they use. Whether they’re happy with having the same versions, or whether they want the same artifacts.
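A minimal sketch of that idea, under stated assumptions: each locked package pins one version, lists every known artifact for it, and optionally lists the files actually used at lock time. All class, field, and function names here are hypothetical; this is not poetry's real lock format or poetry2nix's API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Artifact:
    filename: str
    sha256: str

    @property
    def is_wheel(self) -> bool:
        return self.filename.endswith(".whl")


@dataclass
class LockedPackage:
    name: str
    version: str
    artifacts: list  # every known file for this exact version
    used: list = field(default_factory=list)  # filenames installed at lock time


def candidates(pkg: LockedPackage, prefer: str = "wheel", exact: bool = False) -> list:
    """Pick artifacts for one locked package.

    prefer: "wheel" or "sdist"; this is the consumer's policy, not the lock file's.
    exact:  if True, only consider the files recorded as used when locking.
    Falls back to whatever is available if the preferred kind is missing.
    """
    if prefer not in {"wheel", "sdist"}:
        raise ValueError(f"unknown preference: {prefer!r}")
    pool = pkg.artifacts
    if exact:
        pool = [a for a in pool if a.filename in pkg.used]
    want_wheel = prefer == "wheel"
    preferred = [a for a in pool if a.is_wheel == want_wheel]
    return preferred or pool
```

In this sketch a source-first consumer would call `candidates(pkg, prefer="sdist")`, while someone who wants the same artifacts as the locker would pass `exact=True`.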

The real tricky part with the locking is what to do with build systems. A package used as both a build tool and during run time doesn’t have to be the same version. Also, different packages can be built with different versions of build tools. Though, they might have to be the same in some cases (or at least within a certain version range, thinking numpy/scipy here). Of course, when considering only a reproducible installation this whole part is irrelevant, and that greatly simplifies things.

pf_moore · July 28, 2022, 12:42pm

Correct, and that was absolutely explicit. The PEP rejected supporting source trees, but that’s not directly why PEP 665 failed - I’d have quite happily approved a “binary only” PEP.

The key problems with PEP 665 only supporting binary artifacts (speaking as the PEP delegate on that PEP) were:

A significant proportion of the community said that a lockfile without source support was useless for them. That reduced the benefits of the PEP fairly significantly (people suggesting that a “version 2” lockfile with source support would be useful to them were still people who wouldn’t benefit from PEP 665 lockfiles).
In a world where a non-trivial number of packages are only distributed in source form, I didn’t feel PEP 665 explained clearly enough how the recipient of a lockfile was expected to identify, obtain, and make available the required binary artifacts. It’s possible there was a fairly simple answer to that, but it would have reduced the number of applicable use cases still further (that’s just speculation on my part, though, based on the fact that I never got a clear answer on the question of how lockfiles could be used to deploy apps to providers like Heroku).

I maintain that any new lockfile PEP needs to be very careful to pin down exactly what use cases it’s intending to address - and stick very closely to that scope. The term lockfile means many things to many people, and it will be critical to not let the discussion drift into expectations that don’t match the declared scope, otherwise we’ll just have yet another debate that leads nowhere.

Conversely, I think that proposals with a wide scope (and I include anything that supports source distributions here) will have a lot of difficulty, purely because of the realities of some of the legacy edge cases in the packaging ecosystem. For example, any proposal that wants both reproducible installs and source support needs to address the fact that we have no way of enforcing that a project build is reproducible (consider a setuptools-based project that calls random.random() in its setup.py). We can, of course, declare that we won’t support such edge cases - but making such an exclusion workable in a way that’s anything more than “assume it’ll be OK and ignore it” would be very hard. Static sdist metadata might help a little here, but we’re still a long way from a situation where that’s common enough to rely on it.

Precisely. Which is why I believe that “binary artifact only” lockfile proposals are the ones with the most chance to succeed in the near term. The next most likely are proposals that accept some level of weakening of the idea of installs being reproducible. But reproducible builds are a long way away, and there’s a lot of cats to be herded on the way there.

FRidh · July 28, 2022, 1:57pm

Absolutely!

I agree it is most likely to succeed in the near term, and then likely gets adopted widely pretty quickly. However, I think having such a reproducible installations lock file will weaken efforts that try to achieve reproducible builds.

Speaking from a Nix user point of view, I can recommend projects to use poetry with poetry2nix as we can achieve reproducible builds this way. Users of only poetry can’t, simply because there is much more information needed to get there. Realistically speaking, this will never happen with language-specific tooling. Still, projects would often be happy to use poetry and its lockfile support as it provides them the feature they need.

Now, a project might start adopting the new reproducible installations lock file format. The project likely will be happy with it, as again, it provides the feature they need. It will be tough, however, getting such a project to switch or also adopt something that actually allows for reproducible builds. I don’t think many users/developers go so deep into understanding these parts. Therefore, I think having a reproducible installations lock file will actually be detrimental to the efforts for getting reproducible builds more widely adopted.

pf_moore · July 28, 2022, 2:22pm

Maybe. But people are actively putting time and effort into reproducible installs via lockfiles, because they are aware of users for whom that will address a real need. I don’t think it’s reasonable to say that we won’t do that simply because at some point in the future, someone might want to work on making it possible to get reproducible builds. By that logic, we shouldn’t have developed setuptools, because the “run code as part of the build process” mechanism is actively harmful to the idea of reproducible builds.

Frankly, if reproducible builds aren’t attractive to projects on their own merits (rather than as a means to an end that can be achieved in other ways) then I don’t think we should be asking those projects to switch.

johnthagen · July 29, 2022, 12:22am

We use Poetry for cross-platform locking and it’s been a really great improvement over pip-tools which expects you to manually lock different platforms yourself. I’m sure there are probably some edge cases, but we’ve had some complex situations that Poetry handled perfectly (e.g. requiring different versions or packages based on platform and architecture).

The key to its success, I think, is that at install time Poetry has access to both pyproject.toml, where your top level constraints are located, and the lockfile, which locks each version with all wheels/sdist. This allows it to do the right thing by combining both of those pieces of information when running on the current platform.

brettcannon · July 29, 2022, 6:35pm

Or put another way, working towards reproducible builds of wheel files can be done independently of any lock file format that is wheels-oriented. So if someone wants to start a separate discussion focused on reproducible builds of wheel files we can start that conversation.

ofek · July 30, 2022, 12:05am

Sorry, what is the distinction we’re talking about?

FRidh · July 30, 2022, 6:04am

And it’s great to see efforts happening. But reproducible builds are possible already, just not with the Python-only packaging tools. And reproducible builds of not just packages but also environments can already be done with the right kind of lock file. Just not with Python-only tooling.

For many people reproducible builds probably won’t sound so important. Just as packaging once wasn’t important either (just make && make install everything). One cannot expect all developers to fully grasp these concepts either. However, I do think it is the responsibility of those designing standards to consider these aspects considering the impact they have.

To be clear, I wasn’t talking about reproducible builds of Python packages using Python-specific tooling. I am talking here about hindering reproducible builds of environments using external tooling that can already create reproducible builds.

I’m not sure where the disconnect is happening. We have the following concepts:

1) reproducible build of a wheel
2) reproducible installation of an environment
3) reproducible installation of an environment using reproducibly built wheels

The ultimate goal is 3).

Python tooling by itself is not sufficiently strict to yield 1) and hence also not 3), however, with external tooling such as Nix it is possible to achieve 1).

PEP 665 was about 2), the reproducible installation of an environment from specific wheels through the use of a lock file.
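For concept 2), a reproducible installation boils down to checking that the files about to be installed are exactly the ones that were locked. Below is a minimal sketch, assuming the lock data has already been reduced to a plain mapping of filename to SHA-256 hex digest; the mapping shape and function names are hypothetical and not PEP 665's actual schema.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large wheels are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_locked_files(locked: dict, directory: Path) -> list:
    """Return a list of problems; an empty list means every file matches."""
    problems = []
    for filename, expected in sorted(locked.items()):
        path = directory / filename
        if not path.is_file():
            problems.append(f"missing: {filename}")
        elif sha256_of(path) != expected:
            problems.append(f"hash mismatch: {filename}")
    return problems
```

An installer following this sketch would refuse to proceed unless `verify_locked_files` returns an empty list. Note that this says nothing about how the wheels themselves were built, which is concept 1).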

With poetry2nix it is possible to combine Python and Nix to already achieve 3) as well. This is possible because poetry locks on versions, storing all possible artifacts in the lock file. PEP 665 does not do this, and hence, it is not possible to combine say Nix and Python to achieve 3).

I think introducing a PEP 665 like lock file will make it more difficult to get projects to adopt a poetry like file format, as that would then compete with a standardized lock file format, and hence making it harder for our Nix+Python community to maintain having 3).

pf_moore · July 30, 2022, 8:54am

I agree, but probably not in the sense that you hope. My view is that as a community based standardisation process, it is the community members’ responsibility to present examples and arguments that guide the production of standards. And as a volunteer community it is our responsibility to focus on incremental, achievable changes, and not attempt over-ambitious “all or nothing” changes that cannot realistically be achieved with the resources we have.

In this case, that means I’m happy if you want to propose an alternative PEP that covers reproducible builds. But the responsibility would be on you and anyone else in the community who supports your proposal to present the arguments and address the concerns. And I’m raising a concern - your proposal seems too large and currently under-specified to be achievable, compared to a PEP 665 successor that focuses on reproducible installs only. So far your response has been that we should reject other such proposals, to give the larger one time to develop. My reply is that I’d rather see clearly defined benefits now than unclear, possibly no greater, benefits at an unspecified time in the future. I’m happy to continue the discussion, but you’re not going to convince me without something more concrete. And to be explicit, I’m not even sold on the benefits of reproducible builds - I don’t expect to ever audit some source code, do a build, and confirm that it’s bit for bit identical to a reference binary just so I can trust the reference binary. So any argument for a reproducible build based process can’t start from the assumption that everyone agrees that reproducible builds are something that we want. The practical benefits need to be explained, not some abstract “it’s good for you” argument.

So propose a poetry like format, whatever that is (I genuinely don’t know). That sounds like an entirely fair proposal - “here’s a format, here are the benefits, it’s similar to one in real world use so it has a track record”. If the arguments are good enough, it can be accepted. But blocking other proposals because no-one is willing to make the effort to write up this proposal isn’t going to happen. That’s what community based standards mean to me, at least - community members need to actually do the work.

FRidh · July 30, 2022, 12:26pm

I’ve started a new PEP at peps/pep-0697.rst at pep-0697 · FRidh/peps · GitHub. It is largely based on PEP 665 as it is well written and many parts still apply.

In short, the proposal is a version-based and file-based format. Version-based, so it allows for source builds and, using third party tooling, reproducible builds and installations. File-based, so it allows for reproducible installation a la PEP 665 from pre-built installables, primarily when it is not possible to use third-party tooling for reproducible builds.

I hope that as it is it already clarifies a bit more but clearly it’s going to need a lot more work.

CAM-Gerlach · July 30, 2022, 4:01pm

Just a minor procedural sidenote—starting a new pre-PEP on your fork, usually with a new dedicated thread in this Discourse category to discuss and refine it until it is ready for submission as a formal draft PEP, is the way to go. However, do note we cannot necessarily allocate it a particular PEP number until it is at the stage of a formal draft PEP, is submitted as a PR and passes a preliminary round of editorial review, so by convention PEPs before that stage are named pep-9999.rst and renamed once the PR is ready to merge.

EDIT: And just to be clear, we’re definitely not at the formal draft PEP submission stage yet.

pf_moore · July 31, 2022, 2:02pm

I’m trying to write up some thoughts on what might make a good PEP. But I keep hitting some extremely fundamental questions, and I think that we probably need to agree on those first, before we’ll make any progress going into details. So apologies if this seems a bit philosophical or abstract, but I think it’s worth exploring.

What are we even trying to lock?

As I try to describe my concerns, I keep hitting the idea “I’m trying to write a lockfile for…” For what, exactly? Some people talk in terms of locking an application (for deployment, for example). Others talk about locking an environment (to clone development environments, for example). The two ideas are closely linked, but are fundamentally different. And while it’s possible that a lockfile proposal could help in both cases, I think discussions would be enormously improved if we could separate the two ideas. And maybe agree on some terms to keep the distinction clear.

Is a lockfile part of a deployment, or all of it?

People seem to have different ideas about how “standalone” a lockfile is meant to be. The core example here is locking a dependency on a package that’s not publicly available. There’s clearly no way that a lockfile can work unless the recipient can get access to the specified binary. So the question is, are questions about “where do the referenced files come from” part of the standard, or are they out of scope? Any form of direct URL or file path included in a lockfile carries an implication that “where the files come from” is in scope. Even a URL to a public site like PyPI prompts questions about “if I have no internet access, how do I tell the installer to get the artifacts from somewhere else?” Conversely, an unqualified name of a private package prompts the question “what use is the lockfile if it can’t be used without out of band knowledge of how to configure the installer?”

What the heck is a lockfile anyway?

I’m sympathetic to the idea that lockfiles are only part of the solution, and should be viewed in the context of tools that use them. After all, files like pyproject.toml are defined in this way. The difference is that the packaging community is very familiar with the context there (pip as the installer, setuptools as a build backend, PyPI as an index…). But I’m not convinced there’s the same “shared community understanding” of the context for lockfiles. Some people have used pipenv and its lockfiles, some have used poetry, etc. But they don’t seem to match the expectations of reproducibility that lockfile PEPs are converging on (whether it’s build reproducibility or install reproducibility). In particular, the existing lockfiles I’m aware of all support building from source, and happily accept that source builds aren’t strictly reproducible (and in some cases may be very far from consistent, especially across environments). So everyone is confused because the shared understanding from pipenv/poetry etc doesn’t seem to apply. And no-one can relate lockfile PEPs to any real-world context, because they aren’t a complete alternative to what people are used to thinking of as lockfiles.

Where do we go from here?

So where does that leave us? I think that anyone considering writing a lockfile PEP needs to resolve the three questions above, as a minimum. That means getting the community to agree on a shared set of answers. Simply writing a PEP that says “when I say lockfile, this is what I mean” won’t work - that’s essentially what PEP 665 tried to do. It might be possible to invent a whole new set of terminology, and distance the proposal completely from the idea of a “lockfile”, but I fear that would simply confuse things even further.

I’m not planning on writing a lockfile PEP, and I don’t have a pressing need for lockfiles myself, so I’ll leave this here for now. Thanks to anyone who got this far, and I hope it’s useful for something beyond merely getting things off my chest!