I’m trying to write up some thoughts on what might make a good PEP. But I keep hitting some extremely fundamental questions, and I think that we probably need to agree on those first, before we’ll make any progress going into details. So apologies if this seems a bit philosophical or abstract, but I think it’s worth exploring.
What are we even trying to lock?
As I try to describe my concerns, I keep hitting the idea “I’m trying to write a lockfile for…” For what, exactly? Some people talk in terms of locking an application (for deployment, for example). Others talk about locking an environment (to clone development environments, for example). The two ideas are closely linked, but are fundamentally different. And while it’s possible that a lockfile proposal could help in both cases, I think discussions would be enormously improved if we could separate the two ideas, and maybe agree on some terms to keep the distinction clear.
Is a lockfile part of a deployment, or all of it?
People seem to have different ideas about how “standalone” a lockfile is meant to be. The core example here is locking a dependency on a package that’s not publicly available. There’s clearly no way that a lockfile can work unless the recipient can get access to the specified binary. So the question is whether “where do the referenced files come from” is part of the standard, or out of scope. Any form of direct URL or file path included in a lockfile carries an implication that “where the files come from” is in scope. Even a URL to a public site like PyPI prompts the question “if I have no internet access, how do I tell the installer to get the artifacts from somewhere else?” Conversely, an unqualified name of a private package prompts the question “what use is the lockfile if it can’t be used without out-of-band knowledge of how to configure the installer?”
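To make the two cases concrete, here is a rough sketch in Python. It is purely illustrative: `LockedArtifact`, `resolve_location`, the `overrides` mapping, and the example URLs are all hypothetical names I made up, not part of any existing lockfile format or installer. An entry with a URL treats “where the file comes from” as in scope; an entry with only a name pushes that question onto out-of-band installer configuration.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class LockedArtifact:
    """One hypothetical lockfile entry (not taken from any real format)."""
    name: str
    version: str
    sha256: str                 # placeholder hash, for illustration only
    url: Optional[str] = None   # None means the lockfile says nothing about location


def resolve_location(artifact: LockedArtifact, overrides: Mapping[str, str]) -> str:
    """Decide where to fetch an artifact from.

    `overrides` stands in for out-of-band installer configuration
    (for example, a local mirror used when there is no internet access).
    """
    if artifact.name in overrides:
        # Out-of-band configuration wins over whatever the lockfile says.
        return overrides[artifact.name]
    if artifact.url is not None:
        # The lockfile itself carries the location.
        return artifact.url
    # Unqualified name and no configuration: the lockfile alone is not enough.
    raise LookupError(f"no known location for {artifact.name}=={artifact.version}")


public = LockedArtifact(
    "requests", "2.31.0", "0" * 64,
    url="https://files.example.invalid/requests-2.31.0-py3-none-any.whl",
)
private = LockedArtifact("internal-tool", "1.0", "0" * 64)
```

With this sketch, `resolve_location(public, {})` works from the lockfile alone, while `resolve_location(private, {})` raises `LookupError` until someone supplies an override. Which of those behaviours a standard should specify is exactly the scope question above.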
What the heck is a lockfile anyway?
I’m sympathetic to the idea that lockfiles are only part of the solution, and should be viewed in the context of tools that use them. After all, files like pyproject.toml are defined in this way. The difference is that the packaging community is very familiar with the context there (pip as the installer, setuptools as a build backend, PyPI as an index…). But I’m not convinced there’s the same “shared community understanding” of the context for lockfiles. Some people have used pipenv and its lockfiles, some have used poetry, etc. But those lockfiles don’t seem to match the expectations of reproducibility that lockfile PEPs are converging on (whether it’s build reproducibility or install reproducibility). In particular, the existing lockfiles I’m aware of all support building from source, and happily accept that source builds aren’t strictly reproducible (and in some cases may be very far from consistent, especially across environments). So everyone is confused because the shared understanding from pipenv/poetry etc. doesn’t seem to apply. And no one can relate lockfile PEPs to any real-world context, because they aren’t a complete alternative to what people are used to thinking of as lockfiles.
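As a small illustration of the source-build point, here is a sketch that splits a list of locked artifact filenames into wheels and everything else. It assumes only that wheels use the `.whl` extension and that anything else (such as a `.tar.gz` sdist) would need a build step at install time; the function names and sample filenames are hypothetical.

```python
from typing import Iterable, List


def is_wheel(filename: str) -> bool:
    """True if the filename looks like a prebuilt wheel."""
    return filename.endswith(".whl")


def needs_source_build(filenames: Iterable[str]) -> List[str]:
    """Return the artifacts that are not wheels and so would need building."""
    return [name for name in filenames if not is_wheel(name)]


locked = [
    "alpha-1.0-py3-none-any.whl",   # hypothetical wheel
    "beta-2.0.tar.gz",              # hypothetical sdist
]
to_build = needs_source_build(locked)
```

Here `to_build` contains only `beta-2.0.tar.gz`. A lockfile that allows such entries is accepting that the installed result may vary between environments, which is where it parts ways with the stricter reproducibility expectations.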
Where do we go from here?
So where does that leave us? I think that anyone considering writing a lockfile PEP needs to resolve the three questions above, as a minimum. That means getting the community to agree on a shared set of answers. Simply writing a PEP that says “when I say lockfile, this is what I mean” won’t work - that’s essentially what PEP 665 tried to do. It might be possible to invent a whole new set of terminology, and distance the proposal completely from the idea of a “lockfile”, but I fear that would simply confuse things even further.
I’m not planning on writing a lockfile PEP, and I don’t have a pressing need for lockfiles myself, so I’ll leave this here for now. Thanks to anyone who got this far, and I hope it’s useful for something beyond merely getting things off my chest!