PEP 665, take 2 -- A file format to list Python dependencies for reproducibility of an application

Let me approach the topic from another viewpoint: is there a use case for which Poetry lock files are not good enough? If so, what would be missing? And would that use case be something we need to care about in the scope of this PEP?

Not sure if it’s “not good enough”, but Poetry lock files do not support linear installation. Because the lock file records dependency information so that a single file with a single list of packages can cover all platforms, you still have to perform a resolution at install time to figure out what to install (at the package, version, or file level). I think this might be why some people switch off of Poetry: its resolution algorithm takes longer than some are willing to wait in order to support that “universal” lock file.

Could you clarify what “linear installation” means? If Poetry didn’t record all that information, then wouldn’t the lock file be specific to a particular environment (OS, arch, libc version) and set of extras? Is that what we want?

I can add to Brett’s (rather helpful) list of use cases.

Environments that continue to work over time. In the past, installing the scipy stack (numpy, pandas, scipy, jupyter, etc.) involved creating environments with many dependencies. I’ve observed such large environments fail to reproduce (usually from a dependency resolution conflict) even under the following precautions:

  • pinned pip requirements.txt file
  • pinned conda environment.yml file
  • pinned conda spec files
  • same machine w/o updates

Ultimately, I’ve had moderate success with hashed, pip-tools-style requirements.txt files in combination with the techniques above. I’d prefer to see a more standardized solution to this kind of “reproduction” dilemma, i.e. one that guarantees successful re-creation of a pre-existing environment.
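For reference, a pip-tools-style hashed requirements file pins both the version and the artifact hashes, roughly like this (the packages, versions, and hash values below are illustrative placeholders):

    # requirements.txt as generated by `pip-compile --generate-hashes`
    # (hash values truncated/faked for illustration)
    numpy==1.24.2 \
        --hash=sha256:aaaa... \
        --hash=sha256:bbbb...
    pandas==1.5.3 \
        --hash=sha256:cccc...

Once any requirement carries a --hash option, pip switches into hash-checking mode and rejects anything unhashed or mismatched.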

1 Like

Which points to a reasonable trade-off [1]


Yep. I’d be fine with having a format which allows me to turn the knob in one direction or the other, with sufficient UX to give me the feedback I need to know if the package manager isn’t able to do what I want.

E.g., if I want near-perfect reproducibility [2], then I’ll specify wheels and the package manager should refuse to install sdists, warn if it can’t resolve everything through wheels, or just fall back to “close enough” reproducibility if it has to resolve via sdists.
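(For what it’s worth, pip already exposes a knob in this direction today:)

    # refuse to fall back to sdists; fail the install instead
    pip install --only-binary :all: -r requirements.txt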


  1. to me at least ↩

  2. it’s probably impossible to get bit-for-bit checksum equal reproducibility without more infra that’s out of scope here ↩

An installer can look at a list of packages and just install what it sees without any thought of what it should be installing.

Depends on what you expect from a lock file. If you’re okay with both lock file generation and consumption requiring logic to calculate what needs to be installed, then what Poetry does is fine (from what I can tell, though I’m no expert on its capabilities). But if you want installation to be simpler and more reproducible, then you probably don’t want decisions made at install time via a decision tree of what should get installed; instead you want a “dumb” installer that just installs a list of wheels. In that case the lock file would be specific to the wheel tags and markers in effect when the lock file was generated, but I’m personally fine with that as long as you can have multiple instances of it to cover each platform you care about.
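To make the “dumb installer” idea concrete, here’s a minimal sketch; the flat lock format (one wheel path per line, in a hypothetical example.lock) is made up purely for illustration:

    # A "dumb" installer sketch: the lock file is already fully resolved
    # for this platform, so installation is just a loop -- no resolver.
    import subprocess
    import sys
    from pathlib import Path

    for line in Path("example.lock").read_text().splitlines():
        wheel = line.strip()
        if not wheel or wheel.startswith("#"):
            continue
        # --no-deps: trust the lock file's list completely; never pull
        # in anything it didn't name.
        subprocess.run(
            [sys.executable, "-m", "pip", "install", "--no-deps", wheel],
            check=True,
        )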

As Paul has said, simply locking down what you’re after is a massive part of this entire endeavour. :sweat_smile:

1 Like

If you had this many forms of pinned requirements, what exactly went wrong then? This information would be very helpful in determining how to design the spec.

This came up last time - from the consumer side, a key requirement should be that it’s possible to write a tool that reads a lockfile and reports what would be installed, without that tool needing to implement a resolver.

This, effectively.

I’ve said it before, but if this invariant is clarified and enforced (even if only via a standard and not directly in PyPI), I believe it unlocks a huge amount of value and simplification toward a potential cross-platform lockfile.

Relevant bits of zen:

Special cases aren't special enough to break the rules.
Although practicality beats purity.

Yes, I agree actually. My language got awkward, and I’m going to blame my previous commentary about “reproducibility”. Every time “reproducibility” is brought up (by the nix crowd or otherwise), nerd-sniping occurs over how far into the Turing tarpit we should go, whether byte-for-byte reproducibility is practical or possible, etc.

I was mostly responding to that and trying to include that in my MUST/MUST NOT proposal.

I think it was a mistake in hindsight. True “reproducibility” (in a byte-for-byte sense) is not what this PEP should target as a goal or benefit. Having a lockfile will improve the probability of being able to reproduce an environment, but reproduction should not be guaranteed at all.

If I feel motivated again, I might try to spike out a simpler set of MUST / MUST NOT requirements for my lockfile metaphor, where the lockfile is a way to specify a virtual wheelhouse (or PyPI / index allowlist) that can be used to install build and runtime dependencies into an environment, with only loose guarantees that it will succeed.

This is just PEP 643, surely? If the sdist declares the dependencies as not-dynamic, that guarantee is what you get.
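For illustration (names and versions made up), the relevant PKG-INFO fields in such an sdist look like this; under Metadata-Version 2.2 (PEP 643), any field not listed under Dynamic must hold for every wheel built from the sdist, so a locker can trust Requires-Dist without building anything:

    Metadata-Version: 2.2
    Name: example-project
    Version: 1.0.0
    Requires-Python: >=3.8
    Requires-Dist: numpy>=1.21
    Requires-Dist: pandas>=1.3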

Possibly, but the benefit of not allowing it to be dynamic (or at least not varying across platforms) is that lockfiles can be generated on one platform and installed on other platforms. This is the not-technically-correct-but-very-practical simplifying assumption that Poetry presently makes with its lockfile.

This addresses the universal pain felt by engineers who use Windows or macOS and have to commit lockfiles that run on Linux CI or Linux production servers. And here is where I expect folks to start yelling “Docker, Docker” at me. We don’t all want to (or need to) use Docker for something like this.

1 Like

This is my experience as well. Currently, there is no better way to reproduce an environment than caching whatever happened to be installed on your own infra.

Hatchling documents the unchanging value it uses: Build - Hatch

Cool. So my follow-up question is: what’s wrong with that approach? I’m trying to pinpoint what problem we’re attempting to solve here.


Apologies if I’m completely missing the point here. This isn’t a use case I personally have any need for, so I’m somewhat blind when it comes to understanding it.

1 Like

I agree with this. I think the PEP currently has some of this in the “motivation” section but it’s not completely clear (at least to me) why the different motivations outlined there were synthesized into the eventual proposal in this particular way.

In my own experience, when I hear people wanting lockfiles, it is mainly as a tool for deployment and perhaps a bit for reproducibility of a dev environment. People want to set up something like a web app and then say “I know everything works with these versions of these packages, so I want to lock that so I can install exactly these versions again later and thus know my app will work”. And maybe sometimes they want to do something like “I’m developing this, I want to lock my dev environment so I can send it to a coworker working on related stuff, to be sure we’re both working in compatible environments.” (This is in some sense still deployment, just deployment of in-progress rather than finished work.)

From that perspective, it seems to me that simple locking based only on declared package versions would go a long way towards meeting people’s lockfile needs. If you lock your environment and reinstalling the same version of the same Package X later leads to different behavior, then it’s either a bug in Package X or it’s a result of some Python-external difference that’s out of scope.

Cross-platform locking and strict security concerns (like ensuring byte-for-byte identity) are definitely relevant, but it seems like they open up huge new worlds of complexity.

In the end a lot of what people want from lockfiles (again, just as I see it) has to do with the behavior of code — they want some kind of guarantee to the effect of “I can get this code to run correctly again”. That conceptual definition may be difficult to achieve, but the current best proxy we have for “the code works the same” is version numbers.[1]


  1. Well, actually maybe the current best proxy is tests, but those are a lot more work :slight_smile: and could be bootstrapped by version-based lockfiles. ↩

4 Likes

I personally hope we can standardize on a default value of Python’s first public release (I have that Unix timestamp written down somewhere).
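For context, the usual pattern is for a build backend to read SOURCE_DATE_EPOCH and fall back to a fixed default; a rough sketch (the fallback constant below is a placeholder, not the timestamp being proposed):

    import os
    import time

    # Placeholder fallback (1980-01-01, the earliest timestamp a zip entry
    # can hold); the proposed default would instead be a fixed, documented
    # constant such as the timestamp of Python's first public release.
    _FALLBACK_EPOCH = 315532800

    def build_timestamp() -> tuple:
        """Deterministic (Y, M, D, h, m, s) tuple for archive entries."""
        epoch = int(os.environ.get("SOURCE_DATE_EPOCH", _FALLBACK_EPOCH))
        # Clamp: zip timestamps cannot represent dates before 1980.
        return time.gmtime(max(epoch, _FALLBACK_EPOCH))[:6]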

Nothing’s wrong with it, just like anything that functions without a standard isn’t “wrong”. But the fact that everyone is re-implementing the same concept feels wrong.

1 Like

I’ll also toss in that there’s a defense-in-depth issue that’s missing with this approach. If the only protection mechanism you have is “only use our package index and don’t accidentally hit another index like PyPI”, then you’re one configuration mistake away from getting a wheel you didn’t expect or want. But if you at least make hash validation opt-out and bake it into a standard, then you have another layer at installation time to make sure you are only installing what you intend to install.
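As a sketch of what that extra layer costs an installer (assuming a hypothetical lock entry that carries a SHA-256 digest), the check itself is tiny:

    import hashlib
    from pathlib import Path

    def verify_artifact(path: str, locked_sha256: str) -> None:
        # Even if a misconfigured index served the wrong file, this
        # layer catches it at install time.
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        if digest != locked_sha256:
            raise RuntimeError(
                f"{path}: expected sha256 {locked_sha256}, got {digest}"
            )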

Going back to PEP 665, take 2 -- A file format to list Python dependencies for reproducibility of an application - #177 by brettcannon and my motivations for why I made PEP 665 the way I did, the security point also extends to cloud hosting scenarios where installs happen on other machines that might be automating installations. In those scenarios you also want to watch out for configuration mishaps, especially when you might not directly control how the installation tool is invoked (e.g., if the host is running pip on your behalf and does not let you specify --require-hashes). If a tool were following a lock file standard that said, “you must validate hashes (when present)”, then you wouldn’t have to be so concerned about how the tool is executed in order to get the features you want turned on (and yes, you could argue the cloud host should provide the right knobs, but we all know vendors don’t necessarily give you what you want).

4 Likes

So much this. Security should be the default, and layered.

1 Like

A few more things that I discovered while working on repairwheel:

  • Ordering of files in the archive. The wheel spec suggests placing .dist-info at the end, but doesn’t seem to say anything about writing files in any sort of sorted order. I found it convenient to place all non-dist-info files at the front sorted lexicographically, then all dist-info files except for RECORD, and finally RECORD at the very end.
  • Ordering of files in RECORD. This should probably match the order of files in the archive, as described above.
  • Newlines in meta files. Use \n, even when building wheels on Windows.
  • Setting the ZipInfo.create_system value, which differs by default on Windows vs. Unix.
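Pulling those rules together, a deterministic writer might look roughly like this (a sketch, not repairwheel’s actual code; `files` maps archive names to bytes that should already be "\n"-normalized):

    import zipfile

    def write_deterministic_wheel(out_path, files, date_time):
        # Ordering: regular files sorted lexicographically, then
        # .dist-info files, with RECORD last. RECORD itself should list
        # files in this same order.
        def sort_key(name):
            in_dist_info = ".dist-info/" in name
            is_record = in_dist_info and name.endswith("/RECORD")
            return (in_dist_info, is_record, name)

        with zipfile.ZipFile(out_path, "w") as zf:
            for name in sorted(files, key=sort_key):
                zi = zipfile.ZipInfo(name, date_time=date_time)
                zi.create_system = 3            # always "unix", even on Windows
                zi.external_attr = 0o644 << 16  # fixed permission bits
                zi.compress_type = zipfile.ZIP_DEFLATED
                zf.writestr(zi, files[name])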
1 Like

Hatchling does this: hatch/backend/src/hatchling/builders/wheel.py at hatchling-v1.14.0 · pypa/hatch · GitHub

Hatchling does this except for licenses which are shipped unmodified: hatch/backend/src/hatchling/builders/wheel.py at hatchling-v1.14.0 · pypa/hatch · GitHub

I’m hesitant to modify licenses even for something as innocuous as newline normalization, though I am open to changing my mind on that!

Is there a concrete benefit to doing that? In my mind I view reproducibility as depending on a few inputs, one of which is the operating system.

1 Like