This was one of the motivations for the creation timestamp, but people very much didn’t like having such a value in the file.
OK, here’s where my mind is at.
Other lock files
I want to outline what Poetry and PDM have in their lock files to point out the differences and similarities (I’m skipping pip-tools since the list is short: name, file hashes, and dependents). That way we can perhaps agree on something that either meets multiple scenarios upfront or at least is flexible enough to grow in the future.
Formats
Everything in bold is unique to a format.
Poetry
/cc @radoering to check my work.
- Metadata
  - Version
  - **Python version support**
  - Content hash
- Package
  - Name
  - Version
  - Description
  - **Optional**
  - Python version support
  - Files
    - File name
    - hash
  - Dependencies
  - **Extras**
PDM
/cc @frostming to check my work.
- Metadata
  - **Groups**
  - **Strategy**
  - Version
  - Content hash
- Package
  - Name
  - Version
  - Python version support
  - Description
  - **Groups**
  - Dependencies
  - **Marker**
  - Files
    - File name
    - hash
Commonality
- Metadata
  - Version
  - Content hash
- Package
  - Name
  - Version
  - Python version support
  - Description
  - Dependencies
  - Files
    - File name
    - hash
Questions/Observations
People didn’t like having a hash for the lock file itself, but both formats include one (the content hash).
There is only a single hash value per file. Indexes can actually provide an arbitrary set of hashes. Are people okay w/ just one hash value, or would they rather capture all possible hash values (i.e. do lockers get to dictate the used hash, or do installers choose from the options presented to them)?
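To make the "installers choose from the options presented to them" half of that question concrete, here is a minimal sketch of an installer picking its preferred algorithm from a set of recorded hashes. The function name `verify_file`, the preference order, and the shape of `recorded_hashes` are all assumptions for illustration; neither Poetry nor PDM is claimed to work this way.

```python
import hashlib

# Installer-side preference order, strongest first (an assumption for this sketch).
PREFERRED_ALGORITHMS = ("sha512", "sha384", "sha256")


def verify_file(data, recorded_hashes):
    """Check `data` against the most preferred recorded hash.

    `recorded_hashes` maps an algorithm name to a hex digest, e.g. what a
    locker might copy from an index. Returns the algorithm used, or raises
    ValueError on a mismatch or when no supported hash was recorded.
    """
    for algorithm in PREFERRED_ALGORITHMS:
        expected = recorded_hashes.get(algorithm)
        if expected is None:
            continue
        actual = hashlib.new(algorithm, data).hexdigest()
        if actual != expected:
            raise ValueError(f"{algorithm} mismatch")
        return algorithm
    raise ValueError("no supported hash recorded for this file")


payload = b"example wheel contents"
recorded = {
    "sha256": hashlib.sha256(payload).hexdigest(),
    "sha512": hashlib.sha512(payload).hexdigest(),
}
chosen = verify_file(payload, recorded)
```

If the locker records only one hash, the same function still works; the installer just has no choice to make.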
Poetry gathers the maximal Python version support in a single place in its metadata.
PDM records, on the package itself, the marker expression which must be true for that package to be installed, and it also records the raw dependency specifiers on the package(s). I’m assuming the former is to allow for linear reading of the lock file for installation while the latter is for documentation purposes.
PDM seems to only specify a single version of a package when locking, but that isn’t inherent to the format.
Why the inclusion of a project’s description? To help remember what something is for?
Both use the name “package” and not “distribution” or something else more generic.
Both seem to only extract dependencies from a single file and apply them to the overall package version. That’s technically incorrect since every file can have its own unique metadata (not only between an sdist and a wheel, but between wheels as well). In practice I don’t think most projects tweak per-file and simply rely on markers in order to have a single set of dependencies. I personally think that’s fine, but following this practice would enshrine the expectation in a PEP.
Potential scenarios to support
Environment lock
This lists the exact files to install if an environment meets a specific set of conditions. The conditions would probably be a marker expression (support for `and` and `or` along w/ a way to invert any operator makes composition possible) and a set of wheel tags.
This is what I initially proposed and covers the case where your environments are finite and known ahead of time (i.e. dev and prod where you’re using a specific Python version).
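As a rough sketch of how an installer might pick an environment lock, assume each lock carries a marker built from `and`, `or`, and `not` plus a list of wheel tags. The tuple encoding of markers, the `ENV_LOCKS` data, the `spam` project, and the rule that every listed tag must be supported are all assumptions for illustration, not a proposed format.

```python
def marker_matches(marker, env):
    """Evaluate a marker encoded as a nested tuple against an environment dict.

    Supported shapes (hypothetical encoding):
      ("eq", key, value), ("not", m), ("and", m1, m2, ...), ("or", m1, m2, ...)
    """
    op = marker[0]
    if op == "eq":
        _, key, value = marker
        return env.get(key) == value
    if op == "not":
        return not marker_matches(marker[1], env)
    if op == "and":
        return all(marker_matches(m, env) for m in marker[1:])
    if op == "or":
        return any(marker_matches(m, env) for m in marker[1:])
    raise ValueError(f"unknown operator: {op!r}")


def select_env_lock(env_locks, env, supported_tags):
    """Return the first lock whose marker holds and whose tags are all supported."""
    supported = set(supported_tags)
    for lock in env_locks:
        if marker_matches(lock["marker"], env) and set(lock["tags"]) <= supported:
            return lock
    return None


ENV_LOCKS = [
    {
        "name": "prod",
        "marker": ("and", ("eq", "python_version", "3.12"),
                   ("eq", "sys_platform", "linux")),
        "tags": ["cp312-cp312-manylinux_2_17_x86_64"],
        "files": ["spam-1.0-cp312-cp312-manylinux_2_17_x86_64.whl"],
    },
    {
        "name": "dev",
        "marker": ("and", ("eq", "python_version", "3.12"),
                   ("not", ("eq", "sys_platform", "linux"))),
        "tags": ["py3-none-any"],
        "files": ["spam-1.0-py3-none-any.whl"],
    },
]

linux_env = {"python_version": "3.12", "sys_platform": "linux"}
linux_tags = ["cp312-cp312-manylinux_2_17_x86_64", "py3-none-any"]
chosen_lock = select_env_lock(ENV_LOCKS, linux_env, linux_tags)
```

No resolver is involved: once a lock matches, its file list is installed as-is.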
Package (version) constraints
You list a version(s) of a package and an installer decides whether to install each package version independently. Whether a package version gets installed depends on the marker expression on the package: if the marker is met the package is installed, else it’s skipped. The absence of a marker means it’s required. You would list all files available for that package version and the installer chooses which file to install.
If you restrict this to a single version of a package I believe this is what PDM does (@frostming is that right?). This would help cover the case where the concern over combinatorial explosion comes in, along w/ the “we don’t restrict OS or Python version” case that @charliermarsh brought up from a previous job. I would also hope this covers the open source case of “we want to all agree on what to install to build our docs” that @rgommers brought up.
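The linear read described above could look something like the following sketch, restricted to one version per package. Markers are simplified here to a dict of required environment values, and the `PACKAGES` data, the `spam` and `eggs` projects, and the `plan_install` name are all hypothetical; this is not PDM's actual format or algorithm.

```python
def plan_install(packages, env, supported_tags):
    """Return {package name: chosen file name} by reading the lock top to bottom.

    `supported_tags` is ordered most-preferred first.
    """
    plan = {}
    for package in packages:
        marker = package.get("marker")
        # No marker means required; otherwise every key in the marker must match.
        if marker is not None and any(env.get(k) != v for k, v in marker.items()):
            continue
        by_tag = {f["tag"]: f["name"] for f in package["files"]}
        for tag in supported_tags:
            if tag in by_tag:
                plan[package["name"]] = by_tag[tag]
                break
        else:
            raise LookupError(f"no compatible file for {package['name']}")
    return plan


PACKAGES = [
    {
        "name": "spam",
        "version": "1.0",
        "files": [
            {"name": "spam-1.0-py3-none-any.whl", "tag": "py3-none-any"},
            {"name": "spam-1.0-cp312-cp312-win_amd64.whl",
             "tag": "cp312-cp312-win_amd64"},
        ],
    },
    {
        "name": "eggs",
        "version": "2.1",
        "marker": {"sys_platform": "win32"},  # only installed on Windows
        "files": [{"name": "eggs-2.1-py3-none-any.whl", "tag": "py3-none-any"}],
    },
]

linux_plan = plan_install(
    PACKAGES,
    {"sys_platform": "linux"},
    ["cp312-cp312-manylinux_2_17_x86_64", "py3-none-any"],
)
```

Each package is decided on its own, so the installer never has to backtrack.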
Shrunken worldview
All package details are included in the file, which acts like a restricted view of the world. This would require a resolver to figure out what to install.
This is what Poetry does.
What I’m currently thinking
I obviously want to support the environment lock scenario. I think the package constraints scenario could also be supported as long as we come up w/ a nice way to signal the file is to be interpreted in such a fashion (probably via a “strategy” or “scope” key at the top-level of the file or by the existence of either a `[[env-lock]]` or `[constraints]` table, but not both).
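For the table-existence option, an installer could decide how to interpret the file with a check along these lines. This is only a sketch: it assumes the lock file has already been parsed into a dict, and `lock_interpretation` is a hypothetical helper name.

```python
def lock_interpretation(lock):
    """Return which scenario a parsed lock file represents.

    Exactly one of the `env-lock` or `constraints` tables must be present.
    """
    has_env_lock = "env-lock" in lock
    has_constraints = "constraints" in lock
    if has_env_lock == has_constraints:
        # Either both tables are present or neither is; both are errors.
        raise ValueError(
            "lock file must contain exactly one of 'env-lock' or 'constraints'"
        )
    return "env-lock" if has_env_lock else "constraints"


mode = lock_interpretation({"env-lock": [{"tags": ["py3-none-any"]}]})
```

An explicit “strategy” or “scope” key would replace the presence check with a lookup, at the cost of having to keep the key and the tables in sync.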
I think expanding out to the shrunken worldview is a future PEP thing (if people ultimately want it), but making sure no decision is made that would prevent it would be good. And w/ @radoering having suggested that marker-only inclusion/exclusion decisions might be enough for Poetry someday, I don’t think leaving out this scenario automatically excludes Poetry.
If this makes sense to people then I think we would be after something like what PDM provides plus some stuff for the environment lock scenario. This is what I’m currently thinking we should aim for (unless you all come forward and say I’m misreading what you all are after).
My free time is about to evaporate
In case people have missed my public statements on this, my wife is pregnant and due later this month, which means my free time will suddenly disappear anytime between now and the end of the month. As such, I can’t make any promises on getting a PEP written before June or on participating in conversations going forward. But assuming someone doesn’t write a PEP in my absence, hopefully we can agree on the scope of a future PEP now and I can plan to write one once I’m back at work in June.