I won’t propose anything that doesn’t support a lock format from which you can get a flat list of files to install once you know the set of files applies to your environment.
I believe this is what PDM does for their lock files.
To me, “partial lock file” means “at worst you have to evaluate markers (and maybe wheel tags) to decide what to install”, and thus a resolver is still not necessary.
The failure case is how I’m choosing to interpret it.
Should there be consideration for pyproject.lock?
This was brought up as part of PEP 665 and decided against (remember, I’ve already done this whole discussion once before). Basically, not using .toml as the file extension causes the usual tooling headaches of things not knowing how to interpret the file. I also don’t know if PDM or Poetry expect their .lock files to be manually inspected and thus care about the file extension empowering that use case.
I’m still unsure how lockers are supposed to generate the markers and tags for a given lock entry
I’ll talk about that below, but it depends on how much effort you want to put into making a lock as broadly applicable as possible while still being accurate.
nor how installers are supposed to choose the “right” entry
I’ll also talk about this below, but it depends on whether we would ever allow multiple locks to apply (i.e. whether we require that they be mutually exclusive).
That also means we need to explicitly document how to decide the degree of speciality of a particular tag and marker.
If we allow overlap then I think you’re right.
I like the idea of not needing a resolver at install time. It makes installing from a lockfile easier to reason about, easier to audit, and just fundamentally simpler. But it does imply that the locker needs to be a lot more intelligent about working out “what platforms does this solve apply to” if we’re to avoid overspecified targets. And as far as I know, no-one’s even looked at that problem yet.
I’ve been thinking about it and I’ll talk about it below.
The reason that the current proposal doesn’t record the dependencies themselves is surely that it has absolutely no need of them.
Correct. Listing dependencies per distribution is just to help infer why some distribution was included.
there don’t seem to be any existing tools that try to do this kind of lock.
I would argue pip-tools does, but within the confines of requirement files and the info they are given by pip’s resolver.
whether or not a resolving installer is needed for lockfiles, it will still be needed in general to do “regular” (i.e., not locked) installs. So it seems like not requiring a resolver for lockfile installation means everyone will still be using a resolving installer, but they will have the additional option of keeping an extra, weaker (i.e., non-resolving) installer around just to install lockfiles. That doesn’t seem like a real huge benefit to me.
Just because no one has put in the effort doesn’t mean people wouldn’t use something. I very much expect my work to use such an installer (otherwise I wouldn’t be doing this). I also suspect some people see a file w/ a .lock extension and assume e.g. Poetry has a stricter lock than it does.
A resolving installer would behave differently (this is the Poetry/PDM use case):
- identify all defined target environments that match the current environment
I don’t think we would need to go that far, but I’ll go into more detail below.
While not technically trivial, I think it would be pretty straightforward to compute the most restrictive tags and the union of all markers for all dependency specifiers of all files in a lock entry. This would allow all environments matching those specific constraints to install from that lock entry. I think Brett alluded to this before.
I agree; I don’t think propagating e.g. markers through the resulting dependency graph to know what is important in a lock is insurmountable (the PoC I did for Charlie already does this).
When I’m auditing, I don’t want to open the file to scroll up 2000+ lines to see if a file is under [[common]] or [[conditional]]. Perhaps instead, you have a [[files]] array and put target-names in each file, and either leave targets unspecified or set it to a sentinel (e.g. "*") for the formerly common files.
I already had it in my head that way.
I wonder how much of this current topic is about being able to reverse engineer the original input to the resolver? I don’t see any problem with a lock file representing “the specific files implied by these versions at this time for this platform,” in which case none of the markers or dependency information is really needed.
At least my proposal isn’t. Originally I just recorded all of the inputs. I suspect if I come up w/ an alternative it would still capture what’s pertinent.
Lock file situations
Here’s a thought - if we had a corpus of “typical lock specifications”, that might help us understand better whether the relatively complex cases we’re worrying about here actually happen in practice.
Over the weekend I thought about this a bit and the sort of scenarios one could find themselves in when trying to create a lock file.
Flat list of files
This is the “boring” case where there are no environment markers for a dependency, everything has a wheel, and you only care about one OS, CPU, and Python version. That’s just a list of files where you need to make sure the wheel files apply to your environment, and maybe check requires-python across the whole lock if you happen to only have pure Python wheels.
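As a rough sketch of that check (not taken from any existing installer), the snippet below expands the compressed tag set in each wheel filename and confirms that every file in a flat lock matches at least one tag the environment supports. The `supported` set and the helper names are made up for illustration; a real installer would get its supported tags from something like `packaging.tags.sys_tags()`.

```python
def wheel_tags(filename: str) -> set[tuple[str, str, str]]:
    """Expand a wheel filename's compressed tags into (python, abi, platform) triples."""
    stem = filename.removesuffix(".whl")
    # The last three dash-separated fields are the python, abi, and platform tags.
    py, abi, plat = stem.split("-")[-3:]
    return {
        (p, a, pl)
        for p in py.split(".")
        for a in abi.split(".")
        for pl in plat.split(".")
    }


def lock_applies(files: list[str], supported: set[tuple[str, str, str]]) -> bool:
    """A flat lock applies only if every wheel matches at least one supported tag."""
    return all(wheel_tags(f) & supported for f in files)


# Hypothetical, heavily abbreviated tag set for CPython 3.12 on x86-64 Linux.
supported = {
    ("cp312", "cp312", "manylinux_2_17_x86_64"),
    ("py3", "none", "any"),
}

lock = [
    "rich-13.7.1-py3-none-any.whl",
    "typing_extensions-4.10.0-py3-none-any.whl",
]

print(lock_applies(lock, supported))
```

If the check passes, installing is just a loop over `lock`; there is nothing left to decide.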
Marker on a dependency
An easy example of this is Rich, because it has a conditional dependency on typing-extensions; python_version<"3.9".
The way Poetry handles this is it captures the requirements of Rich directly so they can resolve the requirement.
Poetry's lock on Rich
[[package]]
name = "rich"
version = "13.7.1"
description = "Render rich text, tables, progress bars, syntax highlighting, markdown and more to the terminal"
optional = false
python-versions = ">=3.7.0"
files = [
{file = "rich-13.7.1-py3-none-any.whl", hash = "sha256:4edbae314f59eb482f54e9e30bf00d33350aaa94f4bfcd4e9e3110e64d0d7222"},
{file = "rich-13.7.1.tar.gz", hash = "sha256:9be308cb1fe2f1f57d67ce99e95af38a1e2bc71ad9813b0e247cf7ffbcc3a432"},
]
[package.dependencies]
markdown-it-py = ">=2.2.0"
pygments = ">=2.13.0,<3.0.0"
typing-extensions = {version = ">=4.0.0,<5.0", markers = "python_version < \"3.9\""}
[package.extras]
jupyter = ["ipywidgets (>=7.5.1,<9)"]
PDM, though, records the marker both on the requirement from Rich and on typing-extensions itself.
PDM for Rich and typing-extensions
[[package]]
name = "rich"
version = "13.7.1"
requires_python = ">=3.7.0"
summary = "Render rich text, tables, progress bars, syntax highlighting, markdown and more to the terminal"
groups = ["default"]
dependencies = [
"markdown-it-py>=2.2.0",
"pygments<3.0.0,>=2.13.0",
"typing-extensions<5.0,>=4.0.0; python_version < \"3.9\"",
]
files = [
{file = "rich-13.7.1-py3-none-any.whl", hash = "sha256:4edbae314f59eb482f54e9e30bf00d33350aaa94f4bfcd4e9e3110e64d0d7222"},
{file = "rich-13.7.1.tar.gz", hash = "sha256:9be308cb1fe2f1f57d67ce99e95af38a1e2bc71ad9813b0e247cf7ffbcc3a432"},
]
[[package]]
name = "typing-extensions"
version = "4.10.0"
requires_python = ">=3.8"
summary = "Backported and Experimental Type Hints for Python 3.8+"
groups = ["default"]
marker = "python_version < \"3.9\""
files = [
{file = "typing_extensions-4.10.0-py3-none-any.whl", hash = "sha256:69b1a937c3a517342112fb4c6df7e72fc39a38e7891a5730ed4985b5214b5475"},
{file = "typing_extensions-4.10.0.tar.gz", hash = "sha256:b0abd7c89e8fb96f98db18d86106ff1d90ab692004eb746cf6eda2682f91b3cb"},
]
In this situation there’s an optional distribution to install, but it’s a yes/no thing. So if you locked this to requires-python as >=3.8 (due to the minimum Python versions of some dependencies), you could get a “partial” lock where you scan all the files and decide whether to include a distro based on the environment marker(s) that influenced the inclusion of that distro (i.e. if a distro was conditional then its dependencies should be conditional as well, unless some unconditional requirement pulled them in). And since markers support “and” and “or”, you should be able to compose a marker that accurately captures the environment requirements which led to a distro being needed. And if you propagate those markers through the dependency graph I think you can still end up w/ a flat list of files to consider.
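To make the propagation idea concrete, here is a minimal sketch (this is not the PoC code). It assumes the dependency graph has no cycles and treats each marker as an opaque string, combining them with “and” along a dependency path and “or” across paths. The `my-app` and `legacy-shim` distributions and the function names are hypothetical.

```python
from graphlib import TopologicalSorter
from itertools import product

# A condition is in disjunctive normal form: a set of conjunctions, each a
# frozenset of marker strings. {frozenset()} means "always install".
ALWAYS = frozenset({frozenset()})


def simplify(cond):
    """Drop any conjunction that is a strict superset of another (absorption)."""
    return frozenset(c for c in cond if not any(other < c for other in cond))


def propagate(roots, edges):
    """Map each distribution to the condition under which it is needed.

    `edges` maps a distribution to (dependency, marker-or-None) pairs.
    """
    preds = {root: set() for root in roots}
    for parent, deps in edges.items():
        preds.setdefault(parent, set())
        for child, _ in deps:
            preds.setdefault(child, set()).add(parent)
    conds = {}
    # static_order() yields each node only after all of its predecessors.
    for node in TopologicalSorter(preds).static_order():
        cond = set(ALWAYS) if node in roots else set()
        for parent in preds[node]:
            for child, marker in edges[parent]:
                if child != node:
                    continue
                edge = {frozenset([marker])} if marker else ALWAYS
                # AND the parent's condition with the edge marker, then OR it in.
                cond |= {a | b for a, b in product(conds[parent], edge)}
        conds[node] = simplify(cond)
    return conds


def render(cond):
    """Render a condition as a marker string ('' means unconditional)."""
    if frozenset() in cond:
        return ""
    return " or ".join(
        "(" + " and ".join(sorted(c)) + ")" for c in sorted(cond, key=sorted)
    )


edges = {
    "my-app": [("rich", None), ("legacy-shim", 'os_name == "nt"')],
    "rich": [("pygments", None), ("typing-extensions", 'python_version < "3.9"')],
    "legacy-shim": [("typing-extensions", None)],
}
conds = propagate({"my-app"}, edges)
for name in sorted(conds):
    print(name, "->", render(conds[name]) or "(always)")
```

Each distribution ends up w/ a single marker, so the lock can stay a flat list of files where the installer only evaluates one marker per entry.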
Multiple options work
This comes up w/ any distribution which has binary wheels and a pure Python fallback. An example of this is debugpy. If this was your sole dependency, what wheel do you choose? (I think this is @charliermarsh’s question about what an installer is to do in the face of ambiguity.) Installers currently have a hierarchy of wheel tags for a platform, so this single case is easy to resolve: choose the “best” wheel.
But what happens when you scale this up to multiple distributions which might not all cleanly have binary wheels for your platform? The PoC I did for @charliermarsh calculated an overall weight/preference average for all of the wheel files and chose the one w/ the “best” weight. Unfortunately there isn’t a clean way via markers to say what e.g. glibc version you are on, and thus that this pure Python lock shouldn’t apply if you can match this wheel tag, although we could introduce such a wheel tag block list to prevent ambiguity.
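One way to read the weighting idea, as a sketch only (it is an assumption about the approach, not the PoC itself): rank each wheel by the position of its best matching tag in the environment’s preference-ordered tag list, average the ranks per candidate set of files, and pick the lowest average. The filenames, the `supported` list, and the function names are made up for illustration.

```python
from statistics import mean


def best_rank(filename, supported):
    """Index of the most-preferred supported tag the wheel matches, or None."""
    py, abi, plat = filename.removesuffix(".whl").split("-")[-3:]
    tags = {
        (p, a, pl)
        for p in py.split(".")
        for a in abi.split(".")
        for pl in plat.split(".")
    }
    ranks = [i for i, tag in enumerate(supported) if tag in tags]
    return min(ranks, default=None)


def choose_lock(candidates, supported):
    """Pick the candidate whose wheels have the lowest (best) average rank.

    Candidates containing a wheel that matches no supported tag are skipped.
    """
    scored = []
    for name, files in candidates.items():
        ranks = [best_rank(f, supported) for f in files]
        if None not in ranks:
            scored.append((mean(ranks), name))
    return min(scored)[1] if scored else None


# Hypothetical tag list, most preferred first.
supported = [
    ("cp312", "cp312", "manylinux_2_17_x86_64"),
    ("py3", "none", "any"),
]

candidates = {
    "binary": [
        "fastlib-1.0-cp312-cp312-manylinux_2_17_x86_64.whl",
        "purelib-2.0-py3-none-any.whl",
    ],
    "pure": [
        "fastlib-1.0-py3-none-any.whl",
        "purelib-2.0-py3-none-any.whl",
    ],
}

print(choose_lock(candidates, supported))
```

Note that both candidates are installable here, which is exactly the ambiguity: nothing in the “pure” candidate says it should lose when the binary wheel tag can be matched.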
Conflicts across environments
Think:
packaging>=23.0; os_name=="nt"
packaging<=21.0; os_name=="posix"
This is what Poetry can handle, and it requires a resolver unless the possibilities are few enough that it isn’t a big deal to just generate all lock file possibilities eagerly (i.e. the concern over a combinatorial explosion of environment locks). PDM didn’t handle this and just locked to packaging 21.0 on my machine even when I asked for a cross-platform lock file.
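For a feel of what “eagerly” would involve, here is a small sketch under simplifying assumptions: markers are equality checks only, and the locker enumerates every combination of the marker values the requirements mention, then resolves each combination separately. The data layout and function names are hypothetical.

```python
from itertools import product

# (requirement, marker conditions) pairs; equality-only markers for simplicity.
requirements = [
    ("packaging>=23.0", {"os_name": "nt"}),
    ("packaging<=21.0", {"os_name": "posix"}),
]


def environments(requirements):
    """Yield every combination of the marker values the requirements mention."""
    values = {}
    for _, conds in requirements:
        for key, value in conds.items():
            values.setdefault(key, set()).add(value)
    keys = sorted(values)
    for combo in product(*(sorted(values[k]) for k in keys)):
        yield dict(zip(keys, combo))


def applicable(requirements, env):
    """Requirements whose marker conditions all hold in `env`."""
    return [
        req
        for req, conds in requirements
        if all(env.get(k) == v for k, v in conds.items())
    ]


# One set of inputs per environment; each would be resolved and locked separately.
locks = {
    tuple(env.items()): applicable(requirements, env)
    for env in environments(requirements)
}
for env, reqs in locks.items():
    print(env, reqs)
```

Two values of one marker give two locks, but every additional marker multiplies the count by its number of values, which is where the combinatorial explosion concern comes from.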
The range of specificity
Here is the range of environment locking specificity from most to least specific:
- If you match the wheel tags, Python version requirements, and environment markers, just install the list of files.
- If you match the wheel tags, Python version requirements, and environment markers, go through the list of files and evaluate the markers per-file/distribution to decide whether to install it (the Rich case); this is still a linear install (i.e. no resolver)
- If you match the wheel tags, Python version requirements, and environment markers, go through the list of distributions and choose the best file to install (the debugpy case)
- Do a full resolve based on what’s in the file
Now, the first three scenarios all depend on whether lockers can generate appropriate checks and how specific we are okay w/ them being. This could range from being extra specific (e.g. <marker> == <value> for all environment markers) to putting in some effort to try and minimize the requirements (e.g. only list environment markers used in the resolution). @radoering has a gut feeling that just environment markers
I think it really comes down to how much flexibility we want to allow so that lockers even have a chance to generate a lock that is as broad as possible while still being accurate. Going back to the Rich example, how would you expect a locker to handle the change in requirements that happens at Python 3.9 and newer?
- Have two environment lock specifications/headers that differ by Python version, marking the similar files as applying to both environment locks
- Have one environment lock specification overall and have the files/distributions be evaluated to decide whether they apply in the Python<=3.8 case
I’m not sure which opens more possibilities, which helps w/ auditing more, or which is demonstrably easier to implement.
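To compare the two options side by side, here is a sketch w/ both layouts expressed as plain Python data. Neither layout is a proposed schema; the keys and function names are hypothetical, only Rich and typing-extensions are listed to keep it short, and the marker evaluator handles nothing beyond `python_version <op> "X.Y"`.

```python
import operator
import re

OPS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
}


def py_marker_matches(marker, python_version):
    """Evaluate a marker of the form `python_version <op> "X.Y"` (and nothing else)."""
    match = re.fullmatch(
        r'python_version\s*(<=|>=|==|<|>)\s*"(\d+(?:\.\d+)*)"', marker
    )
    if match is None:
        raise ValueError(f"unsupported marker: {marker!r}")
    op, version = match.groups()

    def as_tuple(v):
        # Compare numerically so that "3.10" sorts after "3.9".
        return tuple(int(part) for part in v.split("."))

    return OPS[op](as_tuple(python_version), as_tuple(version))


RICH = "rich-13.7.1-py3-none-any.whl"
TYPING_EXT = "typing_extensions-4.10.0-py3-none-any.whl"

# Option 1: two environment locks, w/ the shared file listed under both.
two_locks = [
    {"marker": 'python_version < "3.9"', "files": [RICH, TYPING_EXT]},
    {"marker": 'python_version >= "3.9"', "files": [RICH]},
]

# Option 2: one environment lock, w/ a marker evaluated per file.
one_lock = [
    {"file": RICH},
    {"file": TYPING_EXT, "marker": 'python_version < "3.9"'},
]


def files_option_1(python_version):
    """Return the files of the first environment lock that matches."""
    for entry in two_locks:
        if py_marker_matches(entry["marker"], python_version):
            return entry["files"]
    return []


def files_option_2(python_version):
    """Return the files whose marker is absent or matches."""
    return [
        f["file"]
        for f in one_lock
        if "marker" not in f or py_marker_matches(f["marker"], python_version)
    ]


for version in ("3.8", "3.12"):
    print(version, files_option_1(version), files_option_2(version))
```

Both produce the same install list; the difference is whether the branching lives in the lock’s header (more duplication, but each lock is a plain list) or on each file (less duplication, but you have to read every entry to know what gets installed).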