On this topic, the Motivation section of the PEP currently includes Dependabot as an example tool that might benefit from a lockfile standard.
However, the full Dependabot functionality (as opposed to only security alerts about vulnerable packages) will require it to be able to update the lockfile, rather than just read it. If package managers have any tool-specific config/state stored in pyproject.toml / elsewhere, that will presumably get out of sync with the lockfile for anything other than simplistic lockfile changes. And in fact, it seems the Dependabot use case would still need to perform full package resolution, given that the new version of a package could change the dependency graph? And as such, Dependabot will still probably need to support/run all of the individual package managers anyway?
Are cross-tool lockfile updating use cases ever going to be viable? If not, should the PEP motivation explicitly state that the lockfile is primarily aimed at read-only use cases (such as package installation or SBOM generation), and drop the mention of Dependabot?
What would you do if there was a [tool.xxx] section in the lockfile? If you error, then that effectively says that anything containing a tool section is non-portable. If you ignore it, that says that data in [tool.xxx] must not result in different files being installed. Either choice is valid, but they have different implications (and I think that the PEP should probably clarify what the intended behaviour is, even if it's only as a SHOULD rather than a MUST).
I think that limited support (in the form of security alerts only) could still be useful, but I'm not a heavy user of Dependabot so that's not an informed opinion. Certainly, if cross-tool updating isn't viable, that should be noted as a limitation for Dependabot. See below, though.
I would absolutely hope that if we have a standard lockfile format, and a given lockfile includes no tool-specific data (i.e., no [tool.xxx] sections), then any tool would be able to use it, both for installing from and for updating. If some capabilities are optional (for example, multi-environment locking), then tools that don't support that capability could reject the lockfile as unsupported - but as I said previously, I think we should be very cautious about allowing optional capabilities if we want to claim this standard helps interoperability.
Sorry, this is a terminology issue. By "environment" I was talking about the concept of Hatch environments, not cross-platform environments, because I was responding to what Charlie was talking about regarding workspaces. So, as an example, if there is an environment named foo with dependencies bar and workspace members ./w1 and ./w2, then it would have its own cross-platform (or whatever we're calling it now) lock file even if another environment defined the exact same dependencies.
Does that make sense?
I don't think this will be useful to Hatch, as I just mentioned, but it's possible I don't understand the current discussion.
My assumption is that any tool is able to consume the standard file. In the case of Hatch, I don't actually have an immediate implementation in mind for the near future and am going to continue passing stuff to dependent installers like pip and uv.
This makes sense. The question we were talking about, though, is whether ./w1 and ./w2 both get their own lockfiles, and/or whether users are "allowed" to sync their environment to just ./w1 and its dependencies, or only "allowed" to sync the entire workspace at once.
Concretely, consider a Pants user (where Pants uses Pex lockfiles) wanting to deploy an AWS Lambda function.
The user background:
The user has a monorepo using a single (Pex) lock. That single lock covers many binaries, libraries, tests & tools. Amongst all this code are a few cloud functions. In particular, the user is focused on deploying one of these functions to AWS Lambda. This lambda function uses a small subset of the lock. Concretely, let's say the full lock for the repo is generated from input requirements ["foo", "bar[extra1,extra2]", "baz", "spam"], but the lambda function in question just imports from "baz", and "baz" turns out to have a subgraph of "baz 0.1", "spam 0.2" and "interior 0.3".
The service provider background:
AWS Lambdas can be deployed in many ways. Two are:
1. code zip + requirements file
2. code zip containing requirements too
Note that style 1 is presumably dictated by today's (or maybe a bit yesterday's) de facto standards. If a lock file format / semantics were standardized, a way to deploy might be code zip + lock file, bringing all the benefits of locked artifacts to method 1.
So, assuming AWS latches on to this standard and allows deploys via code zip + lock file, the user in question has a problem if they want to use this method. The lock contains way more than they need - 100s of dependencies they do not use. This impacts their lambda deploy latencies at the very least. There are two ways to fix this afaict:
1. The user subsets their lock file, producing a new lockfile that is a subset of the true lockfile, and asks AWS Lambda to deploy using that.
2. The lock file standard supports subsetting directly and deployment method 1 changes to: code zip + lock file + optional list of requirement strings to resolve from the lock file - aka AWS supports doing the subset because the lock standard does. If the optional list of requirements is not present, use the whole lock.

Concretely then, the user deploys to AWS Lambda by handing it their code zip, their repo's single lockfile, and the requirements list ["baz"] to resolve from the lock (a rough sketch of that subsetting step follows below).
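To make that subsetting step concrete, here is a minimal sketch assuming the `packaging` library and an invented, simplified lock structure (a name-to-entry dict; extras and environment markers are ignored for brevity). This is an illustration of the idea, not Pex's actual implementation:

```python
from packaging.requirements import Requirement

def subset_lock(lock_packages, requirement_strings):
    """Return the transitive closure of lock entries reachable from the
    given requirement strings.

    `lock_packages` is a hypothetical, simplified view of a lock file:
    a dict mapping project name to an entry with "version" and
    "dependencies" (a list of project names).
    """
    stack = []
    for spec in requirement_strings:
        req = Requirement(spec)
        entry = lock_packages.get(req.name)
        if entry is None or not req.specifier.contains(entry["version"]):
            raise LookupError(f"{spec!r} is not satisfiable from the lock")
        stack.append(req.name)

    # Walk the dependency graph from the matched roots, collecting
    # every entry reachable from them.
    selected = {}
    while stack:
        name = stack.pop()
        if name in selected:
            continue
        entry = lock_packages[name]
        selected[name] = entry
        stack.extend(entry.get("dependencies", []))
    return selected

# For the example above: subsetting ["baz"] out of a lock that also
# contains other roots yields only baz's own subgraph.
lock = {
    "baz": {"version": "0.1", "dependencies": ["spam", "interior"]},
    "spam": {"version": "0.2", "dependencies": []},
    "interior": {"version": "0.3", "dependencies": []},
    "foo": {"version": "1.0", "dependencies": []},
}
print(sorted(subset_lock(lock, ["baz"])))  # ['baz', 'interior', 'spam']
```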
John provides a great, real-life example. Hatch is going to enforce the first paradigm (as that is basically the reason for the concept of Hatch environments):
I have so many thoughts on this but I'm trying not to dominate the conversation so I'll keep it short.
This might've been rhetorical, but I think the answer would differ for resolving (updating) vs. installing. My preference would be:
- Installers must be able to install from any lockfile regardless of [tool.xxx] metadata. This also puts some constraint on resolvers, since they can't require the use of any [tool.xxx] at install time.
- Resolvers can reject an existing lockfile if it contains [tool.xxx] metadata and they're asked to update it (or even if it lacks [tool.xxx] metadata and the current tool is xxx).

At least for us, I don't see why we'd need [tool.xxx] for installs. We'd mostly use it for bookkeeping during resolution.
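As a minimal sketch of the install-time half of that rule, using only the stdlib tomllib (the overall file schema here is hypothetical, not the PEP's actual schema):

```python
import tomllib  # stdlib since Python 3.11

def load_lock_for_install(path):
    """Read a TOML lockfile for installation, discarding any [tool.*]
    tables on the assumption that they must not affect which files get
    installed.  A sketch of the proposed rule, not a real installer.
    """
    with open(path, "rb") as f:
        lock = tomllib.load(f)
    lock.pop("tool", None)  # install as if the tool tables were absent
    return lock
```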
Related to the comments above about Dependabot, from my perspective, this preserves the core benefits of standardization:
- Dependabot support (at least alerts, but not updates)
- Cloud-provider installation
- Installer interoperability (e.g., use pip to install your PDM project)
From this perspective, the PEP would be trying to standardize on a single file format that could replace poetry.lock, uv.lock, etc., while focusing the benefits of standardization around the installer operations. (It would be a non-goal, e.g., for you to take a Poetry-produced lockfile and run uv add flask to update it. I think this is both harder to achieve and less valuable.)
Critically, though, we'd still be trying to obviate the tool-specific file formats. This is different than if we decided that the scope of the PEP was to create an interoperable format for installers only. That would actually make things a lot easier: it'd be like the "locked" requirements.txt format that tools use today, except standardized and with all the information you need to install (like URLs, rather than just versions). In that world, we could probably even get rid of [tool.xxx] entirely, which would be great for ensuring spec compliance. But it has the downside that users have to deal with and learn multiple files (both poetry.lock and pylock.toml).
Lastly: if everyone prefers it and we put the markers on the nodes rather than edges, it's fine, it's not a deal-breaker. I think we'll still be able to support "multiple entrypoints to the lockfile" even without writing extra metadata to [tool.uv]. But it'd be nice if the PEP decided that this was explicitly supported or an explicit non-goal. Otherwise, we might get it working but only "by accident" due to incidental details in the format that could change over time.
Yes, but 2 effectively advances the state of the art nowhere. Users can already subset a Pex lock to a hashed requirements.txt. I'm only really interested in 1. If 2 is all I get, then this whole PEP exercise just means I implement a new export format from Pex lock that is the new standard. There is no motivating reason to actually switch to the standard afaict.
Thank you. That's an excellent real-life example, and given Ofek's response, it's clear that there's a desire for both "install this lockfile" (unqualified) and "install this subset of the lockfile".
One question - you say "the requirements list". Can you describe *precisely* what you'd expect here? Because a requirements list can contain things like foo>2.0, foo[some_extra]; python_version > "3.8", or foo @ https://some/url. I imagine none of them would be suitable for defining a subset of a lockfile, which is why I want to be clear how we specify a subset.
It wasn't, so thanks for answering. I'm happy with that answer, but I will note that it implies that you're committing workflow tools to limit their functionality to what the lockfile format supports. Which explains why you are pushing for the format to support all of uv's functionality, but means that reducing the scope of the standard to well-understood functionality, leaving more experimental features to a later iteration of the spec (when tools have had a chance to determine the best approach), is difficult, if not impossible, to achieve.
OK, that makes sense. If it is what @brettcannon intends for the PEP, then it'll be worth stating that explicitly. Otherwise we will get people (by analogy with pyproject.toml) expecting lockfiles to be portable between tools.
I mean any PEP 508 requirement specifier - period. It's either satisfiable in the lock or it isn't. This is how Pex locks work today. From your examples, foo>2.0 and foo[some_extra]; python_version > "3.8" and foo @ https://some/url would all work.
For example:
```
:; pex3 lock create --pip-version 24.2 --style universal --interpreter-constraint ">=3.8" requests "certifi @ https://github.com/certifi/python-certifi/archive/445b9cd2539f51b0aec4971a8ec02ded3943327f.zip" --indent 2 -o lock.json
:; jq '.locked_resolves[] | .locked_requirements[] | select(.project_name == "certifi")' lock.json
{
  "artifacts": [
    {
      "algorithm": "sha256",
      "hash": "a9ef7809e30370137ed69d89e45e3c36515d36a87649d8253251df4bfb038174",
      "url": "https://github.com/certifi/python-certifi/archive/445b9cd2539f51b0aec4971a8ec02ded3943327f.zip"
    }
  ],
  "project_name": "certifi",
  "requires_dists": [],
  "requires_python": ">=3.6",
  "version": "2024.8.30"
}
:; pex3 lock export-subset --format pip-no-hashes "certifi @ https://github.com/certifi/python-certifi/archive/445b9cd2539f51b0aec4971a8ec02ded3943327f.zip" --lock lock.json
certifi==2024.8.30
:; pex3 lock export-subset --format pip-no-hashes "certifi" --lock lock.json
certifi==2024.8.30
:; pex3 lock export-subset --format pip-no-hashes "certifi < 2024.8.30" --lock lock.json
Failed to resolve compatible artifacts from lock lock.json for 1 target:
1. /home/jsirois/.local/bin/tools.venv/bin/python:
    Failed to resolve all requirements for cp311-cp311-manylinux_2_39_x86_64 interpreter at /home/jsirois/.local/bin/tools.venv/bin/python from lock.json:
Configured with:
    build: True
    use_wheel: True

Dependency on certifi not satisfied, 1 incompatible candidate found:
1.) certifi 2024.8.30 does not satisfy the following requirements:
    <2024.8.30 (via: certifi<2024.8.30)
```
And, for an interior node:
```
:; pex3 lock export-subset urllib3 --lock lock.json
urllib3==2.2.3 \
  --hash=sha256:ca899ca043dcb1bafa3e262d73aa25c465bfb49e0bd9dd5d59f1d0acba2f8fac \
  --hash=sha256:e7d814a81dad81e6caf2ec9fdedb284ecc9c73076b62654547cc64ccdcae26e9
:; pex3 lock export-subset "urllib3>=2" --lock lock.json
urllib3==2.2.3 \
  --hash=sha256:ca899ca043dcb1bafa3e262d73aa25c465bfb49e0bd9dd5d59f1d0acba2f8fac \
  --hash=sha256:e7d814a81dad81e6caf2ec9fdedb284ecc9c73076b62654547cc64ccdcae26e9
:; pex3 lock export-subset urllib3[brotli] --lock lock.json
Failed to resolve compatible artifacts from lock lock.json for 1 target:
1. /home/jsirois/.local/bin/tools.venv/bin/python:
    Failed to resolve all requirements for cp311-cp311-manylinux_2_39_x86_64 interpreter at /home/jsirois/.local/bin/tools.venv/bin/python from lock.json:
Configured with:
    build: True
    use_wheel: True

Dependency on brotli (via: urllib3[brotli] -> brotli>=1.0.9; platform_python_implementation == "CPython" and extra == "brotli") not satisfied, no candidates found.
```
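For a rough sense of what "any PEP 508 requirement specifier" means mechanically, here is a sketch of a per-entry satisfiability check using the `packaging` library. This is an illustration only, not Pex's actual implementation, and it ignores extras and direct-URL requirements:

```python
from packaging.requirements import Requirement

def entry_satisfies(req_string, locked_name, locked_version, environment=None):
    """Check whether one locked entry can satisfy a PEP 508 requirement
    string, honoring its environment marker."""
    req = Requirement(req_string)
    # A requirement whose marker evaluates to false simply does not apply
    # to the target environment, so it cannot fail satisfaction.
    if req.marker is not None and not req.marker.evaluate(environment):
        return True
    return req.name == locked_name and req.specifier.contains(locked_version)

print(entry_satisfies("certifi", "certifi", "2024.8.30"))            # True
print(entry_satisfies("certifi<2024.8.30", "certifi", "2024.8.30"))  # False
```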
Hmm, OK. I think I'd need to see how this works in practice - especially for a graph-based lockfile with multiple roots - before I'd be 100% comfortable supporting this. It feels like there's a bunch of edge cases we'd need to pin down. It probably needs the "how to install from a lockfile" algorithm to be explicitly written out.
Suppose we decided not to support multiple roots in the lockfile spec, because we weren't sure that all the issues had been ironed out yet. Then uv couldn't support multiple roots, because the standard doesn't support it, and you can't handle it in the tool.uv section because we've said that can't affect what gets installed.
And worse, because nobody can support multiple roots without violating the spec, nobody can work on ironing out those issues so that we could add multiple root support in a later version of the spec.
FWIW, I do think it would be ideal (for selfish user reasons) if lockers were able to lock from an arbitrary starting lockfile. Though that's not to say you should expect two users to use two different tools on the same project seamlessly.
Today, converting from uv to poetry or vice versa means effectively losing any lockfile state you might have previously had. It would be ideal if uv lock on a poetry-produced lockfile would effectively turn it into a uv-produced lockfile (not needing to retain any tool.poetry data), but basing the resolution off the original file's locked versions. Insofar as poetry lock or uv lock today both work with and without a preexisting lock file, and behave differently in those two cases, it at least doesn't seem obvious that it should error.
With that said, it obviously doesn't need to be a PEP requirement of the tools, but it would be an ideal end-user outcome.
First, apologies for the delay in responding! Just did the first driving trip with the baby and I'm at the core dev sprints, so lots going on! But thanks to everyone carrying on the conversation w/o me.
Same, which is what's making balancing what ends up in the PEP so hard. I fully understand we want users, who are ultimately the ones who will use installers, to be happy, but I also realize that we can't put too much onus on lockers, which could drive them to not want to support the PEP.
And I still expect to support this use case. The PEP will never require that you support all possible platforms.
Yes! Your anyio example I think is a good illustration of the two approaches: do you just write at the place where you depend on anyio (i.e., the edge from what depends on anyio; Requires-Dist), or do you try to flatten all the required information into anyio's entry in the lock file (i.e., the node)?
The trick with that is then you're supporting both approaches at once. It's not the worst thing (if you can flatten the graph to write out a set of packages, then recording the graph you flattened from is also possible), but it's setting expectations for users as to what can be relied on.
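To illustrate the two shapes, here is the anyio example as Python dicts purely for illustration: the field names are invented, and httpx-depends-on-anyio is just a stand-in dependency.

```python
# 1. Marker on the edge: the dependency link itself records when it applies.
edge_style = {
    "httpx": {
        "version": "0.27.0",
        "dependencies": [
            {"name": "anyio", "marker": 'python_version >= "3.8"'},
        ],
    },
    "anyio": {"version": "4.4.0", "dependencies": []},
}

# 2. Marker on the node: the condition is flattened onto anyio's own entry,
# merging the conditions from every path that reaches it.
node_style = {
    "httpx": {"version": "0.27.0", "dependencies": ["anyio"]},
    "anyio": {"version": "4.4.0", "marker": 'python_version >= "3.8"',
              "dependencies": []},
}
```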
I can't think of a scenario where an installer should be allowed to not support installing for multiple environments, since the same logic needed to detect whether an environment is supported by the lock file would be necessary anyway.
That's the plan. This is also what makes figuring out the right level of feature set so hard.
What does Dependabot do when it updates a pyproject.toml file that has a [tool] section? Does it just leave it alone?
We could also record the tool or command used to create the lock, if necessary, so that something like Dependabot could know how to recreate the lock file while keeping any [tool] sections that the user wants to have.
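For illustration, something as small as a couple of top-level fields could cover that. A hypothetical sketch, shown as a Python dict; these names are invented, not proposed spec fields:

```python
# Hypothetical lock-file header a tool like Dependabot could consult to
# re-run the locker that originally produced the file (names invented).
lock_metadata = {
    "created-by": "uv",                  # which locker wrote the file
    "creation-command": ["uv", "lock"],  # how to regenerate it
}
```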
That's actually the motivating example for supporting different lock files with different names, e.g., pylock.aws-lambda.toml.
I'm okay with that. The thing we can't know about [tool] is whether it's metadata that's fine to leave around and actually isn't affected by the lock changing, or if it's critical somehow.
That's the whole reason I'm talking about this, as I don't want ambiguity for what lockers are expected to produce and thus what installers can rely on.
I think to answer that question we should decide what we expect Dependabot to do here. If we are okay w/ somehow having the lock file record the tool used to generate it so Dependabot can also use that same tool in a similar way, then I think that's a reasonable compromise. But if we don't want to do that, then we need to decide whether having [tool] is worth it, or if tools like Dependabot just can't regenerate a lock file in the same way as the user may have.
That is not what would happen in this example. There would be no motivation to store, say, 10 subset lock files of the real single central lock file. Instead, the subset lock would be created just in time from the central lock for export to AWS with the zip file. As I mentioned before, this is ~no different than subset exporting to the current requirements.txt format and would not be a motivation for Pex to switch its central format. It would just now support a new export format alongside requirements.txt, unfortunately.
Oh, I know that's what we would like to see happen. But I also want to acknowledge what we think should happen and what actually happens can be widely different.
I don't follow your "we"s. I'm saying that's what Pants / Pex users would actually do: use 1 central lock file and, instead of subsetting 10 different lambda lock files to check in to version control alongside it redundantly, just export just in time as needed.