PEP 751: lock files (again)

Although PEP 517 explicitly states that the graph of build requirements MUST NOT contain cycles, a lot of that bootstrapping discussion is about trying to avoid conforming to that requirement (because in practice, it’s an extremely strict restriction on what dependencies a build backend can have). So I don’t think that discussion is relevant here. What is relevant is that locking might need to lock multiple levels of build backend. In practice, unless people actively prohibit using wheels for build backends, it won’t be a real problem. But it should be covered here simply so that we don’t add more fuel to the arguments around bootstrapping.


Yep, that’s another case to consider.

Yes, that’s why the build-requires is a per [[packages]] entry.

By not supporting the locking for the build back-end? :wink:

I’m actually not kidding about this idea. Just yesterday I realized that Poetry, PDM, and uv don’t bother locking build back-ends when they have to work with sdists (I verified this w/ pyspark · PyPI and all 3 just lock pyspark and py4j).

I had put in locking build back-ends to alleviate the risk that people wanting sdists would sink this PEP like they did PEP 665, but I’m starting to think I over-corrected and instead should just say, “sdists are acceptable,” and leave it at that.


The pragmatic approach to build dependency locking is setting the PIP_CONSTRAINT environment variable when doing an installation from a requirements file.

Similar tricks could be used when installing from locked source distributions.

There are two problems with that approach:

  1. It’s pip specific.
  2. Pip does not actually follow hash requirements for constraints today and just silently ignores them as discussed in that issue.

So at the moment pip lacks any way to do build constraints unless you install build constraints separately and do no build isolation. Part of the goal of the recent questions is to come up with a standardized way to better handle reproducible build constraints, rather than relying on undefined (and, as it happens, buggy) pip behavior.


Yeah, I think this is the way to approach it too. Tools are not prevented from innovating here - being able to specify a separate lock file for build environments should be allowed,[1] or a tool may choose to allow picking a lock file per-package based on some other information.

The “simple” case of “I just got a random repo, go and download all the packages with whatever tool I happen to be using” really doesn’t have to care this much. If the user cares, they can choose the correct tool, constrain it in some other way, or avoid building.


  1. Allowed in the sense of “not forbidden by the interoperability specification a.k.a. this PEP”. Tools can offer it if they want, or not, depending on their target scenarios. ↩︎


Agreed. I don’t think recursive build requirements are a significant use case, except for the bootstrapping situation. And for bootstrapping, building wheels of the various build backends using a lockfile, and then using those pre-built backends in your lockfiles for further packages, is IMO a reasonable solution.

Given the trouble that the “no cycles in the build dependency tree” requirement of PEP 517 is causing[1], I would like the lockfile PEP to explicitly point out this limitation, and document the recommended approach of pre-building backends using their own lockfiles.

From a pip perspective, I don’t think we’re likely to do anything beyond the standard here. Certainly not in the short term - we don’t have the maintainer bandwidth. In fact, from a purely personal point of view, I’d hope that pip will implement the lockfile standard once it’s approved, allowing installation of a locked set of requirements into a target environment, and make that our supported “install locked requirements” mechanism, declaring existing features like --require-hashes to be supported-but-no-longer-developed (a status the commercial software industry in my experience likes to describe as “stabilised” :roll_eyes:). Any further developments in capability can then be tied to the standards process.

Which reminds me of another question. The PEP (as far as I know) doesn’t currently say anything about installing a lockfile into a non-empty target environment. I think it deserves a mention, if only because people tend to only think about installing into an empty environment until it comes to the point of implementing the spec, when all the weird edge cases appear. I don’t think it needs to be complex:

  • Installers MAY refuse to install into a non-empty environment.
  • If they allow it, they MUST replace any installed copies of the packages in the lockfile with the ones specified in the lockfile. This is so that the user can be certain that local changes made to the environment (e.g., manual edits of installed files) are no longer present.
  • If the resulting environment contains any conflicts (because the package versions in the lockfile conflict with existing installed packages not mentioned in the lockfile) then the installer MUST report an error.
  • In the latter case, of a conflict error, do we require that the installer must leave the environment unchanged, or is it OK to check after doing the install and leave the environment in a broken state? This is probably best left as a tool UX choice.
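A rough sketch of how an installer might apply the rules above, checking for conflicts before changing anything (one of the two UX choices mentioned in the last bullet). Everything here is hypothetical and not taken from the PEP or from any tool: `plan_install`, the exception names, and especially `pins_of_installed`, which reduces real dependency metadata to exact-version pins just to keep the example short.

```python
from dataclasses import dataclass, field


class NonEmptyEnvironmentError(Exception):
    """The installer chose to refuse a non-empty environment (the MAY rule)."""


class ConflictError(Exception):
    """An installed package outside the lockfile conflicts with a locked version."""


@dataclass
class InstallPlan:
    replace: list = field(default_factory=list)  # already installed, reinstalled anyway
    add: list = field(default_factory=list)      # not yet installed


def plan_install(installed, locked, pins_of_installed, allow_non_empty=True):
    """Plan an install without touching the environment.

    installed:          {name: version} currently in the environment
    locked:             {name: version} taken from the lockfile
    pins_of_installed:  {name: {dependency: exact_version}} for installed
                        packages (a simplification; real metadata uses
                        version specifiers, not exact pins)
    """
    if installed and not allow_non_empty:
        raise NonEmptyEnvironmentError("target environment is not empty")

    # Only packages that stay in the environment (not in the lockfile) can conflict.
    conflicts = []
    for pkg, pins in pins_of_installed.items():
        if pkg in locked:
            continue  # this package is replaced by the locked copy
        for dep, required in pins.items():
            if dep in locked and locked[dep] != required:
                conflicts.append(
                    f"{pkg} needs {dep}=={required}, lockfile has {locked[dep]}"
                )
    if conflicts:
        raise ConflictError("; ".join(conflicts))

    # Locked packages are always (re)installed, even when the version matches,
    # so that local edits to installed files are discarded.
    plan = InstallPlan()
    for name in sorted(locked):
        (plan.replace if name in installed else plan.add).append(name)
    return plan
```

Because the conflict check runs before the plan is built, a `ConflictError` leaves the environment unchanged. An installer that prefers to check after installing would simply run the same check later.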

  1. mostly because the setuptools maintainers don’t want to constrain what dependencies they can use, to be fair ↩︎


Yes, that’s why the build-requires is a per [[packages]] entry.

Thanks, that wasn’t clear to me from the spec, I think mostly because I’m not used to reading such detailed TOML specs.

By not supporting the locking for the build back-end? :wink:
I’m actually not kidding about this idea. Just yesterday I realized that Poetry, PDM, and uv don’t bother locking build back-ends when they have to work with sdists (I verified this w/ pyspark · PyPI and all 3 just lock pyspark and py4j).

While tools don’t support this feature very well, there is a strong desire among some users to lock build backends. That is why I am commenting on this thread: pip has limitations in how it currently supports restricting build backends, and Paul suggested this may be solved if PEP 751 is accepted and implemented.

Currently users can use pip-tools combined with the PIP_CONSTRAINT environment variable to find (non-recursively) and pin build backends: https://github.com/jazzband/pip-tools?tab=readme-ov-file#maximizing-reproducibility
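As a minimal sketch of that approach, assuming a constraints file that pins the build backends already exists: the file names and the helper `pip_install_command` are made up for illustration. Only the `PIP_CONSTRAINT` environment variable and the `pip install -r` invocation are real pip features.

```python
import os
import sys


def pip_install_command(requirements_file, constraints_file, base_env=None):
    """Return (command, environment) for a constrained pip install.

    The constraints file is passed through PIP_CONSTRAINT rather than
    --constraint so that it is inherited by the pip subprocesses that
    populate isolated build environments.
    """
    env = dict(os.environ if base_env is None else base_env)
    # An absolute path is used as a precaution, in case a build subprocess
    # runs from a different working directory.
    env["PIP_CONSTRAINT"] = os.path.abspath(constraints_file)
    cmd = [sys.executable, "-m", "pip", "install", "-r", requirements_file]
    return cmd, env
```

To actually run the install, pass both values to `subprocess.run(cmd, env=env, check=True)`. As noted elsewhere in this thread, this pins versions only; hashes in the constraints file are a separate problem.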

And uv users using the pip interface can pin build backends to a specific hash using various build constraints options: https://docs.astral.sh/uv/pip/compatibility/#build-constraints

If the open question is about recursive build locking, why not make it possible for build-requires to be a member of build-requires? So tools / users can optionally lock the build requirements of each build requirement.

Pip does not actually follow hash requirements for constraints today and just silently ignores them as discussed in that issue.

So at the moment pip lacks any way to do build constraints unless you install build constraints separately and do no build isolation

FWIW, this isn’t correct. Many users pin their build constraints using the PIP_CONSTRAINT environment variable, and this works fine for pinning build requirements.

The pip issue stemmed from trying to use hashes and getting errors at various points (not from hashes being silently ignored). It seems pip can’t currently use hashes to pin build requirements, hence looking to this standard to provide a way to lock build requirements.

Agreed. I don’t think recursive build requirements are a significant use case, except for the bootstrapping situation.

The motivation of some users who want to lock build requirements is that they want to know exactly which packages (and their hashes) are being built on their machine (and therefore executing arbitrary code). I think it will be of concern to those users if this lock file format has an exception whereby some packages can be built on their machine without being specifiable in the lock format.

How big of an issue that is I don’t know, but I do think it is a significant use case not covered by just bootstrapping.

I don’t think locking build requirements is an issue with the spec as written. The existing packages.build-requires field covers that. What it doesn’t cover is locking the build requirements of a build backend. But in reality, all build backends ship as wheels (which you can lock perfectly well), so the only case where you’d need build requirements of build requirements is if you insisted on using --no-binary :all: or some other “I’m not willing to use an existing wheel” workflow. This is the “bootstrapping the ecosystem” scenario, but I’m not aware of any other realistic use case.


What it doesn’t cover is locking the build requirements of a build backend.

FWIW my comment was in respect to build requirements of a build backend. That is, if a build backend is allowed to be a source distribution, then a package can be installed that is not specified by the lockfile. I suppose a workaround is for frontend installers to have an option to disallow build backends from being source distributions, but this would be up to the frontend, not the spec.


Sorry. There seem to be a lot of different scenarios being raised, and I’m losing track.

Are you aware of any actual use cases for this situation (where someone wants to lock a build and is either unable or unwilling to use a wheel for the build backend of one or more packages that need to be installed from source)? My point isn’t that it’s impossible for this to happen, just that it’s unlikely that anyone will actually need this, and for the purposes of the lockfile spec “we don’t support this” is sufficient. Obviously, if there are important use cases that I’m not aware of, that wouldn’t be the case.

My point was about what happens if it is allowed by the spec, as Brett suggested updating it to:

“sdists are acceptable,” and leave it at that.

Then this is problematic for the use case of “I want this lock file to not allow any unknown packages to be built on my machine”, because if the build requirements of the build backend sdists aren’t themselves locked, then packages not specified by the lockfile can be built on the machine. But this is something frontend tools could prevent by having a (default?) option that build backends should not be sdists.
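To make that frontend option concrete, here is a small sketch of the check a tool could run before installing. The data layout is an assumption for illustration only: I’m treating each locked entry as a dict with a `name`, an optional `build-requires` list, and a hypothetical `wheels` list that is empty or missing when only an sdist is locked. The function name is made up too.

```python
def build_requirements_without_wheels(packages):
    """Return (package, build requirement) pairs where the build requirement
    has no locked wheel and would therefore itself have to be built.

    `packages` is a list of dicts; the "wheels" key is a hypothetical
    stand-in for however a lock file records wheel files.
    """
    offenders = []
    for pkg in packages:
        for req in pkg.get("build-requires", []):
            if not req.get("wheels"):
                offenders.append((pkg["name"], req["name"]))
    return offenders
```

A frontend with a “no sdist build backends” option could refuse to proceed whenever this returns a non-empty list. If every build requirement comes from a wheel, nothing beyond the locked packages gets built, so no recursion is needed.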


Not to sound flippant, but… how serious is this concern compared to the implicit risk already being taken by installing sdists in the first place? Unless you have examined the build code and build backend code of every sdist, and verified they’re deterministic, compliant sdists are already allowed to run arbitrary code you don’t know about.
What’s the real scenario where someone isn’t concerned about that, but IS concerned about the possibility of unexpected stuff getting installed as build requirements of a new version of a build backend of one of the sdists in the lock file?

All the nuances of Python packaging aside, I think it would be a reasonable expectation by a user of a lockfile that they would be able to have no packages installed on their machine that were not specified by the lock file.

Not to say that it would be guaranteed by any particular lockfile, but it would at least be possible to create a lockfile that did this for any given set of requirements, build backends, etc.

If that’s not possible, at least under certain circumstances, then I think the PEP should say so explicitly (as a non-goal or whatever) and, if possible, state under what circumstances it happens, so users and frontend tools can choose whether to avoid those circumstances.


Surely that’s just “create a new, empty, virtual environment, and use an installer to install the lockfile”. I feel like I’m missing something in what you’re saying here, because that feels to me like the most basic workflow, and is absolutely supported.

Doesn’t it?

[[packages.build-requires]] is "An array of tables whose structure matches that of [[packages]]."

and [[packages]] includes [[packages.build-requires]].

Of course I agree that for all practical purposes this is a non-issue - but the PEP does seem to allow recursive build requirements, even if I expect no-one will want to do it?

Locking build requirements in the cross-platform case seems likely to be challenging…


I think the issue is that build-requires doesn’t include e.g. setuptools, that’s specified in a separate section (the [build-system] table).

But one can specify the version of the build tool in that section, and I’m confused about what more one would care to lock there[1]


  1. but I lost track of this discussion a while ago and I may have lost the plot ↩︎

Surely that’s just “create a new, empty, virtual environment, and use an installer to install the lockfile”. I feel like I’m missing something in what you’re saying here, because that feels to me like the most basic workflow, and is absolutely supported.

The context is the one you raise, build requirements of build backends and Brett said:

By not supporting the locking for the build back-end
[…]
“sdists are acceptable,” and leave it at that.

In this situation, the build requirements of the build backend would be installed in some isolated environment on the user’s machine but not specified by the lock file, right?

Clearly I’m communicating very poorly or not understanding something, I feel like I am a net negative value on this discussion so I’m not going to follow up any more unless there is some very specific point I can answer on this use case.

Ctrl-F says that there is no [build-system] table in PEP 751; did you mean something else?

That’s the point, I meant the table in PEP 518. That’s where the version of setuptools[1] would be specified, and not in the lock file as written.

I might be misunderstanding the situation, though!


  1. or any other build backend ↩︎

This conversation has certainly got confusing, but I think that PEP 518 has nothing much to do with it.

AIUI the proposed format could lock all build requirements - including the one that provides the build backend - in [[packages.build-requires]], and this can be different per package and can even recursively express the build requirements of build requirements.

How much of this is desirable is certainly open, but in this regard the format seems to me to have space for all and more that one might reasonably want to include.