PEP 665, take 2 -- A file format to list Python dependencies for reproducibility of an application

I just added the following to try and make it as clear as possible that the PEP is flexible around anything it doesn’t specify on purpose:


As Flexible as Possible

Realizing that workflows vary greatly between companies, projects, and
even people, this PEP tries to be strict where it’s important and
undefined/flexible everywhere else. As such, this PEP is strict where
it is important for reproducibility and compatibility, but does not
specify any restrictions in other situations not covered by this PEP;
if the PEP does not specifically state something then it is assumed to
be up to the locker or installer to decide what is best.

Note that I specifically said “input” without specifying where the input should come from :wink:

The input can come from the lock file used for installation, but it can also come from additional user inputs (e.g. command line options, configuration files, or environment variables) specified to the locker and installer, or even only available in the original application manifest without the information being locked.

Also note that which indexes were used to generate the lock file is inherently not meaningful knowledge to the installer, since the lock file already provides enough information for the installer to find exact artifacts without index knowledge. So the installer only needs to allow two use cases:

  1. No index override, where the installer simply uses the artifact URLs provided by the lock file.
  2. An explicit index override, where the installer ignores all the artifact URLs and finds artifacts based on versions, filenames, hashes, etc. instead.
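
A minimal sketch of those two cases, assuming a hypothetical `LockedArtifact` record and a hypothetical `find_on_index` lookup callback. None of these names come from the PEP; they only illustrate the decision the installer has to make:

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class LockedArtifact:
    """Hypothetical view of one locked file; field names are illustrative."""
    filename: str
    url: str
    sha256: str


# Hypothetical callback: given an index URL and a locked artifact, return the
# download URL of the matching file on that index, or None if it is missing.
IndexLookup = Callable[[str, LockedArtifact], Optional[str]]


def resolve_download_url(
    artifact: LockedArtifact,
    index_override: Optional[str] = None,
    find_on_index: Optional[IndexLookup] = None,
) -> str:
    """Pick where to fetch an artifact from, following the two cases above."""
    if index_override is None:
        # Case 1: no override, so use the URL recorded in the lock file.
        return artifact.url
    if find_on_index is None:
        raise ValueError("an index override needs a lookup function")
    # Case 2: ignore the recorded URL and search the override index using
    # the locked filename and hash instead.
    found = find_on_index(index_override, artifact)
    if found is None:
        raise LookupError(f"{artifact.filename} not found on {index_override}")
    return found
```

Either way the hash recorded in the lock file would still be checked after download, which is why the originating index does not need to be known.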

While I understand the motivation here, how do we avoid lockers depending on a particular installer? (Past experience leads me to believe that people will expect anything install-related to be handled by pip, so “what pip does” could end up being a de facto standard for anything not covered by the PEP).

It’s possible this won’t be an issue in practice - do you have an example of something where the PEP deliberately doesn’t restrict something in the way you describe?

No, but some offline feedback I got was that some of the worries about not supporting sdists may be coming from people feeling that if something isn’t in the PEP then it can’t be solved outside of it.

OK, but sdist support is covered by the PEP, at least to the extent that it is currently an open issue and presumably at some point will be moved to “rejected ideas”. So there will certainly have to be some information in the PEP, even if the implementation details remain open.

I’d expect that when sdists are moved to the “rejected ideas” section (assuming that’s what happens), the following points would be made:

  1. The PEP clarifies that while there are use cases for sdist support, those are considered out of scope for PEP 665.
  2. It’s explicitly noted that adding sdist support via a follow-up PEP was discussed and is how this PEP expects sdist support to be added at a future date, if required.
  3. I’d imagine a key reason for the rejection would be that it’s much harder to achieve reproducibility for sdists - so the rejection should note this and clarify that any future sdist support PEP needs to describe how reproducibility will be handled.

I don’t know whether that matches people’s expectations about “solving sdist support outside of PEP 665”. In particular, though, if people want to experiment with implementing such support, they will need to write their own lockfile installer - pip isn’t the place for such experimentation.

To be explicit, pip will need sdist support to be backed by an approved PEP before we add it.

I’d also like to see (assuming sdist support gets rejected) an explicit paragraph in the motivation section explaining that the PEP chooses to only support wheels because that allows reproducibility to be guaranteed without needing build systems to provide reproducibility guarantees as well (which, as far as I know, none of them formally do). I think that would make the current emphasis in the PEP on reproducibility less distracting for people who don’t have a strong need for it.

Sure, but the rest of the PEP is written as if sdists are not supported.

Yes.

Assuming “Supporting sdists and source trees in PEP 665” doesn’t lead to sdists making it into this PEP, I would summarize what’s there, which includes what needs to be resolved by any future PEP.

PEP 665 – A file format to list Python dependencies for reproducibility of an application | peps.python.org was meant to capture that, but I can strengthen it a bit to be more explicit about this.

Added a paragraph in PEP 665: point out why relying on wheels is a good thing · python/peps@e4d35d7 · GitHub.

Poetry has said they won’t support this PEP, both from a standardization and export viewpoint.

Not supporting sdists, source trees, and VCSs is one sticking point. The other is the per-file dependencies, which were added to the PEP after Donald provided direct feedback on that very topic.

I have recorded this feedback in PEP 665: record Poetry's views on the PEP · python/peps@6bbde29 · GitHub.

I just posted a new draft of the PEP with sdist support listed in the rejected section. That closes out all open issues!

Rendered versions at:

So now that the PEP no longer supports sdists, can we do a review of whether it still covers enough use cases to be viable? In my experience, I don’t think I’ve ever heard anyone ask for pip to support reproducible installs explicitly, so while I get that the idea is that PEP 665 is about ensuring reproducible installs, what evidence do we have that enough people actually want reproducible installs to make it worth having a standardised lockfile whose main (sole?) purpose is to provide them?

In particular, there’s already a conversation starting about a follow-up proposal for adding sdist support. I’m not particularly happy about the possibility that PEP 665 can’t stand on its own merits and is merely a starting point for adding sdists. We already have enough PEPs that have been approved but that no-one is working on implementing, and I don’t want to add another one to that list…

I know of enterprise users who solve this problem by having a PyPI mirror because pip isn’t secure by default in terms of what it installs. They can’t trust users to use any other index but their own private one that only contains code that has been cleared for use (i.e. tightly control what dependencies may get pulled in).

I know @kushaldas greatly cares about reproducible installs for SecureDrop.

For me personally, I have to audit every call to pip in CI for work to make sure that every flag that is mentioned in PEP 665 – A file format to list Python dependencies for reproducibility of an application | peps.python.org is used to avoid supply chain attacks.
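
That kind of audit can be partly automated. The sketch below is only an illustration: the helper name `missing_pip_flags` is hypothetical, and the flag set (`--require-hashes`, `--no-deps`, `--only-binary`) is an assumption made for the example. They are real pip options, but the PEP should be consulted for the list it actually names.

```python
import shlex

# Assumed set of hardening flags; check the PEP for the authoritative list.
REQUIRED_FLAGS = ("--require-hashes", "--no-deps", "--only-binary")


def missing_pip_flags(command: str) -> list[str]:
    """Return the required flags that are absent from a pip command line."""
    tokens = shlex.split(command)
    # Normalise `--flag=value` to `--flag` so both spellings are recognised.
    present = {tok.split("=", 1)[0] for tok in tokens if tok.startswith("--")}
    return [flag for flag in REQUIRED_FLAGS if flag not in present]
```

For example, `missing_pip_flags("pip install -r requirements.txt")` reports all three flags as missing, which is the situation a lock file with secure defaults is meant to make impossible.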

Sorry, I explained badly. I wasn’t asking about reproducibility in its own right, I was asking about whether there was still sufficient demand for PEP 665, in its new form.

The original justification for PEP 665 was that “people want lockfiles” - this is pretty obvious, we have projects like pip-tools, poetry, pipenv, etc, all providing some form of lockfile functionality, so there’s clearly a need.

The complexities of supporting sdists in line with the goal of reproducibility meant that the PEP dropped the idea of supporting them, and focused solely on wheels. That’s fine, but it means that many of the users who we’d previously assumed would benefit from PEP 665 will no longer be able to use it. What proportion? I don’t know, and that’s essentially what I’m asking.

We can’t assume that people using a PyPI mirror will be able to use the new PEP, as they will be mirroring sdists as well, and may well need to install them. That applies to any cases of people concerned about supply chain attacks - it’s possible to mitigate the risk while still using sdists, and it’s entirely possible that solutions which block sdists won’t be acceptable in all cases.

OK, let’s put the question to an informal poll:

  • PEP 665 without wheel support is sufficient for my use cases
  • I won’t be able to use PEP 665 until sdist support is added
  • I will use PEP 665 without sdists, but I will need to handle sdists manually for my workflow
  • I don’t need PEP 665, I use existing solutions and am happy with them
  • Lock files don’t matter to me, and/or I have no opinion

One major problem with relying on votes is that “people following the PEP discussion” isn’t a particularly representative group. If someone wants to reach out to the wider community for feedback, that would be helpful here.

I just noticed, I inadvertently worded the “PEP 665 is sufficient” option as “PEP 665 without wheel support is sufficient for my use cases”. Aargh - and I can’t edit the poll now :slightly_frowning_face:

I hope it was obvious to everyone that I meant “PEP 665 with wheel support is sufficient for my use cases”. If anyone wants to change their vote, please comment. Or if people think this is enough of a mistake that I should restart the poll, then I’m happy to do that too.

Sorry for the error.

How long are you planning to leave this poll open, @pf_moore ?

I hadn’t really thought about it, I was intending to wait until voting died down and take a view then. How about we say a week?

For those of you who do support PEP 665: without vocal support for the PEP as-is, or sdists being added to win the “sdist vote”, this PEP is heading towards rejection.

As such, for those of you voting for needing sdist support, please see “A file format to list Python dependencies of an application without strict reproducibility guarantees” and provide feedback there.

I voted for “won’t be able to use until sdist support is added”, but I don’t feel I’m in a strong position to comment on the other thread. Reading the other thread on the sdist extension, it feels like a good fit for my use case. My main goal is that there exists some input source for specifying sdists; I’m ambivalent about whether it’s the same file or a different one. The main thing I want is a clear file or two that will allow a locker/installer to do:

-e relative_path1
-e relative_path2

wheel_dep1
wheel_dep2
wheel_dep3

The new design, with url specifying the file directory plus editable, looks fine. It looks weird that editable is under hashes, as naming-wise I don’t associate the two, but that’s a minor aesthetic thing.

It also looks like quite a verbose way to specify requirements. I enjoy my current requirements.in file, which looks like:

-e file:.
-e file:lib1
-e file:libs/lib2
-e file:libs/lib3
-e file:libs/lib4
-e file:lib5

I think the verbosity/complexity of writing a simple file like that will slow adoption somewhat. It’s still a strong fit for my team’s usage pattern that I think is beneficial, and I plan to migrate when ready, but a simple input file format that gets converted to a more detailed file format would be a preferable flow to writing this:

[lib1.lib1."1.5"]
url = "lib1"
filename = ""
hashes.editable = true

[lib2.lib2."1.5"]
url = "libs/lib2"
filename = ""
hashes.editable = true
...
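
A locker could do that conversion along these lines. This is only a sketch under stated assumptions: the `entries_from_requirements` helper is hypothetical, the table layout simply mirrors the example above rather than any agreed format, the project name is guessed from the last path component, and the version is a caller-supplied placeholder because the simple input does not carry one.

```python
from pathlib import PurePosixPath


def entries_from_requirements(text: str, version: str = "0") -> str:
    """Turn `-e file:<path>` lines into verbose tables like the ones above."""
    prefix = "-e file:"
    blocks = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith(prefix):
            continue  # this sketch only handles editable local paths
        path = line[len(prefix):]
        # Assumption: the project name is the last path component; "." has
        # no usable component, so fall back to a placeholder name.
        name = PurePosixPath(path).name or "project"
        blocks.append(
            f'[{name}.{name}."{version}"]\n'
            f'url = "{path}"\n'
            'filename = ""\n'
            "hashes.editable = true"  # TOML booleans are lowercase
        )
    return "\n\n".join(blocks)
```

With something like this, the short file stays the thing a person edits and the verbose form becomes tool output.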

This may also just be confusion on my part. Is the pypin file something I’m supposed to write as a user, or will it be generated like the pylock file? If the pypin file is generated too, is the locker now defined as the producer of both pypins and pylocks from a custom input file?

I would assume a pin file would be generated.

Probably.

Should we start a clean poll with agreed questions? The poll is quite confusing to me when I read it and it can easily lead to mistaken votes.

I would suggest first agreeing on the questions. My suggestions:

  • PEP 665 only supporting wheels is not usable at all for my needs without sdist support
  • PEP 665 only supporting wheels is useful as a starting point and I will work around lack of sdist support
  • PEP 665 only supporting wheels meets all my current needs
  • Abstain

Please understand that the poll is informal, and is purely to collect feedback from people who want an easier way of responding than posting a comment. If you aren’t sure how to respond, feel free to say so and state your view as a comment.

Your questions seem to match mine pretty well (although your wording is probably slightly clearer). So if the questions you propose had been used, what would you have replied? For reference, I’d map your questions to the ones given as:

  • PEP 665 only supporting wheels is not usable at all for my needs without sdist support → I won’t be able to use PEP 665 until sdist support is added
  • PEP 665 only supporting wheels is useful as a starting point and I will workaround lack of sdist support → I will use PEP 665 without sdists, but I will need to handle sdists manually for my workflow
  • PEP 665 only supporting wheels meets all my current needs → PEP 665 with wheel support is sufficient for my use cases (as I noted, I mistyped “without” when I meant “with”, and now Discourse won’t let me change it. My apologies for that. I asked at the time whether it was worth restarting the poll, but no-one said it was.)
  • Abstain → Lock files don’t matter to me, and/or I have no opinion (Actually “abstain” is closer to simply not responding, I added this option in case anyone wanted to explicitly say they don’t care).

You missed out the option of “I don’t need PEP 665, I use existing solutions and am happy with them” (which, if it’s not clear, means someone who uses a tool like pip-tools, pdm or poetry and doesn’t have any issue with their current approach, which is usually a requirements file with fully pinned versions).

I should also say that I’d actually much rather that people comment and contribute to the discussion, rather than just picking an option from the poll. For this PEP (or any PEP) to succeed, it needs community consensus, and since the version of the PEP that dropped sdist support was released, there’s been little or no reaction. No-one has said “cool, that simplifies the implementation and is perfectly sufficient for what I need”, no-one has said “ah well, we can use this and I guess we can install the sdists we need manually for now”, and no one has said “you’re all mad, this is useless without sdists” :slightly_smiling_face: There’s been essentially silence, and it’s hard to interpret that as anything other than “no-one cares” or at best “everyone’s tired of the whole debate at this point”. In spite of its flaws, the poll has at least given some sense of what people think of this version of the PEP.
