PEP 665, take 2 -- A file format to list Python dependencies for reproducibility of an application

No, but some offline feedback I got was that some of the worries about not supporting sdists may be coming from people feeling that if it isn’t in the PEP that it can’t be solved outside of it.

2 Likes

OK, but sdist support is covered by the PEP, at least to the extent that it is currently an open issue and presumably at some point will be moved to “rejected ideas”. So there will certainly have to be some information in the PEP, even if the implementation details remain open.

I’d expect that when sdists are moved to the “rejected ideas” section (assuming that’s what happens), the following points would be made:

  1. The PEP clarifies that while there are use cases for sdist support, those are considered out of scope for PEP 665.
  2. It’s explicitly noted that adding sdist support via a follow-up PEP was discussed and is how this PEP expects sdist support to be added at a future date, if required.
  3. I’d imagine a key reason for the rejection would be because it’s much harder to achieve reproducibility for sdists - so the rejection should note this and clarify that any future sdist support PEP needs to describe how reproducibility will be handled.

I don’t know whether that matches people’s expectations about “solving sdist support outside of PEP 665”. In particular, though, if people want to experiment with implementing such support, they will need to write their own lockfile installer - pip isn’t the place for such experimentation.

To be explicit, pip will need sdist support to be backed by an approved PEP before we add it.

I’d also like to see (assuming sdist support gets rejected) an explicit paragraph in the motivation section explaining that the PEP chooses to only support wheels because that allows reproducibility to be guaranteed without needing build systems to provide reproducibility guarantees as well (which, as far as I know, none of them formally do). I think that would make the current emphasis in the PEP on reproducibility less distracting for people who don’t have a strong need for it.

Sure, but the rest of the PEP is written as if sdists are not supported.

Yes.

Assuming Supporting sdists and source trees in PEP 665 doesn’t lead to sdists making into this PEP, I would summarize what’s there which includes what needs to be resolved by any future PEP.

PEP 665 -- A file format to list Python dependencies for reproducibility of an application | Python.org was meant to capture that, but I can stengthen it a bit to be more explicit about this.

Added a paragraph in PEP 665: point out why relying on wheels is a good thing · python/peps@e4d35d7 · GitHub.

2 Likes

Poetry has said they won’t support this PEP, both from a standardization and export viewpoint.

Not supporting sdists, source trees, and VCSs is one sticking point. The other is the per-file dependencies which was added to the PEP after Donald provided direct feedback on that very topic.

I have recorded this feedback in PEP 665: record Poetry's views on the PEP · python/peps@6bbde29 · GitHub .

2 Likes

I just posted a new draft of the PEP with sdist support listed in the rejected section. That closes out all open issues!

Rendered versions at:

1 Like

So now that the PEP no longer supports sdists, can we do a review of whether it still covers enough use cases to be viable? In my experience, I don’t think I’ve ever heard anyone ask for pip to support reproducible installs explicitly, so while I get that the idea is that PEP 665 is about ensuring reproducible installs, what evidence do we have that enough people actually want reproducible installs to make it worth having a standardised lockfile whose main (sole?) purpose is to provide them?

In particular, there’s already a conversation starting about a follow-up proposal for adding sdist support. I’m not particularly happy about the possibility that PEP 665 can’t stand on its own merits and is merely a starting point for adding sdists. We have enough PEPs that have been approved but no-one is working on implementing them, that I don’t want to add another one to that list…

I know of enterprise users who solve this problem by having a PyPI mirror because pip isn’t secure by default in terms of what it installs. They can’t trust users to use any other index but their own private one that only contains code that has been cleared for use (i.e. tightly control what dependencies may get pulled in).

I know @kushaldas greatly cares about reproducible installs for SecureDrop.

For me personally, I have to audit every call to pip in CI for work to make sure that every flag that is mentioned in PEP 665 -- A file format to list Python dependencies for reproducibility of an application | Python.org is used to avoid supply chain attacks.

1 Like

Sorry, I explained badly. I wasn’t asking about reproducibility in its own right, I was asking about whether there was still sufficient demand for PEP 665, in its new form.

The original justification for PEP 665 was that “people want lockfiles” - this is pretty obvious, we have projects like pip-tools, poetry, pipenv, etc, all providing some form of lockfile functionality, so there’s clearly a need.

The complexities of supporting sdists in line with the goal of reproducibility meant that the PEP dropped the idea of supporting them, and focused solely on wheels. That’s fine, but it means that many of the users who we’d previously assumed would benefit from PEP 665 will no longer be able to use it. What proportion? I don’t know, and that’s essentially what I’m asking.

We can’t assume that people using a PyPI mirror will be able to use the new PEP, as they will be mirroring sdists as well, and may well need to install them. That applies to any cases of people concerned about supply chain attacks - it’s possible to mitigate the risk while still using sdists, and it’s entirely possible that solutions which block sdists won’t be acceptable in all cases.

OK, let’s put the question to an informal poll:

  • PEP 665 without wheel support is sufficient for my use cases
  • I won’t be able to use PEP 665 until sdist support is added
  • I will use PEP 665 without sdists, but I will need to handle sdists manually for my workflow
  • I don’t need PEP 665, I use existing solutions and am happy with them
  • Lock files don’t matter to me, and/or I have no opinion

0 voters

One major problem with relying on votes is that “people following the PEP discussion” isn’t a particularly representative group. If someone wants to reach out to the wider community for feedback, that would be helpful here.

I just noticed, I inadvertently worded the “PEP 665 is sufficient” option as “PEP 665 without wheel support is sufficient for my use cases” Aargh - and I can’t edit the poll now :slightly_frowning_face:

I hope it was obvious to everyone that I meant “PEP 665 with wheel support is sufficient for my use cases”. If anyone wants to change their vote, please comment. Or if people think this is sufficient of a mistake that I should restart the poll, then I’m happy to do that too.

Sorry for the error.

How long are you planning to leave this poll open, @pf_moore ?

1 Like

I hadn’t really thought about it, I was intending to wait until voting died down and take a view then. How about we say a week?

1 Like

For those of you who do support PEP 665, without vocal support for the PEP as-is or getting sdists added to get the “sdist vote”, this PEP is heading towards rejection.

As such, for those of you voting for needing sdist support, please see A file format to list Python dependencies of an application without strict reproducibility guarantees and provide feedback there.

1 Like

I voted for the won’t use until sdist support, but also don’t feel a strong place to comment on the other thread. Reading the other thread on sdist extension it feels like a good fit for my use case. My main goal is there exists some input source whether it’s same file or a different file I’m ambivalent on for specifying sdists. The main thing I want is a clear file or two that will allow assuming locker/installer to do,

-e relative_path1
-e relative_path2

wheel_dep1
wheel_dep2
wheel_dep3

The new design with url specifying file directory and editable looks fine. It looks weird that editable is under hashes as naming wise I don’t associate the two, but that’s minor aesthetic thing.

Also looks quite verbose of a way to specify requirements. I enjoy my current requirements.in file that looks like,

-e file:.
-e file:lib1
-e file:libs/lib2
-e file:libs/lib3
-e file:libs/lib4
-e file:lib5

I think verbosity/complexity of writing a simple file like that will slow adoption some. It’s still a strong fit for my team’s usage pattern that I think is beneficial and I plan to migrate when ready, but simple input file format converted to a more detailed file format would be preferable flow vs writing this,

[lib1.lib1."1.5"]
url = "lib1"
filename = ""
hashes.editable = True

[lib2.lib2."1.5"]
url = "libs/lib2"
filename = ""
hashes.editable = True
...

This may also just be confusion. Is pypin file I’m supposed to write as a user or will it be generated similar to the pylock file too? If pypin file is generated too is locker now defined as producer of both pypins and pylocks from custom input file?

1 Like

I would assume a pin file would be generated.

Probably.

1 Like

Should we start a clean poll with agreed questions? The poll is quite confusing to me when I read it and it can easily lead to mistaken votes.

I would suggest first agreeing on the questions. My suggestions.

  • PEP 665 only supporting wheels is not usable at all for my needs without sdist support
  • PEP 665 only supporting wheels is useful as a starting point and I will workaround lack of sdist support
  • PEP 665 only supporting wheels meets all my current needs
  • Abstain
1 Like

Please understand that the poll is informal, and is purely to collect feedback from people who want an easier way of responding than posting a comment. If you aren’t sure how to respond, feel free to say so and state your view as a comment.

Your questions seem to match mine pretty well (although your wording is probably slightly clearer). So if the questions you propose had been used, what would you have replied? For reference, I’d map your questions to the ones given as:

  • PEP 665 only supporting wheels is not usable at all for my needs without sdist support → I won’t be able to use PEP 665 until sdist support is added
  • PEP 665 only supporting wheels is useful as a starting point and I will workaround lack of sdist support → I will use PEP 665 without sdists, but I will need to handle sdists manually for my workflow
  • PEP 665 only supporting wheels meets all my current needs → PEP 665 with wheel support is sufficient for my use cases (as I noted, I mistyped “without” when I meant “with”, and now Discourse won’t let me change it. My apologies for that. I asked at the time whether it was worth restarting the poll, but no-one said it was.)
  • Abstain → Lock files don’t matter to me, and/or I have no opinion (Actually “abstain” is closer to simply not responding, I added this option in case anyone wanted to explicitly say they don’t care).

You missed out the option of “I don’t need PEP 665, I use existing solutions and am happy with them” (which, if it’s not clear, means someone who uses a tool like pip-tools, pdm or poetry and doesn’t have any issue with their current approach, which is usually a requirements file with fully pinned versions).

I should also say that I’d actually much rather that people comment and contribute to the discussion, rather than just picking an option from the poll. For this PEP (or any PEP) to succeed, it needs community consensus, and since the version of the PEP that dropped sdist support was released, there’s been little or no reaction. No-one has said “cool, that simplifies the implementation and is perfectly sufficient for what I need”, no-one has said “ah well, we can use this and I guess we can install the sdists we need manually for now”, and no one has said “you’re all mad, this is useless without sdists” :slightly_smiling_face: There’s been essentially silence, and it’s hard to interpret that as anything other than “no-one cares” or at best “everyone’s tired of the whole debate at this point”. In spite of its flaws, the poll has at least given some sense of what people think of this version of the PEP.

1 Like

I’ll expand on difference for me between pep 665 having editable support vs handling them manually. If locker is allowed to reject an input file that includes relative paths then specifying I want to lock these 3 editables + dependencies becomes not doable directly with locker/installer. At that point I’m forced to construct a tool that constructs file of all top level dependencies of the sdists I wanted to lock. This tool for me would be read pyproject.toml/setup.cfg files of each editable dependency, output a new file that is locker’s input, and then run the locker on that. How will this custom tool handle extras logic? Do I need to force packages to all use one configuration format (unify to pyproject.toml) or do I need more logic to read various ways of checking dependencies a package declares?

And what do I gain from doing these workarounds vs using pip-compile and turning off hashes? I’m made a small wrapper mypy tool before and even though wrapper was meant to be mostly use mypy it still grew to a few hundred lines to maintain. Sadly pip compile + hashes doesn’t directly work with editables as output locked file is uninstallable with pip. So at the moment my locked file to support editables just avoids hashes.

So the manual workaround path is “doable” but complicates CI and maintenance enough for me as a library producer that if I need workaround for editable support I’ll just skip pep and continue using pip-compile until it handles them directly. My primary job at work is not packaging/repository maintenance. It’s a secondary role I handle for my team, but if I’m going to change our packaging tooling it should be a. easy to do, and b. clearly beneficial. My main reason to migrate to pep 665 locker with editable support would be a real solution for editable + hashes vs dropping all hashes. The secondary reason is maybe ides/other tooling can use lock file for useful recommendations.

1 Like

As a both application and library developer, I’m happy with sdist support being simply building sdists and checking the resultant wheel matches against the hash (if that’s possible). I would likely use a tool to perform that locally, rather than in a proxy index server.

All of our applications depend on projects only available as sdists, so not supporting them at all means I can’t use this PEP. For libraries, I vehemently believe in minimal constraint, and encourage team members to have different versions of dependencies to ensure flexibility.

2 Likes

Much like @brettcannon I work in an environment where supply chain attacks are a real concern and the use of direct PyPI is being locked-down. Requiring dependencies to be taken on “managed wheels only” is sufficient and desirable in our environment. If there is a dependency that is only available as an sdist, it is inspected, built, and added to our index.

The benefit to having lockfiles in pip is that for such a common use-case we don’t have to rely on the variety of different tools in the ecosystem or with individuals who think that pip freeze is the right way to produce a good lockfile.

5 Likes