PEP 665, take 2 -- A file format to list Python dependencies for reproducibility of an application

I know of enterprise users who solve this problem by having a PyPI mirror because pip isn’t secure by default in terms of what it installs. They can’t trust users to use any other index but their own private one that only contains code that has been cleared for use (i.e. tightly control what dependencies may get pulled in).

I know @kushaldas greatly cares about reproducible installs for SecureDrop.

For me personally, I have to audit every call to pip in CI for work to make sure that every flag mentioned in PEP 665 (“A file format to list Python dependencies for reproducibility of an application”) is used, to avoid supply chain attacks.
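
As a rough illustration of that kind of audit, here is a minimal sketch that reports which hardening flags a pip command line is missing. The flag set (`--require-hashes`, `--no-deps`, `--only-binary`) is an assumption, since the post does not list the flags it checks for, and `missing_pip_flags` is a hypothetical helper name.

```python
import shlex

# Assumed flag set: the post does not say which flags are audited.
# All three are existing pip install options.
REQUIRED_FLAGS = ("--require-hashes", "--no-deps", "--only-binary")


def missing_pip_flags(command: str) -> list[str]:
    """Return the required flags that do not appear in a pip command line."""
    tokens = shlex.split(command)
    # Accept both "--only-binary :all:" and "--only-binary=:all:" spellings.
    present = {tok.split("=", 1)[0] for tok in tokens if tok.startswith("--")}
    # Note: this only checks that a flag is present, not its value
    # (for example, it does not verify that --only-binary is ":all:").
    return [flag for flag in REQUIRED_FLAGS if flag not in present]


if __name__ == "__main__":
    print(missing_pip_flags("python -m pip install -r requirements.txt"))
```

A real audit would also need to find the pip invocations in the CI configuration in the first place, which this sketch leaves out.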

Sorry, I explained badly. I wasn’t asking about reproducibility in its own right, I was asking about whether there was still sufficient demand for PEP 665, in its new form.

The original justification for PEP 665 was that “people want lockfiles” - this is pretty obvious, we have projects like pip-tools, poetry, pipenv, etc, all providing some form of lockfile functionality, so there’s clearly a need.

The complexities of supporting sdists in line with the goal of reproducibility meant that the PEP dropped the idea of supporting them, and focused solely on wheels. That’s fine, but it means that many of the users who we’d previously assumed would benefit from PEP 665 will no longer be able to use it. What proportion? I don’t know, and that’s essentially what I’m asking.

We can’t assume that people using a PyPI mirror will be able to use the new PEP, as they will be mirroring sdists as well, and may well need to install them. That applies to any cases of people concerned about supply chain attacks - it’s possible to mitigate the risk while still using sdists, and it’s entirely possible that solutions which block sdists won’t be acceptable in all cases.

OK, let’s put the question to an informal poll:

  • PEP 665 without wheel support is sufficient for my use cases
  • I won’t be able to use PEP 665 until sdist support is added
  • I will use PEP 665 without sdists, but I will need to handle sdists manually for my workflow
  • I don’t need PEP 665, I use existing solutions and am happy with them
  • Lock files don’t matter to me, and/or I have no opinion

One major problem with relying on votes is that “people following the PEP discussion” isn’t a particularly representative group. If someone wants to reach out to the wider community for feedback, that would be helpful here.

I just noticed, I inadvertently worded the “PEP 665 is sufficient” option as “PEP 665 without wheel support is sufficient for my use cases”. Aargh, and I can’t edit the poll now :slightly_frowning_face:

I hope it was obvious to everyone that I meant “PEP 665 with wheel support is sufficient for my use cases”. If anyone wants to change their vote, please comment. Or if people think this is enough of a mistake that I should restart the poll, then I’m happy to do that too.

Sorry for the error.

How long are you planning to leave this poll open, @pf_moore ?

I hadn’t really thought about it, I was intending to wait until voting died down and take a view then. How about we say a week?

For those of you who do support PEP 665: without vocal support for the PEP as-is, or sdist support being added to win the “sdist vote”, this PEP is heading towards rejection.

As such, for those of you voting for needing sdist support, please see “A file format to list Python dependencies of an application without strict reproducibility guarantees” and provide feedback there.

I voted for “won’t be able to use until sdist support is added”, but I don’t feel well placed to comment on the other thread. Reading the other thread on the sdist extension, it feels like a good fit for my use case. My main goal is that there is some input source for specifying sdists; I’m ambivalent about whether it’s the same file or a different one. The main thing I want is a clear file or two that will allow a locker/installer to handle:

-e relative_path1
-e relative_path2

wheel_dep1
wheel_dep2
wheel_dep3

The new design, with url specifying a file directory plus an editable flag, looks fine. It looks odd that editable sits under hashes, since by name I don’t associate the two, but that’s a minor aesthetic thing.

It also looks like quite a verbose way to specify requirements. I enjoy my current requirements.in file, which looks like:

-e file:.
-e file:lib1
-e file:libs/lib2
-e file:libs/lib3
-e file:libs/lib4
-e file:lib5

I think the verbosity/complexity of writing a simple file like that will slow adoption somewhat. It’s still a strong fit for my team’s usage pattern, I think it’s beneficial, and I plan to migrate when it’s ready, but a flow where a simple input file format is converted to a more detailed file format would be preferable to writing this:

[lib1.lib1."1.5"]
url = "lib1"
filename = ""
hashes.editable = true

[lib2.lib2."1.5"]
url = "libs/lib2"
filename = ""
hashes.editable = true
...

This may also just be confusion on my part. Is the pypin file something I’m supposed to write as a user, or will it be generated, like the pylock file? If the pypin file is generated too, is a locker now defined as a producer of both pypins and pylocks from a custom input file?

I would assume a pin file would be generated.

Probably.

Should we start a clean poll with agreed questions? The poll is quite confusing to me when I read it and it can easily lead to mistaken votes.

I would suggest first agreeing on the questions. My suggestions:

  • PEP 665 only supporting wheels is not usable at all for my needs without sdist support
  • PEP 665 only supporting wheels is useful as a starting point and I will work around lack of sdist support
  • PEP 665 only supporting wheels meets all my current needs
  • Abstain

Please understand that the poll is informal, and is purely to collect feedback from people who want an easier way of responding than posting a comment. If you aren’t sure how to respond, feel free to say so and state your view as a comment.

Your questions seem to match mine pretty well (although your wording is probably slightly clearer). So if the questions you propose had been used, what would you have replied? For reference, I’d map your questions to the ones given as:

  • PEP 665 only supporting wheels is not usable at all for my needs without sdist support → I won’t be able to use PEP 665 until sdist support is added
  • PEP 665 only supporting wheels is useful as a starting point and I will work around lack of sdist support → I will use PEP 665 without sdists, but I will need to handle sdists manually for my workflow
  • PEP 665 only supporting wheels meets all my current needs → PEP 665 with wheel support is sufficient for my use cases (as I noted, I mistyped “without” when I meant “with”, and now Discourse won’t let me change it. My apologies for that. I asked at the time whether it was worth restarting the poll, but no-one said it was.)
  • Abstain → Lock files don’t matter to me, and/or I have no opinion (Actually “abstain” is closer to simply not responding, I added this option in case anyone wanted to explicitly say they don’t care).

You missed out the option of “I don’t need PEP 665, I use existing solutions and am happy with them” (which, if it’s not clear, means someone who uses a tool like pip-tools, pdm or poetry and doesn’t have any issue with their current approach, which is usually a requirements file with fully pinned versions).

I should also say that I’d actually much rather that people comment and contribute to the discussion, rather than just picking an option from the poll. For this PEP (or any PEP) to succeed, it needs community consensus, and since the version of the PEP that dropped sdist support was released, there’s been little or no reaction. No-one has said “cool, that simplifies the implementation and is perfectly sufficient for what I need”, no-one has said “ah well, we can use this and I guess we can install the sdists we need manually for now”, and no one has said “you’re all mad, this is useless without sdists” :slightly_smiling_face: There’s been essentially silence, and it’s hard to interpret that as anything other than “no-one cares” or at best “everyone’s tired of the whole debate at this point”. In spite of its flaws, the poll has at least given some sense of what people think of this version of the PEP.

I’ll expand on the difference, for me, between PEP 665 having editable support and handling editables manually. If a locker is allowed to reject an input file that includes relative paths, then saying “I want to lock these 3 editables plus their dependencies” is not doable directly with a locker/installer. At that point I’m forced to build a tool that constructs a file of all the top-level dependencies of the sdists I wanted to lock. For me, this tool would read the pyproject.toml/setup.cfg files of each editable dependency, output a new file that is the locker’s input, and then run the locker on that. How will this custom tool handle extras logic? Do I need to force all packages to use one configuration format (unify on pyproject.toml), or do I need more logic to read the various ways a package can declare its dependencies?
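
To make the size of that workaround concrete, here is a minimal sketch of such a tool. It assumes dependencies are declared statically, either under `[project].dependencies` in pyproject.toml or under `[options] install_requires` in setup.cfg. It does not handle extras, dynamic metadata, or setup.py, which are exactly the open questions above. The function names are hypothetical.

```python
import configparser
from pathlib import Path

try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:  # older interpreters: assumes tomli is installed
    import tomli as tomllib


def declared_dependencies(project_dir: Path) -> list[str]:
    """Return the top-level requirement strings a source tree declares statically."""
    pyproject = project_dir / "pyproject.toml"
    if pyproject.is_file():
        with pyproject.open("rb") as fh:
            data = tomllib.load(fh)
        deps = data.get("project", {}).get("dependencies")
        if deps is not None:
            return list(deps)

    setup_cfg = project_dir / "setup.cfg"
    if setup_cfg.is_file():
        # Disable interpolation so "%" in a requirement is read literally.
        parser = configparser.ConfigParser(interpolation=None)
        parser.read(setup_cfg, encoding="utf-8")
        raw = parser.get("options", "install_requires", fallback="")
        return [line.strip() for line in raw.splitlines() if line.strip()]

    return []


def locker_input(editable_dirs: list[Path]) -> list[str]:
    """Merge the declared dependencies of several source trees into one sorted list."""
    merged: set[str] = set()
    for project_dir in editable_dirs:
        merged.update(declared_dependencies(project_dir))
    return sorted(merged)
```

The result could be written out with something like `Path("requirements.in").write_text("\n".join(locker_input(dirs)) + "\n")` and then handed to whatever locker is in use. Even this small version already has to pick a precedence between the two configuration formats.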

And what do I gain from doing these workarounds vs using pip-compile and turning off hashes? I’ve made a small wrapper tool around mypy before, and even though the wrapper was meant to mostly just use mypy, it still grew to a few hundred lines to maintain. Sadly, pip-compile with hashes doesn’t directly work with editables, as the locked output file is uninstallable with pip. So at the moment my locked file avoids hashes in order to support editables.

So the manual workaround path is “doable”, but it complicates CI and maintenance enough for me as a library producer that if I need a workaround for editable support I’ll just skip the PEP and continue using pip-compile until it handles them directly. My primary job at work is not packaging/repository maintenance. It’s a secondary role I handle for my team, but if I’m going to change our packaging tooling it should be (a) easy to do and (b) clearly beneficial. My main reason to migrate to a PEP 665 locker with editable support would be a real solution for editables plus hashes, vs dropping all hashes. The secondary reason is that IDEs and other tooling may be able to use the lock file for useful recommendations.

As both an application and library developer, I’m happy with sdist support being simply building sdists and checking the resultant wheel matches against the hash (if that’s possible). I would likely use a tool to perform that locally, rather than in a proxy index server.
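
The checking half of that idea can be sketched with the standard library alone. This assumes the expected hash is a SHA-256 hex digest recorded somewhere beforehand, and `file_sha256` and `wheel_matches` are hypothetical helper names. Whether two builds of the same sdist yield byte-identical wheels depends on the build backend, which is the “if that’s possible” caveat.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def wheel_matches(path: Path, expected_sha256: str) -> bool:
    """Check a locally built wheel against a previously recorded digest."""
    return file_sha256(path) == expected_sha256.lower()
```

Building the wheel from the sdist is left out here; it would happen before this check, with whatever build frontend the team already uses.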

All of our applications depend on projects only available as sdists, so not supporting them at all means I can’t use this PEP. For libraries, I vehemently believe in minimal constraint, and encourage team members to have different versions of dependencies to ensure flexibility.

Much like @brettcannon I work in an environment where supply chain attacks are a real concern and the direct use of PyPI is being locked down. Requiring dependencies to be taken on “managed wheels only” is sufficient and desirable in our environment. If there is a dependency that is only available as an sdist, it is inspected, built, and added to our index.

The benefit of having lockfiles in pip is that, for such a common use case, we don’t have to rely on the variety of different tools in the ecosystem, or on individuals who think that pip freeze is the right way to produce a good lockfile.

They’re also allowed to accept, so I’m not quite sure where your relative path concern comes from?

What do you directly gain? Possibly nothing if the community doesn’t adopt the PEP. But if it does then you get a standard on how to specify what to install.

But I would also urge you to not turn off hashes as that’s a security hole.

Yes, that would be possible. That’s effectively what the pin file proposal at “A file format to list Python dependencies of an application without strict reproducibility guarantees” is doing.

If a locker chooses to do something beyond the requirements and supports editable/local installs, then yes, it’s possible a locker may work well for me. It’s hard to really comment on what extra features a locker will support, given the inherent uncertainty.

The security hole has been discussed with security at my company. We already use a private PyPI, so the risk of having no hashes is picking the wrong thing in the index. I think the main focus of security here is on more fully managing/requiring the private PyPI, as right now it’s only partially required. There’s still some risk, as it’s possible for a bad package to end up in the private index, but that’s a trade-off I make, mostly because of this issue.

The inconvenience of lacking editables is bad given my department’s workflow. It caused a number of “bug” reports and ends up as support toil. I’m partly biased against it, as I handle most packaging-related support issues for my broader team. The downside of hashes is that my support ticket load goes up. I think most people wouldn’t even bother investigating PyPI issues/PEPs to see if there’s a way to make security and editables both happy, and would just pick the convenient option.

I think this will be my last comment, as anything beyond that will likely go in circles. I entirely understand there are use cases/workflows where the lack of editables is easy to work around and the PEP can be beneficial for them. I think for my current workflow the work needed is not really worth investing in without some form of editables.

If no one has anything else to say either on this thread or on “A file format to list Python dependencies of an application without strict reproducibility guarantees”, I will put this PEP up for pronouncement on Monday, Dec 13, which is 7 days after the last comment as I write this.

Thanks for the heads up. It will take me some time to review the discussions and the PEP, particularly with the holidays coming up, so please accept my apologies in advance for any delay in making a final pronouncement.

One point I do want to make right now, though: the PEP says “An example locker and installer will be provided before this PEP is fully accepted (although this is not a necessarily a requirement for conditional acceptance)”. I’m not a great fan of conditional acceptance, so I’d encourage the PEP authors to start on the reference implementation ASAP[1] - I’m not entirely clear how the authors think I should decide if there’s no reference implementation, and I’m not willing to conditionally accept the PEP… :slightly_frowning_face:

Edit: Ignore this, I got myself confused between “conditional” and “provisional” acceptance. I’m happy with the idea that I accept the PEP, but it remains in “Draft” status until the reference implementation is complete, and only gets marked as “Final” (and hence becomes a standard) once that’s done.


  1. I do want a reference implementation in place before final acceptance, if only because the PEP promises that there will be one. It should cover both installing and locking (the latter can be creating a lockfile from a fully-pinned requirements file, I don’t expect a reference implementation to include a full resolver).

Of course! There’s no specific rush on my end. The deadline was more about making sure everyone had provided the feedback they wanted to.

And even if the answer is “rejection unless”, at least that will tell us exactly what’s necessary to move this forward (whether that’s the pin file idea from @uranusjr, an idea @sbidoul has next month, or something else; my guess is it will come down to sdist support).

:+1: As I think I have said before, I will write an installer (probably in mousebender) if this gets accepted. As for a locker, my guess is I would write a tool to convert a pip-tools requirements.txt file to a lock file as a bootstrapping mechanism.
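
As a sketch of the first half of such a bootstrapping tool, the snippet below reads a fully pinned, pip-compile style requirements file into a plain dictionary. The output shape is an illustrative assumption and not the PEP 665 lock file schema. `parse_pinned` is a hypothetical name, and environment markers, editables, and index options are ignored rather than carried over.

```python
import re

# "name==version", stopping the version at whitespace, ";" or "\".
PIN_RE = re.compile(r"^(?P<name>[A-Za-z0-9][A-Za-z0-9._-]*)==(?P<version>[^\s;\\]+)")
HASH_RE = re.compile(r"--hash=(?P<algo>\w+):(?P<digest>[0-9a-fA-F]+)")


def parse_pinned(text: str) -> dict[str, dict]:
    """Map normalized project names to their pinned version and hashes."""
    # Join backslash-continued lines, as emitted with --generate-hashes.
    joined = re.sub(r"\\[ \t]*\n", " ", text)
    pins: dict[str, dict] = {}
    for raw in joined.splitlines():
        line = re.sub(r"(^|\s)#.*$", "", raw).strip()  # drop comments
        if not line or line.startswith("-"):
            continue  # blank lines, and options such as -e or --index-url
        match = PIN_RE.match(line)
        if match is None:
            raise ValueError(f"not a fully pinned requirement: {line!r}")
        # Normalize the name so "Six_Pkg" and "six-pkg" compare equal.
        name = re.sub(r"[-_.]+", "-", match["name"]).lower()
        pins[name] = {
            "version": match["version"],
            "hashes": [f"{m['algo']}:{m['digest']}" for m in HASH_RE.finditer(line)],
        }
    return pins
```

Turning this dictionary into an actual lock file would still require resolving each pin to concrete wheel files, which the sketch does not attempt.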

Sorry to nitpick, but as I didn’t see an explicit request yesterday, did you intend this to imply that you now consider the PEP to be submitted for approval?

My main concern is just to be clear to everyone that once the PEP is submitted, it shouldn’t then be changed, and further discussion here may not get addressed if I don’t notice it.

@brettcannon what kind of ideas do you expect from me? My ideas are in Supporting sdists and source trees in PEP 665 - #65 but have not really been discussed, and now that thread is locked (for reasons I don’t quite understand). I’m still happy to discuss, but as I mentioned I don’t have enough time on my hands to engage in PEP writing in the foreseeable future.
