Python Packaging Strategy Discussion - Part 1

I also had in mind the ^ operator for Poetry, and IIRC both Poetry and PDM propagate an upper bound into python_requires in an undesirable way.

The exact details don’t matter much here though; whole blog posts have been written about this with more detailed comparisons, and I could have gotten a couple more details wrong. The main point is that these tools are on the right track, and having them converge/merge/cooperate is the most viable way I see to resolve this “what workflow tool to standardize on” question.

3 Likes

Small point – pdm doesn’t do this by default; it uses the minimum strategy for the lower bound, i.e. package >= X.Y.Z. I know this because I wrote the patch, but the docs don’t reflect it, so I’m glad you pointed it out (I probably missed them!)

Gah, this quoted the wrong part of your comment, sorry.

4 Likes

I would prefer a boilerplate test script in tool.pdm.scripts which can be changed or removed later, rather than a standardized test command that implies a unified test framework like Go and Cargo have. Thanks to the PDM script shortcuts, pdm test is equivalent to pdm run test.
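For illustration, such a boilerplate could be a single entry in pyproject.toml (the script name and the pytest command here are assumptions for the sketch, not anything PDM generates today):

```toml
# Hypothetical scaffold; "test" and the pytest invocation are placeholders.
[tool.pdm.scripts]
test = "pytest tests/"
```

With the shortcut behaviour described above, `pdm test` would resolve to `pdm run test`, and the project is free to edit or delete the entry without any tool-level commitment to a particular test framework.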

Not for PDM. PDM is using the minimum(>=) strategy by default.

Yes, but you seem to quote the wrong part.

3 Likes

As I’d promised ~200 posts ago, here’s my thoughts on the status quo of the ecosystem as a blog post:

Frankly, I feel like I should post this in a separate thread, but I guess we’re going to aim for a record on discuss.python.org with this one (this topic is already the 5th longest in d.p.o history). I’ve trimmed my whole thing on Conda/native dependencies, because we’ve talked a ton about that already here.

If you spot something wrong, do let me know outside of this thread (discuss.python.org has DMs)! Right now, I need to go to sleep. :slight_smile:

13 Likes

It’s probably unsurprising that I agree with pretty much everything written in @pradyunsg’s excellent post. In fact, the GitHub comment of mine he dug up from 2019 could have easily been posted word for word in this discussion and still made complete sense.

One thing I will disagree on is this statement:

We still don’t have agreement that this is the direction that we, as a community, want pip to go.

If we include the wider Python community, I think there is heavy agreement on this given that:

  • The vast bulk of people are still using pip even though tools like poetry, pdm, hatch exist.
  • By far the number one ask for like 5-10 years now has been to provide a more unified experience.

I can’t find a way to reconcile those two facts with each other without coming to the conclusion that most of our users are already in agreement that this is the direction they want pip [1] to go.

I think the disconnect is that the people working on these tools can’t come to agreement that this is the direction we want to go. I think there are a lot of reasons for that: we’re best positioned to understand the nuances of the various tools and to take advantage of the power of the Unix philosophy; we’re all working on different tools within the ecosystem and have a vested interest in what’s best for our own projects; and dictating ecosystem-level, non-technical decisions is hard and we’ve been loath to do that.

Ultimately, I think what it comes down to is we’ve gotten pretty good at using the PEP process to make technical decisions about interoperability standards [2], but we’re terrible at using it for anything else and we don’t have a real alternative for making decisions otherwise. That means that anytime we have this discussion, we largely just sit around talking about what we could or should do with no actionable items on how we’re going to come to an agreement.

So here’s what I’m going to propose:

Interested parties write a PEP on how they think we should solve the “unification” problem within some time frame, all of these PEPs will have the same set of PEP-Delegates, the various proposals will be discussed just like any other PEP, then the PEP-Delegates will pick one and that’s the direction we’ll go in [3]. If they are unable to come to an agreement, then it will get kicked up to the SC to make a choice.

My recommendation is that we do something a little unorthodox and instead of having a singular PEP-Delegate, we instead have a team of 3 of them, who will together select the direction we go in. My rationale here is that this is our first time making a decision quite like this, it’s going to be a decision with a large impact, and there is no one singular person who could make this decision who isn’t biased in some way.

Unfortunately, since we don’t have a good way to make these decisions, we also don’t have a good way to make a decision on how to make a decision. So I’m going to propose that we wait two weeks, and if there isn’t a large outcry, then we just move forward with this idea.


  1. I don’t think they actually care if it’s pip or not; what they actually care about is that the tool that comes with Python should go in this direction. Obviously that tool is currently pip, so by that nature they have agreement that it should be pip. Of course another option is that we get agreement from Python Core that another tool should be bundled with Python. ↩︎

  2. Also for large decisions for PyPI itself, but that’s not related. ↩︎

  3. Obviously we have no real enforcement mechanism here; if some project disagrees with the ultimate choice made, they’re free to go their own way, and that’s a good thing. We’re not looking to create the one true tool to rule them all, just to provide the default tool. Other tools may still exist and people can still use them, but defaults matter. ↩︎

10 Likes

If maintainers and users want to evolve pip into a tool that does more things, I think a good first step would be to make its interface pluggable. There’s definitely something that appeals to new users about having the same top-level command.

I think this would allow some degree of UI experimentation while still having a single command that users interact with. There have been comparisons to cargo in this thread; cargo is pluggable, and seemingly simple core workflow commands like cargo add did not come built in with cargo until very recently.

1 Like

AIUI cargo isn’t actually pluggable; it just has a small shim such that if you run cargo foo, it will look for a binary named cargo-foo, which isn’t meaningfully different from just running cargo-foo. There’s no real extension point beyond that, as far as I’m aware.
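That shim really is tiny. As a rough Python sketch of the same lookup-and-forward idea (the `tool-` prefix and function are purely illustrative, not any real pip or cargo code):

```python
import shutil
import subprocess
import sys


def dispatch(argv: list[str]) -> int:
    """Forward `tool foo args...` to an external `tool-foo` binary on PATH.

    This mirrors, roughly, cargo's external-subcommand lookup: there is no
    plugin API, just a name convention and a process launch.
    """
    if not argv:
        print("usage: tool <command> [args...]", file=sys.stderr)
        return 2
    # Look for an executable named "tool-<subcommand>" on PATH.
    exe = shutil.which(f"tool-{argv[0]}")
    if exe is None:
        print(f"tool: no such command: {argv[0]!r}", file=sys.stderr)
        return 1
    # Hand the rest of the command line to the external binary unchanged.
    return subprocess.call([exe, *argv[1:]])
```

The whole "extension model" is the naming convention plus PATH lookup, which is why it isn't meaningfully different from typing `tool-foo` yourself.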

2 Likes

From a PEP Editor perspective, if there’s enough consensus in this approach and interest in submitting proposals, we could perhaps consider allocating a series of PEP numbers (like the 3000 and 3100 series for Python 3, and the 8000 and 8100 series for Python governance), with a Process meta-PEP documenting and agreeing upon the process and listing the individual proposals added as PEPs after that one.

Having a council of 3-5 PEP-Delegates certainly sounds ideal to me, if there is a desire to go ahead with this. The only difficulty would be deciding who would comprise the committee, and deciding how we’d decide that. The PEP-Delegate selection process as of our latest revision of PEP 1 is just that people self-nominate and the SC approves, or makes the decision if there are multiple candidates; beyond that there’s no formal process, so we’d have to invent one.

In this case, we could just have people here nominate candidates, with their approval, and then the SC approve the 3-5 based on our feedback here, or we could have a more formal vote on the thread, or in the PyPA committers group e.g. by single transferable vote (or similar multiple-selection method) (and have the SC formally confirm that, if needed).

IMO, the ideal candidates would be those who:

  • Are well-respected in the packaging community
  • Collectively possess a diversity of backgrounds
  • Lack an existing strong vested interest/established opinion about a particular approach.

That latter point is probably the most difficult, since pretty much the who’s who of packaging have commented in this thread, but at the minimum we can say that anyone accepting a nomination as a PEP-Delegate cannot be an author of any of the PEPs for consideration, and must generally recuse themselves from actively advocating a strong stance for or against any of them (as is the general principle for PEP-Delegates currently). This should hopefully help self-select those who are committed to taking on the role of stepping back to help select that which best captures the consensus of the broader packaging community and serves the needs of Python’s userbase, versus the equally valuable one of being an author or champion of a specific proposal.

I’d personally like to see at least one Delegate with a scientific Python/PyData background, given that somewhere around half of Python’s userbase falls into this demographic (including me). At least in my experience bridging both worlds, it’s among the scientific community that I’ve heard the loudest and most consistent calls for packaging unification from people of all Python experience levels. They are also hit hardest by the lack thereof, as they are by and large not professional programmers and are most lacking in the background, experience, rationale and institutional knowledge/resources to successfully navigate the current landscape.

Finally, as a PEP editor with experience in packaging specs, logistics/organizing, technical writing, and in equal measure the PyData and core Python realms, I offer any help I can give in an administrative/procedural capacity. I’ve been closely following this discussion from the beginning and am strongly invested in seeing some form of packaging unification come about, but I remain equally strongly neutral so far on the most workable approach (and have accordingly just been listening), with no direct ties to any of the tools or proposals involved.

4 Likes

Honestly, I don’t mind just doing this.

Given that “plugins to pip” means either “let me hook into how pip does things” or “let me write pip compile rather than pip-compile”, and the latter is repeatedly suggested as a good-enough extension model, just doing that might be good enough TBH.

5 Likes

Given that “plugins to pip” means either “let me hook into how pip does things” or “let me write pip compile rather than pip-compile”, and the latter is repeatedly suggested as a good-enough extension model, just doing that might be good enough TBH.

This is what we have in the Jupyter ecosystem, and it seemingly works well for us there!

1 Like

Yes, that’s my understanding as well (obviously there’s room for more extensibility beyond that). My claim is just that, despite being simple-minded, this kind of thing has for whatever reason brought value to users in other ecosystems and may allow for experimentation on the way to making pip a complete workflow tool.

Connecting this to Pradyun’s post:

implementing vital workflow improvements is now blocked on an exhaustingly long process of a non-iterative, waterfall-style design process

Extensibility is one possible way to allow for more iteration and for workflow extensions to prove their usefulness. I think other ecosystems demonstrate that this is viable, even for simple or core commands.

Unintended competition

I think even extensibility changes where things fall on the “Competitive Spectrum” Pradyun defines. If I work on a pip-audit subcommand I have a “cordial” relationship with the rest of pip and likely a “collaborative” relationship if I work to get it upstreamed. There’s less mutual exclusion when it comes to users.

Some/all of the “workflow tools” that exist today because the “default” tooling did not cover more of the user’s workflow with a single piece […] the folks who created these tools are not fools who like to create work for themselves or enjoy reinventing the wheel

There may be less fighting over users, since all the users are nominally pip users. There might be less wasted effort if contributors can focus on the parts of workflow they feel they can impact the most.

I don’t think they actually care if it’s pip or not, what they actually care is that the tool that comes with Python should go in this direction. Obviously that tool is currently pip, so by that nature they have agreement that it should be pip. Of course another option is that we get agreement from Python Core that another tool should be bundled within Python.

I think you are right that users don’t really care if it’s pip. I think it’s a bit of a stretch to say that they agree it should be pip because that’s the tool that comes with Python. I would rephrase your statement to say that, rather than users thinking that pip should go in any particular direction, they think Python should go in the direction of including a universal toolset. Many users do not even think of pip as separate from Python; pip is just a magic word for “how you install Python packages”, and what they want is a world where, when you get Python, you get a unified working system. The important thing is that the whole system comes “batteries included” with Python.

That also has to include clear and comprehensive docs explaining how to use the tool to do the tasks users want to do. For instance, the current situation with the packaging docs having four separate tabs to show how to do the same thing with four different tools is the opposite of what we want (especially because it doesn’t provide any guidance on why you’d use one or another of those tools).

I think this bit of your GitHub comment is especially cogent:

This is probably a larger question better suited for discourse, but honestly I think we need to take a holistic view of what our ideal tool looks like. I would try to put the current set of tools out of mind for now, and try to design an experience up front.

One thing that worries me a bit about this thread is that I still don’t see that much discussion of that. There is a lot of talk about hatchling this and pip that but not so much about “these are the tasks that the tool needs to accomplish”. For that reason, I have a bit of trepidation about your proposed solution of competing PEPs, because I think there needs to be some higher-level guard against a decision being made that just results in another tool coming into existence without the requisite oomph behind it to clear the field.

For the record, here are two concrete requirements I would want to see as part of a standard Python installation:

  • Environment management and package installation need to be designed together, as part of the same system. Whether they are the same CLI tool is not really important, but right now the two-dimensional choice space of environment managers and package installers is a big source of pain.
  • Environment management needs to include the ability to manage the version of Python itself as part of the environment. Again, currently, the need to do these separately (e.g., managing Python versions with pyenv but package versions with virtualenv) creates a two-dimensional space of tool choices that increases the sense of confusion.

Personally, I would go further and suggest that in the end we need to consider a conceptual shift, where what you get when you “get Python” (e.g., from Python.org) is not actually “just Python” but rather a meta-tool for managing Python. On Windows the Python launcher has some of this, and there’s been interesting talk in some of these threads about a similar launcher for Unix, but again these tools are not integrated with things like virtualenv as part of a coherent system.

9 Likes

It’s the tension between idealism (“let’s sketch out the proper solution from scratch”) and pragmatism (“why reinvent the wheel when we already have all the pieces, more or less”).

Though I yearn for the former[1], the mood I get from the past 200+ comments is that this space is so convoluted and hard to change that even “just” a pragmatic solution will be an undertaking of a magnitude that’ll constantly skate on the edge of failure, saved only by key stakeholders keeping the faith / direction / motivation. Or perhaps it’s really just the lack of a direction-finding mechanism pointed out by @dstufft. On that note…

I actually do think that the “competing PEPs” idea is the most concrete opportunity to effect change in this corner of the Python ecosystem in some time. But as with the 8000-series PEPs used to find a new governance model after Guido stepped down, I think we’d first have to find a reasonable model for them (including the decision-making process as outlined by @CAM-Gerlach above).

Taking for example the list of dimensions for possible unifications by @pradyunsg, each of those dimensions could reasonably have two or more PEP-sized proposals for concrete action, but individually those different choices do not make a comprehensive solution (which needs to look at the space more holistically), so then we’d probably need to tie the overall direction together with meta-PEPs(?).

Without such a distinction / plan, we’d have a whole food stall worth of apple-vs.-orange-vs.-banana choices: it would be difficult to compare PEPs that tackle a single aspect with a PEP that proposes to overhaul the whole space, or similarly, to compare PEPs that describe a generic solution (“we should have a unified tool that does a, b, & c”) vs. a PEP that prescribes a specific solution (“pip should become the unified workflow tool”).

I don’t have a clear picture of how the process could work, but aside from finding the PPP / “PyPaPa” (Python Packaging Panel), I imagine it could work to explicitly layer the PEPs:

  • Level 0: Implementation Choice: Realise X in such and such way because $reasons
  • Level 1: Direction Choice: Go in direction X for problem Y because $reasons
  • Level 2: Overall Choice: Combine several direction (& ideally implementation) choices into an overarching goal.

The PPP could then decide on the Overall Choice (even if not every last detail in the respective lower levels is worked out), at which point all energy can be poured into filling the gaps and getting to work.

The obvious downside of this approach is that writing all these PEPs would be a massive effort that someone would have to do, and so would be deciding between them. If someone has a better idea of how to structure this I’d be thrilled to hear it.


  1. see the last paragraph of this comment further up ↩︎

1 Like

I’m broadly in favour of this POV, and realistically the installers from python.org are more than “just Python” already. At least for Windows it includes pip, the launcher, the development files (headers/import libs), bundled dependencies (OpenSSL, libffi, Tcl/Tk) and documentation. The embeddable package is very nearly just Python (as it’s meant for people to redistribute just Python), and the Store package is closer.

However, I’m very skeptical that we can realistically land on a cross-platform approach that satisfies the needs and styles of those platforms (without being literally Conda, and even that’s not satisfactory).

I’ve already been doing some work/prototyping/validation on a possible tool like this for Windows. It’s got potential, but ultimately we’d need to get people to stop downloading the old installer from python.org in order to provide any better workflow, and that’s just not very likely to happen.[1] Similarly, people would need to stop using their system Python on platforms that have one, which is also a big ask that we’ve been asking for years without significant impact.

Build the tools instead and see which one gets the most popular? :upside_down_face:

I would volunteer for this. I’m unlikely to put a proposal in (and being a delegate will ensure that I don’t :wink: ) but I’ve got involvement in virtually all sides of this (2nd order tooling, data science, local/remote deployment, publishing, security, x-plat). My bias is towards “should make PyCon workshops easier for first-time attendees, not harder”.

At the same time, something about this idea really niggles at me. If it’s the best we come up with, I’m willing to participate, but I just feel like we can come up with something better.


  1. The 500K downloads from the Windows Store in the last month is dwarfed by the probable 20-30 million direct downloads (I don’t have actual numbers for python.org, just an estimate based on the past). ↩︎

4 Likes

The essence of what I’m suggesting is that Python.org would no longer provide an “old style” installer. The way you get Python from python.org would be a new, manager-centric installer.

Well, that’s the approach that got us to where we are. (Is that what your upside-down face was meant to indicate? :slight_smile: )

5 Likes

Technically you’re right that that’s all external cargo subcommands do, but cargo has a cargo metadata command (with a crate to more easily process the output) which provides details about the project for subcommands to leverage.

So would pip need to provide something like that to be useful? Maybe I’m too close to the problem, but typing pip foo instead of pip-foo, with no other changes, doesn’t seem like a meaningful improvement to me, and it can cause additional confusion for people who aren’t sure where pip foo is coming from (or if pip itself later adds its own pip foo command).

1 Like

This is apples and oranges. As pointed out before in this thread, there are quite a few situations where you have an “install a package” task at hand, like in CI or in Docker, and there pip is effectively the only game in town. So pretty much everyone has a need for pip, including Poetry/PDM/Hatch users. Conversely, many people simply don’t have a need for a project manager or workflow tool like Poetry/PDM/Hatch. Installers and project managers are different types of tools, so taking the heavier use of one installer as evidence that the community wants that installer to evolve into a workflow tool is a bit illogical.

That is the important part here I’d say. If all the pip maintainers indeed want to go in this direction, then just do it. If you add well thought out new features, users are very likely to be happy.

It seems like working out what PEPs should cover or commit to is needed here. For example:

  • should a PEP like this come with a prototype before being eligible for submission or acceptance?
  • What are acceptable time frames to aim for, and should they be similar?
    • E.g., PEP 1 could say “standardize on Poetry, which will implement all relevant PEPs in 1 year” while PEP 2 could say “Pip will grow features X, Y and Z over a 3-5 year period” - that’d be hard to compare.

The 3-5 PEP delegates that @CAM-Gerlach proposed seems like a good idea; their first job is probably to figure out what ground PEPs should cover, and other meta aspects here.

1 Like

Depends on the experience people are after. Git subcommands don’t do much more than make sure a pager is set and obviously make sure git is available on PATH.

But I think the key question is what info does pip have about what it’s working with that would benefit some other tool? For instance, the reason I’m considering this for the Python Launcher is it could provide all the interpreters it knows about or would have selected by default. If you assume pip is always run from the environment it would install into then pip could provide details related to that.
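To make that concrete, here is a sketch of the kind of machine-readable environment snapshot an installer could hand to external subcommands, built only from the standard library. The function name and the output keys are assumptions for illustration; pip exposes nothing like this today, and this is a hypothetical analogue of `cargo metadata`, not a proposed interface:

```python
import json
import sys
import sysconfig
from importlib import metadata


def environment_info() -> dict:
    """Describe the current environment in a JSON-friendly shape.

    Hypothetical sketch: the keys here are illustrative, not any
    real or proposed pip command's output.
    """
    return {
        "python_version": sys.version.split()[0],
        "executable": sys.executable,
        # Where pure-Python packages get installed in this environment.
        "purelib": sysconfig.get_paths()["purelib"],
        # Names of distributions importable from this environment.
        "installed": sorted(
            name
            for dist in metadata.distributions()
            if (name := dist.metadata["Name"]) is not None
        ),
    }


# A subcommand could consume this as JSON on stdout:
print(json.dumps(environment_info(), indent=2))
```

Emitting it as JSON (as cargo does) would let external tools stay decoupled from the installer's internals while still agreeing on which environment they're operating on.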

1 Like

I don’t think it’s illogical at all?

By far the main feedback we get, afaict, by an overwhelming amount, is that people want to work with a single tool; they don’t find value in our hodgepodge of tools dedicated to specific tasks. Those “unified tools” exist today, and some people are using them, but the vast bulk of people are not.

To me that means one of:

  • Either the existing alternatives are not what people want, so much so that they’d rather not have it at all and continue to use pip.
  • Most people don’t actually want a workflow tool, we’re just hearing the vocal minority.
  • Most people don’t actually care what the workflow tool is, they just want it to be there by default and without having to think about which one.

The third of these I think is the most likely, and implicitly that means that the preference is for pip to grow those features. The only other option is they want us to add yet another tool to Python, but then people have to pick between pip and this hypothetical other tool, and having to make a choice is what people seem not to want to do.

So to be clear, I don’t think it’s explicitly pip that they want to grow these features; it’s “the tool shipped with Python”. It just so happens that the tool shipped with Python is pip.

8 Likes