I've not been able to follow the discussion here super closely due to time constraints, but I wanted to weigh in a little bit with my thoughts.
The tl;dr here is that I think what we need is a decision making body that has the following properties:
- An ambiguous, but ambitious mission.
- Broad powers to guide the ecosystem in pursuit of that mission.
- An assumption of good faith that the decision making body will generally be doing the right things, and applying their power reasonably.
- An escape hatch in case the above isn't true.
Which I think should generally take the shape of an elected council, with a mission that is something like:
Own, Define, and Improve the Python Packaging Story and Toolchain.
I think their broad powers should stem from being the ultimate owners of the definition of what the Python packaging story is, including ownership of the default set of tools. The assumption of good faith would of course have to come from us, giving them broad powers without trying to rules-lawyer specifics about what we expect to see from them or what they are and are not allowed to do. The escape hatch would be some sort of no confidence / removal procedure.
There's a lot that goes into why I think this is the correct thing, and I guess I'm turning into an old man because I feel compelled to tell a story about "back in my day" here, but I think the easiest way to understand why I think this is to understand the perspective that I'm coming from.
I've been involved in Python's packaging ecosystem for... well over a decade at this point (and of course many people have been involved longer), and the world that existed when I first got started was very different than the one we have today, but in many ways I think the same underlying problem still exists and is still plaguing us today.
The packaging landscape back in the early 2010s was very different. The historical "packaging tool" was distutils, which is where the interfaces that still exist today originally got defined. However, distutils was closer to Make than a package manager, so we had setuptools, which had brought in easy_install, the ability to define dependencies, locate them on PyPI, download them, and install them. This had come out of the needs of a single project at the time, and attempted to be a drop-in replacement for distutils (with a bootstrapping story of "stick this other .py file in your sdist and import it and run it").
Many projects had shifted to using setuptools and easy_install, but the way that setuptools/easy_install worked from an end user point of view wasn't how many people wanted to work, so virtualenv and pip came along as a competitor to easy_install and gained popularity.
Eventually maintenance of setuptools had dropped off, and its problems (and of course the problems of distutils) were becoming more and more apparent and problematic. However, since maintenance had dropped off, it was difficult to get any changes made to setuptools, so it eventually got forked into "distribute", which further split the ecosystem between distutils, setuptools, and now distribute. There were also things like numpy.distutils and such which existed, but were much less widely used.
There was of course also PyPI, which was also quite different back then, but mostly in ways that don't matter here.
Now, that's the rough state of the tooling in the early 2010s (that's not every single tool, but those were the major players), but a very important missing element here is that all of these various tools were being built and maintained by completely different people, yet they were all very interrelated and depended on each other (including down to implementation details), which made the entire toolchain fragile and hard to change.
Worse than that, though, is that the landscape that these tools existed in was very adversarial. At the time discussion around packaging largely took place on two mailing lists, distutils-sig and catalog-sig, which technically were separate lists but for our purposes we'll just lump them both together as "distutils-sig".
The distutils-sig mailing list was, at the time, widely considered one of the most toxic places in the Python community. I've done a lot of reflection over the years as to why it was the way it was, and I think you can distill it down into:
- Packaging affected almost everyone in some way.
- The current state was painful for almost everyone, but often times in widely disparate ways, so everyone had an opinion, but those opinions were very different from person to person.
- There was no decision making process whatsoever. Python had PEPs, but this was in the BDFL era (and if I remember correctly, it predates the idea of a PEP/BDFL Delegate), so every PEP had to have Guido as the decision maker, and Guido historically had little interest in packaging.
This created a situation where nobody was happy with the status quo, but opinions were very split on what the right path forward was, with no clear mechanism for resolving this stalemate. So every time someone had an idea of how to improve something, the process looked something like:
1. Post their idea on distutils-sig.
2. Some people would comment in support of it, some people would argue against it.
3. The discussion would peter out, because without unanimous consent we had nobody to decide yes or no.
4. People would get frustrated and leave and/or stick around and take it out on the next person to post an idea.
In this landscape, a few packaging PEPs had managed to make it through (some of which we eventually did adopt and start to use!) but they were mostly ineffectual, because nobody had ever agreed to use the PEP process for this purpose, so the maintainers of those tools largely just ignored those PEPs.
So this is our world, no single tool, in fact multiple competing tools, no communication or ability to coordinate between them, no decision making process besides unanimous consent, which was all but impossible to get, and the only thing resembling a decision making process that could be used, the PEP process, is being wholly ignored.
What ended up breaking this cycle was a combination of a few things:
- A few people refusing to accept the third step above (the discussion petering out) for their proposed changes, and just stubbornly refusing to go away.
- The introduction (or possibly it was formalization, my memory is sketchy) of the BDFL-Delegate system to allow someone other than Guido to decide on PEPs.
- The people involved in, and running, a few key projects (PyPI, pip, and setuptools once Jason took over) all just sort of implicitly deciding that we needed to break the log jam, and that we were going to use and follow standards.
That brought us to where we are today, which (and I'm biased of course) is such a significant improvement over the world of the early 2010s in ways that are hard to fully articulate, but which still has a lot of the same fundamental problems. Or rather, in my opinion, still has the same fundamental problem.
See, the problem back then was we didn't have a mechanism for making broad, ecosystem wide decisions, and well, that's still our problem today. We're in a better position today than we were, because we have a process for making some kinds of these decisions, but for other kinds we're still limited to the "post your idea, people comment in support or opposition, the discussion peters out and goes nowhere because there's no decision making process besides unanimous consent, which we're never going to get" experience.
I personally think the standards process we've settled on has been very successful, and I think in large part that's because it roughly follows the properties I laid out at the start of this post:
- An ambiguous, but ambitious mission.
- Broad powers to guide the ecosystem in pursuit of that mission.
- An assumption of good faith that the decision making body will generally be doing the right things, and applying their power reasonably.
- An escape hatch in case the above isn't true.
At the time, bringing all of these projects together and getting any agreement on standards was an ambiguous, and ambitious mission.
The powers of the two PEP delegates are also very broadly defined here: we have "Anything to do with PyPI or the Repository API" and "Anything to do with interoperability standards". We never sat down and tried to narrowly enumerate what @pf_moore's or my powers were in those roles, or even exactly what either of us would attempt to accomplish by our decisions in those roles. There was just an assumption that we were both reasonable people who cared about this ecosystem, and we would generally do our best to make good decisions. Ultimately though, we've always had an escape hatch in case that wasn't true: projects could refuse to implement something, or we could get the standing delegation removed.
Even now, there's nothing stopping us from just saying that these broader, non-standards questions that we've run into fall under one of the existing delegations, except that the people involved are all reasonable people who generally agree that that is further than the general consensus of what these roles are currently, so we don't do it.
There is an important, but subtle bit of power dynamic at play here too. It's not a mistake that I am the standing delegate for PyPI PEPs and Paul is the standing delegate for packaging standards PEPs. I'm one of the core developers for PyPI (and for a time, I was the core developer for PyPI) while Paul is one of the core developers for pip. A lot of the implicit power that resides in our current decision making process stems from that. Chances are really high that both PyPI and pip are going to implement and/or enforce whatever decisions we make, because in most cases the people making those decisions are also in the position to enforce them.
This bit of subtle power dynamic I think is a big part of why the system we have was able to exist and actually work (remember, there were other packaging PEPs that largely got ignored prior to all of this, and I think that's in part because those PEPs lacked this power dynamic), and I think that it's important that we recognize that, and try to extend that to whatever new system we come up with (if we do in fact come up with something).
But I also think there is a bit of a double edged sword here with that, because I think we need to be careful that we don't create a world where people have extra incentives to avoid making hard decisions. To use myself as an example, if someone came up with a PEP that completely replaced the idea of PyPI with something else, that is something where I would be incentivized to say no, even if saying yes is the right thing, for a variety of very human reasons.
That was a lot of words, but hopefully it provides at least an interesting, if not useful perspective.
To reiterate from above, and provide some more detail, here's what I think we want:
- An elected council.
- With ambiguous goals and broad powers.
- The "buck stops here" owners for the "default tooling".
- A defined escape hatch.
I think that singular, long standing delegations work reasonably well for technical decisions, but they have problems with larger questions. We're somewhat limited to a direction that either Paul or I are generally OK with, even if we're wrong and the community thinks we should be going in a different direction.
Moving to an elected council means that ultimately the community is in charge of the direction we take packaging, through the nature of being able to elect people to the council who want to shape things in the way the electorate wants.
Ambiguous goals and broad powers mean that we don't have to try and flesh out all of the answers right now. I see questions like the following as the wrong level of detail for what we put into the governance:
- Should the default tooling integrate with X ecosystem (Linux, Conda, whatever) or even any ecosystem?
- Should we have a singular tool that does everything, or are individual tools and/or a focus on user choice the right approach?
- Do we create our own "platform" and do something like PyBI and start to push things so binary wheels prioritize targeting a PyBI-like platform?
- Which tools/projects make up the "default toolchain"?
If we could define those already, we wouldn't actually need this at all. The answers for those things may also change over time, and ambiguity here allows the council to evolve to meet the needs of people a decade from now.
In other words, all we really need to define is that we want this council to have the powers to decide on these kinds of things, and then let the voting and the council set the exact questions that need to be answered.
I think it is important that the council "owns" the default tooling, whether or not they are involved in the day to day maintenance of it. This is probably one of the "scarier" things in this, but I think that people tend to worry about the worst case possibilities too much. What does it mean for the council to own a project? It means that ultimately they can override decisions that the day to day maintainers make, including removing those maintainers.
Of course we live in the real world, and these tools are largely maintained by volunteers, so there are limits to what the council can dictate, but I don't think we need to actually sit here and try and come up with some sort of legal document or exhaustively document every possible way this fundamental contradiction might expose itself. The council are not employing these maintainers nor are these maintainers slaves to the council, so at the end of the day if the council wants something done, and a maintainer doesn't want to or can't do it, the council will have to find another way to make it happen, but the key thing is that they have the power to make it happen.
I don't want to pick on setuptools, because I don't think that the recent situation was exactly setuptools' fault, nor do I think the particular problem is nearly egregious enough to warrant something particularly aggressive from the council, but take the recent issue around name normalization as an example. If we had the council at that point, they could have simply written their own PR to setuptools to bring it in line with the standard as it existed, and merged it against the wishes of the setuptools maintainers.
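For anyone unfamiliar with the kind of standard at issue, here is a minimal sketch of project name normalization as defined in the PyPA name normalization specification (originally from PEP 503). The post doesn't say exactly which rule setuptools diverged from, so treating this rule as the relevant one is an assumption, and the function name is just illustrative.

```python
import re


def normalize_name(name: str) -> str:
    """Normalize a project name: collapse runs of "-", "_" and "."
    into a single "-", then lowercase the result."""
    return re.sub(r"[-_.]+", "-", name).lower()


# Different spellings of the same project compare equal once normalized.
print(normalize_name("Friendly_Bard") == normalize_name("friendly.bard"))
```

The point is only that the rule itself is tiny; the hard part in situations like this is getting every tool to agree to apply it.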
Of course, doing anything against the wishes of the maintainers risks angering, and maybe even losing, those maintainers, but I don't think that we need to spell that out, because I think the protection against that is the assumption of good faith on the part of the council, and the ability to remove them if they are violating that good faith.
I think that we can assume that anyone elected to this council will be aware of that, and we expect them to generally use this power rarely, if ever at all, but I think it is important that they have this power, because I think that's in part the power dynamic that the rest of their softer power flows from (much like how currently that implicit power exists through the nature of Paul's and my positions within pip and PyPI).
I think it would be reasonable to explicitly call out that we expect the council to operate by building consensus and incentivizing behavior rather than top down dictation wherever possible, so called "soft" power, and using their "hard" power sparingly, as a method of last resort.
In my mind, the hardest question we have to answer right now, because I think all the other really hard questions should be left up to the council, is who the voters are who vote on the council, because that is going to have a large impact on who has the power to set the direction of the ecosystem.
In theory I would love it if we had fully open elections, where anyone who cares enough can vote. Unfortunately in practice open elections on the internet don't really work, and you need some mechanism for defining a voter roll.
My second preference was that we use something like the PSF's contributing members, where people can self certify as meeting some requirements, but I'm told that managing that process for the PSF is a nightmare. Logically it also essentially boils down to open elections unless you have someone continuously verifying all of the certifications, since anyone can self certify, whether they actually meet the requirements or not.
Where I personally ended up landing (and this may not be the right thing either!) is something similar to the CPython Core Developer process, which is basically "an existing core dev nominates you, with a time limited vote amongst the existing members, and if a majority votes yes, then you're a voting member". I think we would be able to just explicitly say that anyone who "participates or is involved in building the packaging ecosystem" would be eligible, which we would define to include things like people who are core devs of packaging tools (whether official or not), people who participate in the packaging discussions on discourse, people who educate users about packaging tools, etc.
It's up in the air whether being a core dev for a "default" tool would imply automatic voting membership or not. I suspect it doesn't really matter, because I doubt we'd get many No votes for anyone who is actually involved, so it may be simpler to not imply automatic voting rights and have it just mean you're eligible to ask to be a voter.
Of course we would need an initial list of voters to seed the member list with, and I would say we can just use the existing list of PyPA committers for that, maybe with a period of time before the first election for non PyPA committers to get a chance to ask to be members and get voted on.
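To make the admission process above a bit more concrete, here is a rough sketch of how the voter roll could work. Everything in it is an assumption for illustration: the `VoterRoll` class, the member names, and in particular reading "a majority votes yes" as more than half of the ballots actually cast by existing members. The time limit on the vote is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class VoterRoll:
    """Hypothetical voter roll; the rules here are illustrative assumptions."""

    members: set[str] = field(default_factory=set)

    def admit(self, candidate: str, nominator: str, ballots: dict[str, bool]) -> bool:
        """Admit `candidate` if an existing member nominated them and a
        majority of the ballots cast by existing members are "yes"."""
        if nominator not in self.members:
            raise ValueError(f"{nominator} is not a voting member and cannot nominate")
        # Only ballots from existing members count.
        valid = [vote for voter, vote in ballots.items() if voter in self.members]
        yes = sum(valid)
        # Assumption: "majority" means more than half of the ballots cast.
        if valid and yes * 2 > len(valid):
            self.members.add(candidate)
            return True
        return False


# Seed the roll with an initial list (in the proposal, existing PyPA committers).
roll = VoterRoll(members={"alice", "bob", "carol"})
admitted = roll.admit(
    "dave",
    nominator="alice",
    # "mallory" is not a member, so that ballot is ignored.
    ballots={"alice": True, "bob": True, "carol": False, "mallory": True},
)
```

Whether the threshold should be a majority of ballots cast or of all members is exactly the kind of detail the real process would need to pin down.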
One last bit in this already way too long post, I think it may be a smart idea to go about this in a slightly different way.
Right now, the process for migrating from where we are to this hypothetical new world is essentially "we're going to redefine what the PyPA is".
I honestly think that's technically fine, but I think there may be some optics and political reasons to switch that around, and instead say "we're going to dissolve the PyPA, and create a packaging council". Projects currently under the PyPA would need to move to be their own projects, and if the council decides to define a default toolchain, they would need to either create it or work with the tools they want to adopt to own it.
This has a few nice benefits:
- We drop the historical baggage of the PyPA, particularly around the idea that the PyPA cares about PyPA tools and not other tools. What the default tooling is, and what it cares about, is TBD by the council.
- We don't blanket assume that every project that agreed to be a member of the PyPA under the current rules is OK with giving up some level of control to the new council.
- We also don't assume that every project we ever added to the PyPA should be part of, or owned by, the new packaging council. Some of them likely shouldn't be, but there are awkward conversations that would have to happen if the default was that all PyPA projects become council projects and the council wants to not adopt some of them.
However, it does have one big risk:
I think that the council will have to live in reality where there currently is a default toolchain, whether they agree with that toolchain or not, which means there are a set of tools that should exist in the council right from the start if we want this all to work (PyPI for instance is a big one, likely pip as well, probably need a build backend or two too).
The question then becomes, if we decide we want this, do we set up all of the machinery for electing the council, elect them, etc, and then risk having the de facto set of "default" tools decide not to accept council ownership? Or do we try and define what the "initial" de facto toolchain is, and make sure that those projects are OK with it as part of moving forward?
Personally I would suggest defining a minimal set of tools that make up the initial de facto toolchain (PyPI and pip almost certainly, and we would need to survey what build backends are being used but I would guess setuptools is still the most widely used one and makes sense, but that's just a guess). Make sure those tools are OK with being owned by the council, and then say everything else is TBD by the council.
Over time, if the council decides that those tools aren't the right tools to continue to be the default, then they will be able to set that direction, sponsor or adopt a different tool, deal with the process of how to migrate a community, and then ultimately retire the old tools (likely with the option for the existing maintainers to take over ownership outside of the council if they want to).
That was a lot, sorry! Hopefully folks found it interesting and/or useful!