A couple of recent threads raised the idea of considering where we hope Python packaging will be in 10 years[1]:
Unsurprisingly, I agree that this is the core of the issue. Personally I think it is the core of every packaging issue. So I wanted to pull together some of what a couple other people said on this topic and make a new thread where hopefully we can discuss this and its implications for packaging-related decisions in the present and future, but without it being perceived as an intrusion on more specific discussions. I’m very interested to hear what other people’s 10-year (or 20-year or whatever) visions are for Python packaging, or if they think it doesn’t even make sense to worry about such things.
Like I’ve said before, I think this big picture view is needed and that incremental progress through PEPs considered in isolation will likely not produce the kind of qualitative change that many Python users want. However, I do want to clarify one thing: I don’t mean that incremental changes in themselves are pointless. Rather, what I mean is that, if we ever want to solve the fragmentation problem, we must consider incremental changes in terms of how they move us toward where we want to be, not just how they move us away from where we currently are, and not just whether they move us to a situation that is slightly better than where we were before. What is important is not that changes be big vs. small or sweeping vs. incremental, but that they be coherently directed towards a target.[2] Of course, to do that we need to know where we want to be, and I’m hoping in this discussion we can share our views on that.
To begin with, @jeanas gave a concise vision in the same post:
My own vision is similar.
This also has some similarities to @johnthagen’s post several months ago which spawned a wide-ranging discussion about packaging “vision”. That was more of an extended “user story” than a bullet-point list, so I’ll just quote a small section here:
So this post similarly described a single tool that handled environment creation, package install, project management, publishing, etc. It did separate that hypothetical tool from another that would install Python versions, which differs from @jeanas’s (and my own) vision, in which a single tool would handle those tasks as well.
What interests me here are the conceptions people have of where we want to be in 10 (or however many) years, what is similar and different among them, and whether we can synthesize the opinions of multiple people into something that could become an overall goal for Python packaging (perhaps for the proposed Packaging Council).
I realize everyone is busy with concrete matters, and some perceive talking about this kind of overall trajectory as a waste of time. But what I would be grateful to hear from anyone and everyone involved in all these packaging discussions is:
- Do you agree with the above goals? What are areas of disagreement?
- Do you think it is valuable, when evaluating proposed changes to the packaging landscape, to consider how they do or do not move us toward such a situation? Why or why not?
For something more specific:
Not by a long shot, as conda has done this for many years. Conda currently can handle most of the tasks on the list, with the exception of managing projects[3] and lock files.[4] There are a couple gray areas (e.g., conda itself does not run the REPL, rather you install Python and use that to run the REPL, although you could in theory use `conda run` instead to get a REPL if you want).
As I’ve mentioned before, I see conda as much closer to my eventual vision of what Python packaging would look like than the PyPA packaging ecosystem[5], largely because it does combine so much functionality into a single tool. In previous discussions the main problems people seemed to have with conda were:
- it doesn’t use PyPI
- it relies on activating environments
- it “takes control” of the environment so cannot be used with a Python you get from somewhere else
As to #1, I’m very curious what substantive advantages people think PyPI has over alternative package repositories, other than the fact that it exists and a lot of people use it. As I’ve mentioned on other threads, I think many users (especially those who lament the state of Python packaging) have no particular attachment to PyPI and would be fine with something else as long as it provided the packages they need, and as long as the transition process wasn’t too arduous.
#2 is only partially true, as `conda run` allows running a program inside an environment without activating it. It’s fair to say this functionality has had some bugs, and may still have some, but it’s a long way from nothing.
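To make the point about running without activation concrete, here is a minimal Python sketch. It only assumes that conda is installed and that an environment exists under the name you pass in; the environment name `demo-env` and the helper names `conda_run_argv` and `run_in_env` are hypothetical, made up for illustration. It builds a `conda run -n <env> <command>` invocation and executes it only if a `conda` executable is found on PATH.

```python
import shlex
import shutil
import subprocess


def conda_run_argv(env_name, command):
    """Build the argv for running `command` inside a named conda environment.

    `env_name` is whatever environment you already have; nothing is activated
    in the calling shell.
    """
    return ["conda", "run", "-n", env_name, *command]


def run_in_env(env_name, command):
    """Run `command` in the environment, or return None if conda is not on PATH."""
    if shutil.which("conda") is None:
        return None
    return subprocess.run(
        conda_run_argv(env_name, command),
        capture_output=True,
        text=True,
        check=False,  # leave it to the caller to inspect returncode
    )


if __name__ == "__main__":
    # "demo-env" is a placeholder name, not a real environment.
    cmd = ["python", "-c", "import sys; print(sys.prefix)"]
    print(shlex.join(conda_run_argv("demo-env", cmd)))
    result = run_in_env("demo-env", cmd)
    if result is not None:
        print(result.returncode, result.stdout.strip())
```

If the environment exists, the printed prefix should be that environment's directory rather than the prefix of whatever Python launched the script, which is the sense in which activation is not required.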
#3 is maybe the most interesting to me since, as I’ve said repeatedly, I consider this an advantage, not a disadvantage. In my mind the only way we’re going to get to the world @jeanas described (in particular, “install Python and manage Python versions”) is if all Python use happens via a singular tool[6]. The fact that Python can be launched from so many different launchpads, so to speak, is part of what makes it hard for users to navigate (e.g., the various snafus with Debian or with setting the “default Python” on Windows). It would be easier if there were a single way in, so that once you’re in, you know that all Python-related tasks will be performed in exactly the same manner.
So again, I’m very curious as to what people’s perspectives are on this. What does the “one tool” of the future look like? Is it similar to pip? Is it similar to conda? Is it different from everything we have now?
Is there an irreconcilable difference on point 3, between those who want to keep environment management “inside” a particular Python install and those who want to put Python inside the environment, or can the gap somehow be bridged? Are there actual features of PyPI that are desirable[7], or do we just want to use it because it’s called PyPI and that’s what we’ve been using? Are there other problems people have with the “conda way” of doing things? Are there ways to combine the best aspects of multiple worlds? What other desiderata do people have for a tool that would meet a broader subset (dare I say all?) of their needs than the existing ones?
On another thread, @pf_moore addressed different aspects of the future of packaging[8]:
This is an appealing vision too, where packaging is transparent. It reminds me of what we’ve seen in other kinds of software: there used to be a much sharper boundary between what happened on your local machine and a point where you had to “go to the internet” to get something. But nowadays a lot of software happily pulls from local and remote alike as needed, without the user having to control or even be aware of that. I haven’t thought about this much with regard to Python and I’m not sure what form it would take.
Paul also said:
Again I’m curious what others think about this. My own view is that it’s potentially compatible with @jeanas’s outline earlier.[9] It just seems to me that @pf_moore is describing a more high-level overview, in which the “one tool” @jeanas describes might be a particular “low-level implementation detail” that underpins the “workflow tools”. So perhaps these are two views of the same future landscape from two vantage points.
I’m not entirely sure, however, what it would be like to have a single “the workflow” which everyone uses the same way despite using different tools. It’s possible that such tools will inevitably diverge more substantially and in effect create distinct workflows. Even in the world of editors, although all perform essentially the same function, the differences can become relevant at times (e.g., a couple of mentions in the PEP 722 discussion about whether editors can block-comment an entire section).
I would love to hear from others about how they compare these two visions — along with their own preferred vision, of course. Is it necessary to have “one tool” at a low level to facilitate compatibility between a broader range of higher-level tools? Is it possible to get that level of seamless transition between tools solely via protocols? What would be the common elements of the “workflow” and which would differ among tools?
This brings us back to what I said at the beginning of this long post. No doubt we need to try things. My perspective is just that, before trying anything, we should evaluate it not only as a step in itself, but also based on how we foresee it fitting into a unified vision of the Python packaging landscape as a whole. Without doing that we risk improving things gradually and thinking progress is being made, yet not making contact with the deeper problems. It’s a bit like this: imagine I’m on a beach and I have some tools and materials to build things. Maybe I can build some superb bicycles that allow me to travel along the coast, perhaps exploring some peninsulas stretching out into the sea. But if the place I’m trying to get to is an island offshore, it doesn’t matter how good my bicycles are; I need to work on boats if I ever want to get there. So when I’m evaluating something I built or am considering building, the first and most important question is not “is this well-built” or “can I get somewhere with this” but “can this travel across water”.
Of course, whether you think you need to build a boat depends on whether you think the destination is on the same land mass or a different one. That’s why I really hope some others will share their own such visions on this thread, because whether we communicate those views or not, I have no doubt that they do implicitly influence our stances on each proposal that comes up.[10] It’s just easier to talk about these things if we know where everyone is coming from.
[1] The quotes I’ve included here came from other discussions, which in some cases means they have a little connective text at the beginning or end that may seem out of place, along the lines of “This is off-topic but…” or “but apart from that…”. I’ve tried to keep the main intent of the quotes clear.
[2] It’s my own fault if I’ve been misunderstood on this.
[3] I’m assuming this means something like Poetry where you can install a library and simultaneously update your pyproject.toml to list it among the dependencies.
[4] There is a library called conda-lock that supposedly manages lockfiles, although I haven’t explored it myself. With recent conda plugin developments it’s possible something like this could be integrated as a conda subcommand.
[5] basically the pip/venv combination, with PyPI as the package repository, together with something like pyenv for managing versions
[6] with the possible exception of things like embedded Python, which already often require steps beyond what an average user would ever contemplate (such as compiling your own Python)
[7] in the sense that, if we were now deciding from scratch between PyPI and an alternative repository architecture, we would choose PyPI because of those features
[8] I’ve only quoted a few excerpts here and the original post has more detail.
[9] To be clear, these two “visions” were in different threads, so they were not framed in direct response to one another.
[10] For instance, the current PyPA model assumes that environments are created by Python (e.g., with venv) and packages are installed by Python (via pip), which precludes managing the Python version as part of the environment. I see that as an ocean, or at least some very large body of water, which needs to be crossed, and this influences my tendency to see any proposal that continues to build on that model as something that won’t hold up in the long run. I gather that other people don’t see that as such a big problem, and so they are more comfortable with proposals that maintain that assumption.