Drawing a line to the scope of Python packaging

(Paul Moore) #61

That sounds right to me. It’s quite possible that one of the initial guides on packaging.python.org could give a more complete and even discussion of the various options available. However:

  1. Users in my experience don’t want unbiased lists of options, they want opinionated guides.
  2. Nobody reads guides when they start off anyway, so you’ll probably still be getting people who’ve made an initial decision.
  3. We’ll always need to ship some packaging tool with Python (if only to avoid the “how to install the packaging tool” problem), so that tool (whether it’s pip or some successor) will always have an advantage as the easiest option to start with.

Nevertheless, better explanation of the available options and their costs and benefits would definitely be good.


(Travis E Oliphant) #62

It is very unlikely for Anaconda to do this as this would confuse its customers and users. However, conda-forge could create a Python-installer that had conda-forge as the default. That would be cool to see.


(Travis E Oliphant) #63

I completely agree with this framing as well. @msarahan and @pf_moore describe what is needed well.


(Travis E Oliphant) #64

Thank you @steve.dower. This is interesting feedback and in line with what I have seen as well. The fact that “pip” is the easiest thing for the new person to reach for means that all they think about is using pip for install. This invariably means that pip will be pressured to be a general-purpose packaging solution (it will slouch towards it based on what appears to be just following what users want).

The problem is that there is a space for a user-level, cross-language package manager like conda and there always will be. This channel is about drawing a line to the scope of Python packaging. Will pip be used to package and install Python itself? Will pip be used to package and install Java? Julia? I believe the answer should be no.

Then, if that is the case, because Python is used to “glue together” so many other languages there must be language at packaging.python.org that helps people understand that you should not expect ‘pip install’ to be the only way to install every Python package. Perhaps it can be used to install the ‘python-parts’ of the package but some of the things that must be installed for the solution to work should be installed by other package managers.

If we can agree on that framing (or something similar and better articulated), then we can have a conversation. Right now, what I see is that people are making “pip installable” things that make it much more difficult to actually provide a working and reproducible environment using tools that were built for that purpose. I’m not sure why people are doing that rather than building packages with tools that let them install them, other than the branding of packaging.python.org and its apparent message that everything should be “pip installable”.

That won’t be able to provide what a user will expect until pip install can also install every other run-time that Python solutions glue together. For example, think about the pip install pyspark that happens right now. What does it do? Does it install Java (which is necessary for it to work)? It doesn’t as far as I can tell. Other Python packages are like this too and should be like this (they need previously installed things in order to work).

Is there a mechanism for pip to check for these previously installed things and raise an error or warning if they aren’t there? Perhaps that is a feature that could be added which would also implicitly help people understand the scope of “pip install”. All I’m suggesting is to do that plus a bit of modification to packaging.python.org in order to point to the efforts of other communities like NumFOCUS and conda-forge that are solving the general-purpose install problem.
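
A minimal sketch of what such a check might look like, written as a standalone Python function rather than as anything pip provides today. The names `missing_runtimes` and `warn_if_missing` are hypothetical, and looking only at `PATH` is a simplifying assumption (a Java runtime can also be located through `JAVA_HOME`, for instance).

```python
import shutil
import warnings


def missing_runtimes(executables):
    """Return the names in `executables` that cannot be found on PATH."""
    return [name for name in executables if shutil.which(name) is None]


def warn_if_missing(package, executables):
    """Warn about external runtimes that `package` needs but that are absent."""
    missing = missing_runtimes(executables)
    if missing:
        warnings.warn(
            f"{package} needs {', '.join(missing)}, which pip did not install; "
            "use another package manager to provide it."
        )
    return missing


# Hypothetical usage: pyspark needs a Java runtime that pip does not provide.
warn_if_missing("pyspark", ["java"])
```

A warning rather than an error keeps existing workflows running while still making the limits of “pip install” visible.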

Thanks for the feedback and help understanding other points of view. And just in case it’s not clear, I’m incredibly impressed and grateful for all the hard work that goes into the open-source and community-centric solutions that you are all providing. Please just take my recommendations as a particular point of view from the trenches.




How to help people migrating from pip to conda?
(Brett Cannon) #65

Or to tie in the analogy that @steve.dower brought up, conda works as a view on top of package versions to guarantee they all work together. So if you installed stuff with pip before that conda install tensorflow, then you would break assumptions conda makes about controlling all dependency versions to make sure they work with tensorflow (e.g. conda’s tensorflow might be tied to a different version of numpy than the one you installed, or even than anything available on PyPI).


(Travis E Oliphant) #66

I had to wait until I could post a response to this because of the 3-replies limit that the forum puts in place.

Yeah, I think that’s right. More complete and holistic messaging on packaging.python.org would be best. It doesn’t even have to be conda specific, but it could be one of the package managers mentioned.

And then, I’d like to see the conda-community and PyPA talk more in general. But that is mostly on the conda community at this point.

Thanks for pointing this out. I’m sorry if it came off more strongly than I meant it. Certainly it was not intended to be confrontational. It is true that I was challenging some assumptions people have (but I also welcome people challenging my assumptions). To be clear, it’s not just conda-forge that people can contribute to (brew, NixOS, Chocolatey, apt-get, rpm all have packaging communities that would help).

All of my comments, though, are meant in a spirit of conversation and collaboration. I really apologize if it didn’t come off that way. I also recognize people will ultimately have different use-cases and needs and therefore different results. This can be the beauty and robustness of community.

I also believe the PyPA has done an incredible job of improving things in the Python community. I only emphasize that it should continue to be very careful about defining standards and limiting the scope of its standards — especially when there are already other solutions to the problems being solved.

I recognize this is hard for a volunteer community because it is easier to recruit volunteers to do things they like to do (like write code that solves problems they specifically have or someone they know has). Intentionally gathering feedback from people you aren’t hearing from, and building roadmaps around personas, as many stakeholders as you can reach, and existing technology, is what product managers typically try to do. I’m supportive of efforts to fund product managers for open-source communities.

I definitely agree that this is not about an “us vs them” and if I sound like that, I am sorry. Both conda and pip have their uses and while there is overlap they cannot replace each other. In fact, I don’t think they should, but hope that the PyPA understands that little by little pip will need to become a general purpose package manager (thereby enabling people to replace conda entirely) unless it limits its scope.

My repeated suggestion is that some people currently using pip because that is what they are told to use by the PyPA would be better served by using a general-purpose packaging solution like conda (or spack or brew or yum or apt-get or …) and that is a useful thing for the PyPA to acknowledge on packaging.python.org.

Thanks for the feedback.


(Paul Moore) #67

I think the answer to that is clearly and self-evidently “no”. I don’t think anyone imagines otherwise - although it’s possible that not everyone draws from it the conclusion that you’re suggesting.

I think that’s a fair suggestion in isolation. But it still avoids the question of how tools work together. Suppose I’ve installed Python using the system package manager (on Windows, the python.org installer), used pipenv to set up my application development workflow, and as a consequence used pip to install several packages. If I now want to install something that “should be installed by another package manager” (which may be conda, but could be something else that sits in the same space as conda), we’re currently in a position where that’s not possible: the developer has to unwind all the way back to installing Python with conda, and then look for a conda-compatible equivalent to pipenv for their workflow. My contention is that doing so isn’t a practical option for the majority of people, and so we have 2 options:

  1. Tell those people that they can’t use the package that they are trying to use.
  2. Offer an option to use that package via pip (or any other toolchain that does work without that “rewind”) - probably with caveats that there may be integration issues, and those will be down to the user to address for themselves as their chosen toolchain doesn’t have the means to manage the issues automatically.

The problem I have is that at the moment we’re talking about helping people to make informed decisions right at the start of their projects, but glossing over the fact that the technical limitations of the tools mean that we’re expecting them to make difficult-to-change decisions before they have the information needed (specifically, what packages will they be using) to actually make those decisions.

IMO, advising new users to start with pip is only a problem if we don’t have a gradual migration path to tools that address more advanced issues. And the struggle with conda is that it doesn’t have that gradual migration path - so it looks like there’s a bias against conda, when in actual fact all there is, is an acknowledgement that new users may not need something as powerful as conda yet.

I’m afraid I think this is an issue that the conda community really need to solve themselves - how to provide a more gradual migration path for existing pip/pipenv/poetry/etc users. Once that path exists, I think that documenting and promoting it would be very easy to integrate into packaging.python.org.


(Brett Cannon) split this topic #68

3 posts were split to a new topic: How to help people migrating from pip to conda?


(Jeroen Demeyer) #69

Isn’t that the same with pip? It can also happen (without conda) that pip install --upgrade numpy upgrades numpy to a version that is no longer compatible with the version of scipy you had.


(Tzu-ping Chung) #70

The difference is pip actually knows something is broken, only chooses not to stop you (only emits a warning). I believe backwards compatibility is one of the motivations behind this behaviour. Conda cannot do the same due to lack of metadata compatibility.
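
For illustration, the dependency metadata that pip has to work with can be read using the standard library. This is a sketch, not pip’s own code: the helper name `declared_requirements` is hypothetical, and it only lists the declared version ranges without evaluating them. The `pip check` command is what actually compares those ranges against the installed versions.

```python
from importlib import metadata


def declared_requirements(dist_name):
    """Return the Requires-Dist strings an installed distribution declares.

    Returns an empty list when the distribution is not installed or
    declares no dependencies.
    """
    try:
        return metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return []


# Example: show the numpy range (if any) that an installed scipy declares.
for requirement in declared_requirements("scipy"):
    if requirement.lower().startswith("numpy"):
        print(requirement)
```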


(Steve Dower) #71

I think it’s more that conda knows it’s broken and stops you, while pip has no idea because the specific build isn’t pinned, only a version range. (But every time this comes up it gets split into its own thread, so I’m not going to say any more - go read one of the other threads.)


(Travis E Oliphant) #72

You make a lot of good points. I could see that if python.org and packaging.python.org made it clear that people were choosing a particular approach to installing Python when they download it from python.org and use only pip, that would be helpful.

I’m not sure it would convince all the package authors to not just tell people to “pip install” but perhaps what I should do is spend time convincing the “other ways to download and install Python” to also override “pip install” — given how prominent the notion of “pip install” is in every instruction set.

Although, even as I write it, I remember why we didn’t do that with conda (i.e. override pip): given that pip can be used for many workflows, I know it would be a recipe for maintenance nightmares. Of course, I suppose the replaced pip could just override the “install” command.

But how do people feel about the idea I’ve heard Nick and others promote of using “python -m pip install” as the proper spelling of “install this package”?
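
One commonly given reason for that spelling is that pip then runs under a specific, known interpreter instead of whichever `pip` script happens to be first on `PATH`. A small sketch of the same idea from inside Python; the helper names `pip_install_command` and `install` are hypothetical.

```python
import subprocess
import sys


def pip_install_command(package, python=sys.executable):
    """Build the argv for installing `package` with the pip that belongs to `python`."""
    return [python, "-m", "pip", "install", package]


def install(package):
    """Run pip under the interpreter that is executing this script."""
    subprocess.run(pip_install_command(package), check=True)
```

Passing a different `python` path targets another environment without relying on shell activation.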


(Brett Cannon) #73

"… because it’s better than pip install"? Or “and instead use ‘install this package’ to not over-promote pip”?


(Nathaniel J. Smith) #74

Please don’t. Ubuntu/Debian make some small tweaks to their version of pip compared to upstream, and it causes substantial confusion and problems, because users don’t know which version they’re using and when things go wrong no-one knows what’s going on or how to help. Replacing pip install entirely would be 10x worse.

Maybe it would be viable to have pip refuse to install into an environment that it knows is managed by some external package manager (like /usr on Linux, or a conda environment) unless some explicit override flag is passed?
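
A rough sketch of how such a refusal could be decided. It assumes two conventions: conda environments contain a `conda-meta` directory, and a system-managed Python carries an `EXTERNALLY-MANAGED` marker file in its standard-library directory (the convention described in PEP 668). The function names and the `override` parameter are hypothetical, not pip’s API.

```python
import os
import sys
import sysconfig


def external_manager(prefix=sys.prefix, stdlib_dir=None):
    """Guess which external package manager, if any, owns this environment."""
    if stdlib_dir is None:
        stdlib_dir = sysconfig.get_path("stdlib")
    if os.path.isdir(os.path.join(prefix, "conda-meta")):
        return "conda"
    if os.path.isfile(os.path.join(stdlib_dir, "EXTERNALLY-MANAGED")):
        return "system package manager"
    return None


def check_install_allowed(prefix=sys.prefix, stdlib_dir=None, override=False):
    """Raise unless the environment is unmanaged or the user explicitly overrides."""
    manager = external_manager(prefix, stdlib_dir)
    if manager is not None and not override:
        raise RuntimeError(
            f"{prefix} is managed by {manager}; "
            "pass an explicit override to install anyway"
        )
```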

Convincing package authors to change their instructions is a somewhat separate issue – the folks in this thread don’t have any direct control over what package authors put in their install instructions. And package authors are one of the audiences for whom pip has some substantial advantages over conda. Users who use pip install get exactly the package that the package author uploaded. As soon as they make a release, pip users immediately have access to their package. If users have problems, the package author can help them. If you want the users to try out a pre-release to see if it fixes their problem, that’s a heck of a lot easier when the user is using pip. If you add conda as an intermediary, this will in many cases make things better for the user, no question, but the benefits to the authors are much smaller and often negative. So… if you want to convince them to change their instructions, you might need to figure out how to fix that.


(Bernat Gabor) #75

I think we do. If we specify guidelines on https://packaging.python.org/ with a detailed explanation of why we suggest a given way, I think package maintainers will adopt and follow them.

Very dangerous because it would be backwards incompatible. For example, installing into such an environment from within Docker is fine.

I would champion pip having a mode that installs things the way pipx does (virtualenv/isolated/user level). That way, all Python tools (black/flake8/cookiecutter/etc.) could be recommended for installation in this mode, preserving the sanity of the global site-packages.
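
To make the idea concrete, here is a sketch of per-tool isolation using only the standard library `venv` module. It is not how pipx is implemented; the helper names `create_tool_env` and `env_python` and the one-directory-per-tool layout are assumptions for illustration.

```python
import sys
import venv
from pathlib import Path


def create_tool_env(tool, root, with_pip=True):
    """Create a dedicated virtual environment for one command-line tool."""
    env_dir = Path(root) / tool
    venv.create(env_dir, with_pip=with_pip)
    return env_dir


def env_python(env_dir):
    """Return the path of the interpreter inside a virtual environment."""
    if sys.platform == "win32":
        return Path(env_dir) / "Scripts" / "python.exe"
    return Path(env_dir) / "bin" / "python"
```

The tool itself would then be installed by running that environment’s interpreter with `-m pip install <tool>`, so nothing lands in the global site-packages.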