Twitter thread re: Big Picture

Thanks for posting the Twitter thread here. I, for one, am still not used to the discuss.python.org site and don’t frequent it enough. I suspect Peter is in the same position, though I no longer work with him directly at Anaconda, so I am not entirely certain.

I do know that we have interacted with many users of Python over several decades and have struggled through many packaging difficulties before developing conda as a general-purpose packaging solution (that we then used to package Python and R).

My main recommendation is that the PyPA should stop trying to make pip into a general package manager, which it must ultimately become if it is to support all uses of Python. The PyPA should re-emphasize a more limited scope for pip than what is sometimes implied by users. The limitations of pip seemed to be communicated better in the past. I recall Nick Coghlan telling me that a user of pip is a “self-integrator”. When that term is explained, it is a reasonable description of the roles and responsibilities that someone using pip is taking on.

The problem I see is that most users of pip don’t realize that they are taking on the role of self-integrator and what the real consequences of that role are. They are also not provided guidance as to when they might want to use a more general (not language-specific) package manager to install their packages. In addition, people who build wheels often do things like vendor non-Python data into the package for initial convenience (which often causes later difficulties).

There are a few concrete things the PyPA could do:

  • strongly discourage “vendoring” of non-Python libraries, and give pip mechanisms to detect when needed libraries are not installed (thus treating the installation of those libraries as outside the scope of pip).
  • provide an interface for general-purpose package managers to integrate with pip metadata.
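
To make the first point concrete, here is a minimal sketch of what such a detection step might look like. It is only an illustration: pip has no hook like this today, the function names `missing_native_libraries` and `format_report` are hypothetical, and the list of required libraries is assumed to come from package metadata that does not currently exist. The only real API used is `ctypes.util.find_library` from the standard library.

```python
from ctypes.util import find_library
from typing import Callable, Iterable, List, Optional


def missing_native_libraries(
    required: Iterable[str],
    locate: Callable[[str], Optional[str]] = find_library,
) -> List[str]:
    """Return the names in `required` that cannot be located on this system.

    `required` is a hypothetical declaration of native dependencies,
    e.g. ["z", "ssl"]. `locate` defaults to ctypes.util.find_library,
    which returns None when it cannot find a library.
    """
    return [name for name in required if locate(name) is None]


def format_report(package: str, missing: List[str]) -> str:
    """Build a message that points the user to another package manager."""
    if not missing:
        return f"{package}: all native libraries found"
    libs = ", ".join(missing)
    return (
        f"{package}: missing native libraries ({libs}); "
        "install them with a system or general-purpose package manager"
    )


if __name__ == "__main__":
    # Hypothetical package name and dependency list, for illustration only.
    needed = ["z", "surely-not-a-real-library"]
    print(format_report("example-package", missing_native_libraries(needed)))
```

The design choice is that the check only reports what is missing and says where to get it, rather than trying to install anything, which keeps the installation of native libraries outside the scope of pip.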

There may be others as well. In general, users would benefit from a more limited scope for “pip install” and from the interfaces, solutions, and recommendations that would then emerge.

The main problem I see is that because the PyPA has such visibility in the world, its recommendations define what the majority of people do, often to their own detriment. One example is TensorFlow telling people to pip install wheels that “vendor” and install very particularly built binary libraries, which then cause problems with other packages the user tries to install.

The conda-forge community can provide many more examples of challenges that ultimately cannot be solved unless pip takes on the role of a general-purpose packaging solution rather than a language-specific package manager.
