Python Packaging Strategy Discussion - Part 1

steve.dower · January 6, 2023, 11:32am

This is more-or-less what Conda is, but the prevailing opinion (or perhaps just the loudest in this forum?) is that it needs to be torn apart to become more like the current “figure-it-out-yourself” ecosystem.

Worth expanding this survey - perhaps the easiest way is to ask Anaconda, who clearly saw enough value in distributing R packages to start doing it, but will also know why they haven’t seen the same success as they have with Python.

I’m working on this, just as I made the installer for 2.7 happen, but this time I’m trying to work with the compiler team rather than playing chicken Right now, the easiest way to get the compilers is through CI, which is usually free enough for most open source projects, and supported projects (i.e. anyone backed by a group like NumFOCUS) can easily arrange for more.

<sarcasm begins>Of course, the easier way to do this is to force everyone to switch to the same compiler. If we pick one that we can redistribute on all platforms, it’ll make things even easier for users, as nobody will have to use their system compiler or libraries anymore - they’ll get an entire toolchain as part of CPython that’s only useful for CPython and doesn’t integrate with anyone else’s libraries! (I hope the sarcasm is coming through, but just in case it’s not, this sounds like a massively user-hostile idea. But if you want to try this approach… well… it’s any of the Linux distros or Conda/Nix/etc.)

What I think is really the issue is that we haven’t defined our audiences well, so we keep trying to build solutions that Just Work™ for groups of people who aren’t even trying to do the same thing, let alone using the same workflow or environment/platform. @pradyunsg’s list of possible unifications above is heading in a much more useful direction, IMHO, and the overall move towards smaller, independent libraries sets us up to recombine the functionality in ways that will serve users better, but we still need to properly define who the users are supposed to be.

One concrete example that some of us have been throwing around for years: the difference between an “system integrator” and an “end user”. That is, the person who chooses the set of package versions and assembles an environment, and the person who runs the Python command. They don’t have to be the same person, and they’re pretty clearly totally distinct jobs, but we always seem to act as if end users must act like system integrators (i.e. know how to install compilers, define constraints, resolve conflicts, execute shell commands) whether they want to or not. And then we rile against system integrators who are trying to offer their users a usable experience (e.g. Red Hat, Debian, Anaconda) for not forcing their users to also be system integrators.^[1]

But maybe we don’t have to solve that end user problem - maybe we can solve one step further upstream and do things that help the system integrators, and then be even clearer that “normal” users^[2] should use someone else’s tools/distro (or we start our own distro, which is generally what system integrators end up doing anyway, because “random binary wheel off the internet” isn’t actually useful to them).

To end with a single, discussion-worthy question, and bearing in mind that we don’t just set the technology but also the culture of Python packaging: should we be trying to make each Python user be their own system integrator, supporting the existing integrators, or become the sole integrator ourselves?

If they want to be, they can go ahead and build stuff from source. But that’s totally optional/totally unavoidable, depending on what problem you need to solve. Still, non-integrator users are generally within their rights to say “I can’t solve that problem without this software, can you provide it for me” to their boss/supplier/etc. ↩︎
Those who just want to run Python, and not also be system integrators ↩︎