If Python started moving more code out of the stdlib and into PyPI packages, what technical mechanisms could packaging use to ease that transition?

njs · May 25, 2019, 8:06am

I did some more reading on how Ruby approached this, and it turns out they actually have 4 different levels, not just 3:

1. Classic stdlib: code that ships as “part of” the interpreter, no package metadata
2. What they call “default gems”: treated like 3rd party packages that are installed by default (they have metadata, versions, can be upgraded), BUT there’s a flag set saying that you can’t remove them, and if you try to pip uninstall then it just errors out.
3. What they call “bundled gems”: ditto, but without the magic flag, so you can uninstall them if you want.
4. Actual 3rd party packages you get from their version of PyPI

I put these in order because there’s a progression here: code doesn’t jump straight from (1) to (3), it goes through (2) as an intermediate step.
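To make the differences between the levels concrete, here is a rough sketch of them as a Python enum with the upgrade/uninstall rules attached. The names (`Tier`, `can_upgrade`, `can_uninstall`, `uninstall`) are made up for illustration; this is not how RubyGems or pip actually implement it.

```python
from enum import Enum


class Tier(Enum):
    """Hypothetical model of the four levels, numbered as in the list above."""

    CLASSIC_STDLIB = 1  # ships as part of the interpreter, no package metadata
    DEFAULT = 2         # has metadata and versions, upgradable, not removable
    BUNDLED = 3         # like DEFAULT, but without the "can't remove" flag
    THIRD_PARTY = 4     # installed from the package index


def can_upgrade(tier: Tier) -> bool:
    # Anything with package metadata and a version can be upgraded.
    return tier is not Tier.CLASSIC_STDLIB


def can_uninstall(tier: Tier) -> bool:
    # Only packages without the "can't remove" flag may be uninstalled.
    return tier in (Tier.BUNDLED, Tier.THIRD_PARTY)


def uninstall(name: str, tier: Tier) -> str:
    # Assumed behaviour: a protected package makes the installer error out.
    if not can_uninstall(tier):
        raise PermissionError(f"{name} is {tier.name} and cannot be removed")
    return f"removed {name}"
```

The only thing separating (2) from (3) in this model is the answer from `can_uninstall`, which is why (2) works as a low-risk intermediate step.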

Agreed, and this seems especially sensible for any packages that we’re bundling. We don’t want to ship a package that doesn’t actually work.

That said, I’ve seen a lot of projects that run their tests against Python master, which is exactly what you’re suggesting just done by different people, and it turns out to be kinda difficult. The tests often break because Python intentionally broke them (e.g.), and it takes some time for downstream folks to sort things out to adapt. Right now, for stdlib packages, you just adapt immediately in the same commit. I’m not sure there’s any general solution here. If we go down this road then sometimes there will be some extra work to coordinate, and we’ll just have to figure out strategies to deal with that.

Sure, today. If we did split multiprocessing out into an independent library and told people that they can simply pip uninstall it, and that this is a supported configuration, then it would be nice to make it a bit safer :-).

Also, there’s a very long history of projects that try to do this using just heuristics; it’s a major part of what tools like PyInstaller try to do. They’ve all ended up growing huge piles of special cases. Letting individual projects state authoritatively which modules they do and don’t need seems like a more scalable approach to me.
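For comparison, here is roughly what the declared approach buys you. If every project states its requirements, working out what a given application needs is a plain graph walk with no guessing about imports. All of the package names and both helpers (`needed`, `removable`) are invented for this sketch.

```python
def needed(roots, declared):
    """Return everything reachable from roots via declared requirements.

    declared maps a package name to the names it says it requires.
    """
    seen = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(declared.get(name, ()))
    return seen


def removable(bundled, roots, declared):
    """Return the bundled packages that nothing in the closure of roots needs."""
    return sorted(set(bundled) - needed(roots, declared))


# Hypothetical metadata: the app needs json-lib, which in turn needs decimal-lib.
declared = {"app": ["json-lib"], "json-lib": ["decimal-lib"]}
bundled = ["json-lib", "decimal-lib", "tkinter-lib"]
print(removable(bundled, ["app"], declared))
```

A heuristic tool has to infer the `declared` mapping by scanning code, which is where the special cases come from.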