Yeah, I didn’t want to go too deep into this part, since it really is the topic and not the background. But now this is a separate post…
I think this is where the issue just becomes worse for Conda compared to distros, in part because of the scenarios, but also because the Linux distros have partially mitigated things.
Basically, it starts off worse because Conda’s scenarios are “packages that are hard to compile in isolation”, while the Linux distros’ scenarios tend to be “packages that we need for other tools in our repo”. Notably, the latter implies that users shouldn’t know or care about those packages, whereas Conda is providing packages specifically to satisfy a user’s request, so it has a stronger obligation to provide everything. If Conda’s users were more “just give me a working environment and I don’t care what’s in there” then they’d be less demanding, and would run into fewer issues trying to force things into a particular shape.
The other aspect is that Linux distros already separate out their own packages from user-managed ones, whether with user site packages, dist-packages, or some other way. They then have ways to ignore the ones the user provided (e.g. python -s to ignore user site packages) when running their apps that rely solely on packages provided by their repository. (Personally I’d have set this up with a dedicated directory for these packages that is not on sys.path by default, and manually add it in when running a script, e.g. with PYTHONPATH, or some directory relative to the script that is automatically detected by the runtime.)
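To make the "dedicated directory" idea a bit more concrete, here is a rough sketch of what a distro-provided script could do at startup. This is not how any distro actually does it; the directory name _distro_packages and both helper functions are made up for illustration.

```python
import sys
from pathlib import Path

# Hypothetical directory name; nothing in CPython or any distro defines this.
PRIVATE_DIR_NAME = "_distro_packages"


def private_package_dir(script_file):
    """Return the private package directory that sits next to a script."""
    return Path(script_file).resolve().parent / PRIVATE_DIR_NAME


def path_with_private_packages(script_file, search_path):
    """Return a copy of search_path with the private directory first.

    The directory is never on the default path; only a script that calls
    this opts in, so user-installed packages cannot shadow it by accident.
    """
    private = str(private_package_dir(script_file))
    return [private] + [entry for entry in search_path if entry != private]


def user_site_enabled():
    """Report whether the user site directory is enabled for this process.

    sys.flags.no_user_site is set when Python is started with -s.
    """
    return not sys.flags.no_user_site


if __name__ == "__main__":
    sys.path[:] = path_with_private_packages(__file__, sys.path)
    print("user site enabled:", user_site_enabled())
    print("first path entry:", sys.path[0])
```

Run with python -s, the script would see only the private directory plus the interpreter's own paths, which is roughly the isolation described above.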
Conda is not this scenario at all. All of its packages are for the user to use, so there’s no sensible separation of concerns. Anything installed by Conda somehow needs to merge with anything installed by pip to form a single environment providing everything the user needs. (Maybe this is going back too far, but if you recall bdist_exe, then anything installed by one of those needed to merge with anything added by easy_install in much the same way.)
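As an illustration of what "merging" means in practice, a single environment can end up holding distributions recorded by different installers. A minimal sketch for listing them, assuming each installer writes the standard INSTALLER file into the package's metadata directory (pip does; I'm assuming Conda-built packages usually record "conda" there, but that depends on how the package was built). The helper names are hypothetical.

```python
from collections import defaultdict
from importlib import metadata


def installer_of(dist):
    """Return the recorded installer name, or "unknown" if none was recorded."""
    text = dist.read_text("INSTALLER")
    return text.strip() if text else "unknown"


def group_by_installer(dists=None):
    """Group distribution names by the tool that recorded installing them.

    With no argument, inspects every distribution visible on sys.path.
    """
    if dists is None:
        dists = metadata.distributions()
    groups = defaultdict(list)
    for dist in dists:
        groups[installer_of(dist)].append(dist.metadata["Name"])
    return dict(groups)


if __name__ == "__main__":
    for installer, names in sorted(group_by_installer().items()):
        print(f"{installer}: {len(names)} distributions")
```

This only reports who installed what. It says nothing about which tool should win when both want a different version of the same package, and that is exactly the part nobody has specified.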
Right now, to my knowledge, there’s no equivalent scenario out there. Conda+pip is unusual, because nobody else is really trying to automatically merge two different package repositories like this. The standard is very much to use a single tool for a single repository for each “project”/“environment”, because this is far more manageable. Conda-Forge makes that single-repository approach feasible for many more Conda users, because now they have a repository that is much more likely to provide the packages they need without having to resort to merging two different repositories.
What I’ve not seen written down before is an actual spec from the Conda side of how they would like pip to behave in the presence of conflicts when merging repositories. I’m fairly certain such a spec could be implemented without any actual special cases for Conda, but without spelling out what behaviour they want, none of us have a chance of providing it. (And it gives something concrete for devs to argue over and push back on.)
Even more importantly, without that spec, none of us have a chance of communicating to users what to expect. Some users expect to “conda create” an environment and have pip handle everything, while others expect conda to handle everything. There’s no clear line, which means there will always be users whose expectations cannot be met, and so there’ll be complaints.
tl;dr: pip/Conda is the hardest interop problem we - and likely anyone - face in packaging, in large part because nobody has defined how they should interact.