Can vendoring dependencies in a build be officially supported?

sirosen · December 22, 2023, 7:54pm

I think we should note that library authors can publish new, distinct, “v2” packages, to allow users to install their mutually incompatible versions side by side. It’s relatively rare though.

To stick with the pydantic example from above, in which v1 and v2 are incompatible, imagine the following (fanciful) future:

pydantic reserves all pydantic* package names
pydantic declares itself “rename safe”, meaning all internal imports are relative and no features rely on explicit module names, etc
by default, users installing it get a package named pydantic
a user can install pydantic<2;as_name=pydantic1, which installs the package under the name pydantic1

This would establish a future in which it is possible to install the same package multiple times under different names. It’s interesting to think about and play with as an idea.
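To make "rename safe" a bit more concrete, here is a minimal sketch of loading one package under two different top-level names with the standard library's importlib. Nothing here is an installer feature: the `demo_v*` package, the `import_as` helper, and the names `demo1`/`demo2` are all hypothetical, and the sketch only works because the toy package uses relative imports exclusively.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path


def make_demo_package(root: Path, version: str) -> Path:
    """Write a tiny hypothetical 'rename safe' package (relative imports only)."""
    pkg = root / f"demo_v{version}"
    pkg.mkdir()
    (pkg / "_version.py").write_text(f"VERSION = {version!r}\n")
    (pkg / "__init__.py").write_text("from ._version import VERSION\n")
    return pkg


def import_as(pkg_dir: Path, name: str):
    """Load the package found at pkg_dir under an arbitrary top-level name."""
    spec = importlib.util.spec_from_file_location(
        name,
        pkg_dir / "__init__.py",
        submodule_search_locations=[str(pkg_dir)],
    )
    module = importlib.util.module_from_spec(spec)
    # Register before executing so the package's relative imports can resolve.
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    demo1 = import_as(make_demo_package(root, "1"), "demo1")
    demo2 = import_as(make_demo_package(root, "2"), "demo2")
    versions = (demo1.VERSION, demo2.VERSION)
    print(versions)
```

A package that does `import demo` internally, or that relies on its own module name (for pickling, entry points, and so on), would break under this kind of aliasing, which is why the upstream "rename safe" declaration matters.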

Is it a good idea? Does it solve the same problems as vendoring? To both, my answer is no. Probably it’s not a good idea at all. It works wonders for applications trying to use direct dependencies which their dependencies also use. It does little for library developers who want to be mutually compatible with one another, unless they are lucky or agree upon conventions in how they use it.

Renaming done downstream, by the consumer, has very different properties from renaming done upstream as a maintainer strategy.

Maybe there’s some useful kernel of an idea here. Renaming your package in a major version has benefits for the downstream consumers, but it’s seldom done even by the most mainstream Python packages with the biggest impact. Why is that? Names are sticky, and renaming a package also requires maintainers to revisit all sorts of infrastructure (e.g. publishing pipelines). Should we work to better support and more strongly encourage a package publishing under different names for different versions?

As mentioned, the problems here are a mix of our technical constraints and the culture of Python developers.

In my own libraries, I avoid dependencies as much as I can justify. To a degree that’s healthy – avoiding unnecessary externalities and liabilities – but I think it’s currently necessary to a harmful degree. For example, imagine the ecosystem impact if one popular package, e.g. flask, internally used another popular package on a specific major version range, e.g. pydantic>1. In practice, this means that a library developer has to be very cautious about pulling in dependencies, even in cases where an application developer would very definitely choose to include the dependency.
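As a rough illustration of why that caution is needed, the sketch below models each project as the set of major versions it accepts for one shared dependency. The project names and ranges are made up for the example and are not real package metadata; a real resolver works on full version specifiers, not just majors.

```python
from functools import reduce

# Hypothetical metadata: the major versions of one shared dependency that
# each installed project accepts. None of these ranges are real.
ACCEPTED_MAJORS = {
    "web_framework": {2, 3},   # e.g. "shared>1"
    "legacy_plugin": {1},      # e.g. "shared<2"
    "application": {1, 2, 3},  # the app itself has no strong preference
}


def resolvable_majors(accepted: dict[str, set[int]]) -> set[int]:
    """Return the majors every project accepts (an empty set means conflict)."""
    return reduce(set.intersection, accepted.values())


conflict = resolvable_majors(ACCEPTED_MAJORS)
without_plugin = resolvable_majors(
    {name: majors for name, majors in ACCEPTED_MAJORS.items() if name != "legacy_plugin"}
)
print(conflict)        # empty: no single installed version satisfies everyone
print(without_plugin)  # the majors left once the conflicting project is removed
```

With only one copy of the shared dependency installable per environment, a single narrow pin in a widely used library is enough to empty the intersection for every application that also depends on something pinned the other way.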

All in all, vendoring is a nice fix for the cases which really demand it, but it’s not the same as the upstream package making a decision to try to tackle these problems. I’d like people to keep thinking about how to make the diamond application dependency cases and library dependency cases better, perhaps centered around ways that packages can better support this for their consumers.