Should there be a new standard for installing arbitrary data files?

This feature has been requested periodically. I think replacing the broken data_files feature would be a great idea.

Long ago virtualenv and/or setuptools/pkg_resources broke data_files more or less by accident. Maybe it wasn’t supported very well in the .egg format. And with virtualenv putting your files in different environments, there’s no longer a consistent root directory for your data_files, so your program can’t find them at runtime. Eventually you can be tempted to argue that the thing we broke by accident was a bad idea in the first place.

One version of packaging proposed that every file in a distribution could be relocatable. A manifest would map each individual file to its install location. In wheel we compromise by relocating whole categories of files, but we don’t record where each category was installed.

Suppose we allowed the packager to define their own wheel categories with variable substitution, and made something like the automake standard directory variables available. Then

"a" : "${docdir}/x" in a categories definition
+
package-1.0.data/a/README.txt in the wheel
could be installed to
<root of environment>/share/doc/package-1.0/x/README.txt

The packager has a way to express intent, and the person doing the installing still controls exactly where the files are installed.
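To make the idea concrete, here is a minimal Python sketch of how an installer might resolve such a category. Nothing here is an existing wheel or pip API: `standard_dirs`, `install_path`, the `/opt/env` root, and the exact set of directory variables are all hypothetical, and `docdir` includes the version only because the example above does.

```python
from pathlib import PurePosixPath
from string import Template


def standard_dirs(root, name, version):
    """Hypothetical automake-style directory variables for one environment.

    The installer, not the packager, decides what these expand to.
    """
    datarootdir = f"{root}/share"
    return {
        "prefix": root,
        "datarootdir": datarootdir,
        # Assumption: docdir carries name-version, as in the example above.
        "docdir": f"{datarootdir}/doc/{name}-{version}",
    }


def install_path(wheel_path, categories, dirs):
    """Map '<name>-<ver>.data/<category>/<rest>' to an install location."""
    parts = PurePosixPath(wheel_path).parts
    if len(parts) < 3 or not parts[0].endswith(".data"):
        raise ValueError(f"not a categorized data file: {wheel_path}")
    category, rest = parts[1], parts[2:]
    # Expand ${docdir} and friends in the packager's category definition.
    target = Template(categories[category]).substitute(dirs)
    return str(PurePosixPath(target, *rest))


# The packager's categories definition from the example above.
categories = {"a": "${docdir}/x"}
dirs = standard_dirs("/opt/env", "package", "1.0")
print(install_path("package-1.0.data/a/README.txt", categories, dirs))
# prints /opt/env/share/doc/package-1.0/x/README.txt
```

If the installer also wrote the expanded `dirs` mapping into the installed metadata, the program could look up where each category ended up at runtime, which is the part wheel doesn't record today.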

The usual objection is that the limited feature set of post-virtualenv Python packaging makes it easier to use free libraries. I think you’d get new kinds of software that would make up for it.
