Are multiple .dist-info or .data directories allowed in a wheel?

Is it permitted for a wheel to contain multiple top-level directories that have a .dist-info extension? What if some of them are clearly not actual .dist-info directories, due to not being in the format {field1}-{field2}.dist-info (such as foo.dist-info, foo-1.0-bar.dist-info, or even just .dist-info with nothing before the extension)? How much of this carries over to .data directories as well?

This pip issue asserts that multiple .dist-info directories are not allowed due to a statement in PEP 376, though arguably the citation still does not prohibit extra .dist-info directories with names like foo.dist-info.

The reason I ask is that it would allow me to considerably simplify some code if I could assume that any top-level directory in a wheel with a .dist-info or .data extension was the .dist-info or .data directory, yet PEP 427 does not clearly address this situation. What do the experts say, and do we need to update the PEP?

I’d say that multiple .dist-info directories are not allowed. As you say, this isn’t explicit in the PEP, but I’m pretty sure it’s assumed in various places (pip, as you point out, appears to be one). I also can’t think of any reasonable example of why anyone would want multiple .dist-info directories in one wheel.

I’d say that you would be fine assuming a single .dist-info directory - maybe document that your code makes that assumption, just for safety, but I think it’s fine to not support multiple .dist-info directories. I’d support a proposal to clarify PEP 427 to make it explicit.
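For what it’s worth, the single-directory assumption can be made explicit in code rather than only in documentation. Below is a minimal sketch, not anything taken from pip: the helper names (top_level_dirs, sole_dist_info, sole_dist_info_in_wheel) are made up for illustration, and it deliberately counts every top-level directory whose name ends in .dist-info, including oddly named ones like foo.dist-info.

```python
import zipfile


def top_level_dirs(names, suffix):
    """Return the sorted top-level directory names that end in `suffix`.

    `names` is an iterable of archive member paths, as returned by
    zipfile.ZipFile.namelist().
    """
    found = set()
    for name in names:
        head, sep, _ = name.partition("/")
        # Only members containing "/" live inside a top-level directory.
        if sep and head.endswith(suffix):
            found.add(head)
    return sorted(found)


def sole_dist_info(names):
    """Return the one .dist-info directory, or raise ValueError.

    Assumption (hypothetical policy): a wheel with zero or several
    top-level *.dist-info directories is rejected outright.
    """
    dirs = top_level_dirs(names, ".dist-info")
    if len(dirs) != 1:
        raise ValueError(
            f"expected exactly one .dist-info directory, found {dirs}"
        )
    return dirs[0]


def sole_dist_info_in_wheel(path):
    """Apply sole_dist_info() to the members of the wheel at `path`."""
    with zipfile.ZipFile(path) as zf:
        return sole_dist_info(zf.namelist())
```

Failing loudly here means a wheel that breaks the assumption gets reported instead of being silently misread.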

It seems that pip does not allow multiple .dist-info directories. I guess it is technically possible to allow multiple .dist-info directories in pip, but that would cause problems in other parts of the system (a wheel would then be able to override another package’s .dist-info).

Not allowing multiple top-level .dist-info directories is the “spirit of the law” IMO.

OK, so one .dist-info per wheel, but what about .data? Pip seems to allow multiple .data directories in a single wheel [1], and it treats them all the same [2] (with apparently no provisions against the resulting trees conflicting with each other on install). Would it be acceptable for my code to assume .data uniqueness (thus allowing me to simplify some stuff), or does my code now have to be more complicated in order to keep track of multiple .data trees?

Honestly, I can’t even name one well-known package that uses .data off the top of my head, so there’s not really enough data for me to justify a position either way. The same spirit of the law applies here IMO. I think it would be “safe” for you to assume the same, in the sense that we can work on fixing the laws if anyone ever raises it as a problem.


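You have to know the wheel’s filename to know which .dist-info directory you should be looking at, since that directory is named {distribution}-{version}.dist-info after the first two fields of the filename. However, I’m sure many tools, including ones I’ve written, will not work properly unless there is exactly one .dist-info.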
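To illustrate the filename-based lookup, here is a rough sketch. It relies on the PEP 427 filename layout ({distribution}-{version}, an optional build tag, then the python, abi and platform tags) and assumes the directory name matches the filename fields exactly, which real-world wheels with differently normalized names may not satisfy. The function names expected_dist_info and pick_dist_info are hypothetical.

```python
def expected_dist_info(wheel_filename):
    """Derive '{distribution}-{version}.dist-info' from a wheel filename."""
    if not wheel_filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {wheel_filename!r}")
    parts = wheel_filename[: -len(".whl")].split("-")
    # Five fields normally; six when an optional build tag follows the version.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected number of fields in {wheel_filename!r}")
    return f"{parts[0]}-{parts[1]}.dist-info"


def pick_dist_info(wheel_filename, names):
    """Return the .dist-info directory named after the wheel, if present.

    `names` is the list of archive member paths. Any other *.dist-info
    directories in the archive are simply ignored by this lookup.
    """
    expected = expected_dist_info(wheel_filename)
    if not any(name.startswith(expected + "/") for name in names):
        raise ValueError(f"{expected} not found in wheel")
    return expected
```

With this approach, extra .dist-info directories do not make the lookup ambiguous, although tools that scan for any .dist-info directory would still be confused by them.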

In the past, packages have installed extra metadata to work around the lack of an “X Provides: y” feature, so there are reasons you might need to do it.

The .data question is more straightforward. If there is a second .data, it should be installed as-is wherever the root of the wheel was installed. Such extra directories won’t confuse tools.
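Under that reading, only the .data directory named after the wheel gets special treatment, and any other directory ending in .data is ordinary content. A small sketch of that split follows; the function split_data_dirs and its name_version parameter (the "{distribution}-{version}" prefix of the wheel filename) are assumptions for illustration, not how pip does it.

```python
def split_data_dirs(names, name_version):
    """Split top-level *.data directories into (special, ordinary).

    `names` is the list of archive member paths and `name_version` is the
    "{distribution}-{version}" prefix of the wheel filename. Returns the
    special directory name (or None if absent) and a sorted list of the
    other *.data directories, which would be installed as-is.
    """
    special = f"{name_version}.data"
    # Collect every top-level directory that contains at least one member.
    tops = {name.partition("/")[0] for name in names if "/" in name}
    data_dirs = sorted(top for top in tops if top.endswith(".data"))
    ordinary = [d for d in data_dirs if d != special]
    return (special if special in data_dirs else None), ordinary
```

An installer following this sketch would never need to merge several .data trees, because at most one of them is interpreted.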