Nobody is following the metadata_directory promise in PEP 517

While exploring the possibility of adding an interface to prepare_metadata_for_build_wheel in pypa/build, it was raised that no (as in none at all) PEP 517 backend actually fulfills the promise made in PEP 517:

build_wheel
If the build frontend has previously called prepare_metadata_for_build_wheel and depends on the wheel resulting from this call to have metadata matching this earlier call, then it should provide the path to the created .dist-info directory as the metadata_directory argument. If this argument is provided, then build_wheel MUST produce a wheel with identical metadata. The directory passed in by the build frontend MUST be identical to the directory created by prepare_metadata_for_build_wheel, including any unrecognized files it created.

Instead, all major PEP 517 backends ignore metadata_directory, regenerate the metadata from scratch in build_wheel, and make no effort whatsoever to verify that the generated metadata is identical.

The issue has been raised with setuptools, flit, and poetry. The Poetry devs have not responded to the issue (opened in May 2019). Setuptools actively generates non-identical metadata. Flit is the only project that has responded, and it does have a point in not using the argument: it is easier to always regenerate, and since flit relies solely on static metadata, it does not expect anything to go wrong except in extremely advanced and niche usages (a custom backend extending Flit’s PEP 517 interface), so always verifying the output feels wasteful.

So now we have a rule that nobody is following, and people using the interface (frontends and backend extensions) are left in a bad place. How can we improve the situation?

Popularise the minor backends instead? :grin:

Seriously, it was one line. I don’t know what to do if they don’t want to type it. pymsbuild/_build.py at 2c31968d4576a388701f50e8117187afef767d37 · zooba/pymsbuild · GitHub

I consider these backend bugs that should be fixed by the maintainers. The frontends are free to raise an error and refuse to handle such backends.

I agree with just calling these backend bugs.

Frontends should work on the assumption that the backend follows the spec, and if that causes an issue, direct the user to yell at the backend. At a minimum, every backend should be able to compare the two sets of metadata and fail if they differ, so it’s not like this is complicated to implement. (Flit argues that it’s an unnecessary cost, I’d say that it’s only unnecessary as long as no-one can tell you’re not doing it :wink:).

Edit: Note that I don’t have a problem with backend bugs not getting fixed promptly. It’s a fairly rare edge case, and all of these projects are volunteer based, so prioritising more significant issues is entirely reasonable. But that doesn’t mean it’s not a bug…
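
For what it’s worth, here is a minimal sketch of the “compare the two sets of metadata and fail if they differ” check a backend could run inside build_wheel. The function names are hypothetical and not taken from any existing backend, and skipping RECORD is my assumption, based on PEP 517 allowing the prepared directory to omit it.

```python
from pathlib import Path


def _file_listing(root, ignore):
    """Map relative POSIX path -> file bytes for every file under root."""
    return {
        path.relative_to(root).as_posix(): path.read_bytes()
        for path in root.rglob("*")
        if path.is_file() and path.name not in ignore
    }


# Hypothetical helper: not part of PEP 517 or of any real backend.
def check_metadata_matches(prepared_dir, regenerated_dir, ignore=("RECORD",)):
    """Raise ValueError if the two .dist-info directories differ."""
    prepared = _file_listing(Path(prepared_dir), ignore)
    regenerated = _file_listing(Path(regenerated_dir), ignore)
    # A file missing on one side shows up as None and counts as a difference.
    differing = sorted(
        name
        for name in prepared.keys() | regenerated.keys()
        if prepared.get(name) != regenerated.get(name)
    )
    if differing:
        raise ValueError(
            "metadata differs from the prepared copy: " + ", ".join(differing)
        )
```

A backend would call this only when metadata_directory is not None, right after regenerating its own .dist-info directory and before writing the wheel.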

So it sounds like pypa/build (as a frontend) should stick to the standard and look for backends to implement the expected behaviour in the future. I’ll bring this back to the discussion there, thanks.

Out of curiosity, what was the pypa/build issue that triggered the question?

TL;DR: me not knowing whether the metadata hook is an optional call for the frontend or not, and not seeing it in the build API. The thread was linked above by @bernatgabor; most of my chatter there is not related to the issue at hand.

Start from this comment instead: Support for metadata hook · Issue #130 · pypa/build · GitHub

It sounds like flit is following the spec? The spec doesn’t say the backend is required to actively check for equivalence, it just says that the output has to be equivalent, however that’s accomplished.

Otoh the frontend would be entirely within its rights to check and raise an error if there’s a difference. And the backend is free to check too, if they want. Might be a good idea for setuptools too, since it’s so hard for setuptools to know what user code and plugins are doing. The reason backends get the metadata_directory argument is to maximize their flexibility for implementing this rule, by reusing the old metadata, comparing against it, etc.
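
As a rough illustration of the frontend-side check, something like the following would do. It is only a sketch: the function name is made up, it assumes the wheel stores its metadata under a directory with the same name as the prepared .dist-info directory, and it skips RECORD because the prepared directory need not contain one.

```python
import zipfile
from pathlib import Path


# Hypothetical helper: not an API of pypa/build, pip, or any other frontend.
def metadata_mismatches(wheel_path, metadata_directory):
    """Return wheel member names whose content differs from the prepared copy."""
    prepared = Path(metadata_directory)
    mismatches = []
    with zipfile.ZipFile(wheel_path) as wheel:
        members = set(wheel.namelist())
        for path in sorted(prepared.rglob("*")):
            if not path.is_file() or path.name == "RECORD":
                continue
            # Assumption: same .dist-info directory name inside the wheel.
            member = f"{prepared.name}/{path.relative_to(prepared).as_posix()}"
            if member not in members or wheel.read(member) != path.read_bytes():
                mismatches.append(member)
    return mismatches
```

A frontend that gets a non-empty list back can then raise the friendlier “please report this to the backend” error instead of carrying on with inconsistent metadata.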

Agreed. “The way that the backend works means that there can’t be a difference” is an entirely valid approach.

Yes, but the frontend is also entirely within its rights to assume the backend follows the spec, and not check. That’s basically the point of a constraint like this: it allows frontends to skip checking that the backend did what it is required to do.

Yes, the frontend can check, and if they do, and find a discrepancy, they can give a friendlier error - but the error is still “the backend has a bug and I can’t proceed, please file a bug with the backend”…

Yes, the requirement is OK, but the solution is ‘generate the same metadata every time’ not ‘copy the passed in folder because the metadata might be different each time’.

Well, that’s an implementation detail. For some backends it might be cheaper to compare and generate a delta than to regenerate the metadata (e.g. where you need to compile some binary to find out the records :thinking:).
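
For the reuse option, a backend could stage the frontend-provided directory directly instead of regenerating it. A rough sketch follows; `stage_dist_info` and the `generate` callback are hypothetical names, and the rest of the wheel build (packing files, writing RECORD) is left out.

```python
import shutil
from pathlib import Path


# Hypothetical helper: `generate` stands in for whatever the backend normally
# does to create a fresh .dist-info directory inside the staging area.
def stage_dist_info(staging_dir, generate, metadata_directory=None):
    """Place a .dist-info directory in staging_dir and return its path."""
    staging = Path(staging_dir)
    if metadata_directory is not None:
        # Reuse the prepared metadata byte for byte, including any
        # unrecognized files, so the wheel cannot drift from it.
        source = Path(metadata_directory)
        target = staging / source.name
        shutil.copytree(source, target)
        return target
    # No earlier prepare_metadata_for_build_wheel call: generate from scratch.
    return Path(generate(staging))
```

Whether this beats regenerating and comparing depends on how expensive metadata generation is for the backend in question.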