How to predict the name of a wheel without actually generating it?

To bring wheel generation under CMake control, I would need a simple and safe way to predict the name of the wheel that will be generated by pip wheel.

Or, if no such tool exists, how difficult would it be for you maintainers to implement, say, pip wheel --dry-run?

The file names follow a clear convention: see "Binary distribution format" in the Python Packaging User Guide.
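For reference, the pattern is {distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl, and the packaging library (which pip vendors) can decompose it. A minimal sketch, using an arbitrary example filename:

```python
# Sketch: decomposing a wheel filename with the `packaging` library.
# The filename below is just an illustrative example.
from packaging.utils import parse_wheel_filename

name, version, build, tags = parse_wheel_filename(
    "numpy-1.26.4-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"
)
print(name)     # numpy
print(version)  # 1.26.4
# Compressed tag sets expand into one Tag per combination:
print(sorted(str(t) for t in tags))
# ['cp312-cp312-manylinux2014_x86_64', 'cp312-cp312-manylinux_2_17_x86_64']
```

Parsing an existing name is easy; the hard part, as discussed below, is predicting which tags a build will produce.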

Whether that is predictable depends on what exactly you are doing. For an arbitrary package it isn’t predictable, and I am not even sure pip could figure it out without an interface in the backend, which AFAIK doesn’t exist; i.e. a dry run would have to tell the backend to do a dry run, which isn’t possible.


Sad but plausible.

What if I am restricting my question to a single backend, namely setuptools?

I don’t think the answer changes. Maybe setuptools exposes some kind of legacy interface somewhere, but I doubt it.

If you are OK with restricting yourself to a specific set of packages/platforms, you can of course reconstruct the relevant parts of the process.

If it’s for your own wheel, then presumably you know what PyPI package name the wheel should use, and what platform you’re building it for (i.e. the wheel tags)… ?

We want to automate our build process. We know, of course, which wheels we are generating today. We also know the rules by which the names are supposed to change when the Python minor version is incremented. But we also know how brittle the Python world is. We are just afraid to hard-code a complicated name that may soon be broken by the next clever innovation.

Perhaps you could tell us more specifically what you are trying to do, because it’s hard to find a fitting solution with the scraps of information you’re giving us. So far I understand that there’s CMake and setuptools involved somewhere, and apparently pip wheel.


Agreed that more specific information would help, but we can be pretty confident that the format of the wheel name will not change (and also that most build backends should get it right; I suspect the only variations here will be capitalisation/normalisation of the package name).

What is going to vary are the values of the ABI and platform tags. The format does not change, but obviously cp314 is going to be in there one day (somewhat predictable) and potentially manylinux_2_99 (unpredictable).
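As a point of reference, the packaging library can enumerate every tag the running interpreter accepts, in priority order. A quick sketch (the tag shown in the comment is just an example of typical output):

```python
# Sketch: the tags the *running* interpreter will accept, best first.
# This is what installers consult; it is not what a backend will emit
# for any given package.
from packaging import tags

for tag in list(tags.sys_tags())[:5]:
    print(tag)  # e.g. cp312-cp312-manylinux_2_35_x86_64, ...
```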

The choice of these values is going to depend on the package being built, rather than the build backend. So you really have to multiply out by every potential package if you want to predict its name ahead of time. Anywhere in its build process it may determine how narrowly its platform has to be specified.

(As an aside, I could see the value in the builder being able to force backends to use the most specific platform tag they can - or more likely a completely custom one. But that’s not a thing that exists right now.)

Chances are your best bet is to just use wildcards, and either warn or fail if multiple files happen to match.
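A minimal sketch of that approach (the dist/ directory and the mypkg name are placeholders for your actual output directory and project):

```python
# Sketch: locate the built wheel by wildcard, fail on ambiguity.
import glob
import sys

matches = glob.glob("dist/mypkg-*.whl")
if len(matches) != 1:
    sys.exit(f"expected exactly one wheel, found {len(matches)}: {matches}")
wheel = matches[0]
print(wheel)
```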

Hold on - so the current state of things is that build backends just implement their own logic for that? Based on… just whatever?

Based on the specifications, of course :slightly_smiling_face:

Only the build backend can determine what compatibility tags are appropriate, for example, because whether a C extension is built into the wheel is controlled by backend-specific information.


But I mean, the backend is responsible for figuring out how widely or narrowly usable a given built C extension would be?

Who else could be? I’m not sure what you’re trying to suggest here. Some tool has to decide where a given wheel will work. What tool has a better chance of doing this than the one that built it?

Based on metadata provided by the author, of course. ABI3 vs. a specific version vs. a specific CPython version is something the code author has to know and ought to inform the build backend about.

There’s no reason to formalise or specify this any more than “it’s between the author and the backend, and here’s the format of the final wheel filename”.


Aha, this is what I’m getting at. I was worried that the backend was supposed to be somehow responsible for knowing how a given compiler works to the extent of “passing it these flags produces something that works on that architecture”.

So, presumably a solution for OP’s problem involves parsing that metadata? Although, of course, it isn’t standardized…

Yes, if it even is metadata and not some other form of inference. Which is why I suggested that the solution is to handle each package individually (if a prediction is needed) or use a wildcard.

What I’m trying to understand is, what other kinds of inference are actually possible here.

With the current restrictions enforced by PyPI, not a lot. Mostly they’ll come down to cross-compiling, where the version of Python being launched doesn’t necessarily match the version/architecture/platform that the package being built will target.

If you aren’t planning to publish to PyPI, your platform tag can be completely arbitrary, which opens it up to all sorts of things. You could embed a CUDA version in there, and provided you manage to add it to the list of tags that pip will choose to install (via some hacky patch that I’m sure Paul will completely disown :wink: ) you can then have additional compatibility constraints that nobody has ever thought of yet.
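Purely as an illustration of how arbitrary the platform field can be (the CUDA-flavoured tag below is made up, and nothing will install it by default):

```python
# Sketch: a syntactically valid but entirely invented platform tag.
# Nothing stops you from naming a wheel with it; pip just won't pick
# it unless that tag somehow ends up in its supported-tags list.
from packaging.tags import Tag, sys_tags

custom = Tag("cp312", "cp312", "manylinux_2_28_x86_64_cuda12")
print(custom)                     # cp312-cp312-manylinux_2_28_x86_64_cuda12
print(custom in set(sys_tags()))  # False on any normal system
```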


Well, currently it’s not even metadata. Whether setuptools builds an ABI3 extension, for example, is determined by the value of the py_limited_api argument to the Extension function. And it’s not clear to me how it determines a set of wheel tags from that, as a wheel could contain multiple extensions, built with different settings…
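For concreteness, here is roughly what that looks like in a setup.py. The module name, source file, and ABI baseline are placeholders, and getting the abi3 tag into the wheel filename has historically also required a bdist_wheel flag:

```python
# Sketch: opting a setuptools extension into the limited API.
# "demo._ext" and "src/ext.c" are placeholder names.
from setuptools import Extension, setup

setup(
    ext_modules=[
        Extension(
            "demo._ext",
            sources=["src/ext.c"],
            py_limited_api=True,  # tag the built extension as abi3
            # Restrict compilation to the stable ABI (3.8 baseline here):
            define_macros=[("Py_LIMITED_API", "0x03080000")],
        )
    ],
    # The wheel's filename tag is set at build time, e.g.:
    #   python setup.py bdist_wheel --py-limited-api=cp38
)
```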

Of course things could be different. But they aren’t, right now, and as far as I know no-one is working on anything like this.


These days I have the luxury of saying “even if you can get pip to work with this, uv will reject it” :slightly_smiling_face:
