Preventing unwanted attempts to build sdists

pf_moore · May 31, 2024, 2:22pm

Here’s a very simple proposal.

1. Add a new metadata field, let’s call it No-SDist-Build for want of a better name (it needs a better name, because that one sucks).
2. Expose that in the simple index, the same way that Requires-Python is.
3. Front ends will not select the sdist if that metadata item is present.
4. Front ends should have an option to ignore the metadata item and always allow sdists.

This is backward compatible (older frontends will see the sdist and use it, ignoring the new metadata).

We could give a helpful message by making the metadata item a string that can be presented to the user explaining why they need to opt in if they want to use the sdist.
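As a rough illustration of the proposal above, here is a minimal sketch of the filtering a front end might do. The `Candidate` type, the `no_sdist_build` attribute, and the `allow_sdist` parameter are hypothetical names made up for this example; no installer or index exposes them today, and wheel compatibility checks are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    filename: str
    is_sdist: bool
    # Hypothetical: value of the proposed metadata field as exposed by the
    # index, or None if the project did not set it.
    no_sdist_build: Optional[str] = None


def usable(candidates, allow_sdist=False):
    """Return (kept candidates, messages explaining each skipped sdist)."""
    kept, messages = [], []
    for c in candidates:
        if c.is_sdist and c.no_sdist_build is not None and not allow_sdist:
            # The field's string value doubles as the user-facing explanation.
            messages.append(f"{c.filename}: {c.no_sdist_build}")
            continue
        kept.append(c)
    return kept, messages


files = [
    Candidate("pkg-1.0-cp312-cp312-win_amd64.whl", is_sdist=False),
    Candidate(
        "pkg-1.0.tar.gz",
        is_sdist=True,
        no_sdist_build="Building needs a C compiler; opt in to use the sdist.",
    ),
]
kept, messages = usable(files)
```

An older front end that does not know the field would simply never run this check, which is the backward-compatibility property described above.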

rgommers · May 31, 2024, 2:57pm

I like it @pf_moore. That’s pretty close I think. Could I suggest an amendment to (4) and an extra (5):

4. […] Consider the name --allow-sdist for that option.
5. If the option from (4) is used, the front end must pass a config_settings entry with key allow-sdist and value True to the back end.

For (4): I’m aware UX decisions are preferred to not be in standards, but as usual it’s much better if tools choose a common name (the differences between pip and build are still painful today). So a recommendation would be useful.

For (5): this can be introduced in a backwards compatible way, since No-SDist-Build is an explicit opt-in, as long as packages take care not to start using it before the build backend they use has support for that key. It’s important to pass the “allow-sdist” decision by the user on, so the build can make decisions based on that like I described for numpy.
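To make (5) concrete, here is a minimal sketch of how a back end could read such a key. The `config_settings` dictionary is the one PEP 517 hooks already receive; the `allow-sdist` key and the `sdist_allowed` helper are assumptions for illustration only, and no back end currently recognises that key.

```python
def sdist_allowed(config_settings=None):
    """Return True if the (hypothetical) allow-sdist key was passed as true."""
    settings = config_settings or {}
    value = settings.get("allow-sdist", False)
    # Front ends may pass strings rather than booleans, so accept
    # True, "True" and "true" alike.
    return str(value).lower() == "true"


# What a back end hook might see in each case.
default_case = sdist_allowed(None)
opted_in = sdist_allowed({"allow-sdist": "True"})
```

A back end could call a helper like this at the top of its `build_wheel` hook and branch on the result.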

pf_moore · May 31, 2024, 4:04pm

I don’t like this, because for most projects, sdists will be allowed always. So it’s misleading in the general case.

I’m fine with a recommendation for a common name, but it should be very clear that it is only a recommendation, not a requirement. Put it in a “discussion” section rather than the “specification” section, and avoid using formal terms like SHOULD or MAY, and that would be OK with me. But I do think that we should continue to avoid dictating UI in standards.

Good luck getting all build backends to agree on a standard config setting. That’s a can of worms we’ve carefully avoided opening until now, and honestly, I’d advise keeping away from it if you want this proposal to get anywhere. But if you want to try, that’s fine - just be prepared for it to be a lot of work. For example, I don’t think that at the moment we even require build backends to ignore unrecognised keys, so --allow-sdist :all: could break builds.

IMO, we should keep the proposal as simple as we can get away with and a config setting doesn’t fit with that principle.

alex_Gaynor · May 31, 2024, 6:24pm

Is the expectation that individual projects would be responsible for choosing whether to set this, or would it be a global policy (e.g., “if you have a native extension module, this gets set”)?

I ask because as the maintainer of a very popular package with an extension module, I’m torn between two things:

1. We get a lot of bug reports that are basically “it failed to install, but the correct fix is to ignore the error message and go get a wheel”. These users plausibly would be helped by this. (Assuming the error message provides sufficient direction to guide them to upgrading pip or whatever.)
2. We get a lot of bug reports that are basically “it failed to install, but you’re on a weirdo platform so you have to actually debug the installation failure”. These users would be negatively impacted by this, since there is no official wheel for their platforms.

Presumably there are some number of users in bucket #2 who debug problems for themselves. What I don’t want to happen is to get more bucket #2 bug-filers, because they don’t understand the pip (or other installer) options on their platform.

rgommers · May 31, 2024, 8:54pm

I need to give this some thought and am out of time for the next few days due to work deadlines. At the moment I don’t yet see how helpful this can be without the backend knowing anything at all - deciding between the production and development defaults isn’t possible then I think. And having separate frontend and backend flags for effectively the same thing is not pretty:

pip install numpy --allow-sdist -Ccustom-backend-arg

pf_moore · May 31, 2024, 9:21pm

It’s set by the individual project - they add it in their pyproject.toml. If they don’t add it, nothing changes from how things work right now.

One problem I just thought of - if you add this new flag in (say) numpy 2.0, then pip install numpy on a platform with no wheels will see no numpy 2.0 sdist, and therefore select an older version which does have a sdist that isn’t flagged as “not for installing”. That’s almost certainly not what we’d want to happen.

On the other hand, we can’t simply fail because there’s no numpy 2.0 wheel and the sdist is marked as “do not install”, as there might be compatible wheels for an older numpy version, and the user would likely want those to be installed.
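The fallback problem can be reproduced with a toy resolver. This is only a sketch: the `Release` type, the `sdist_flagged` attribute, and the `pick` function are hypothetical, and the older version number is made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Release:
    version: tuple
    has_compatible_wheel: bool
    has_sdist: bool
    # Hypothetical: True if the sdist carries the proposed metadata field.
    sdist_flagged: bool = False


def pick(releases, allow_sdist=False):
    """Pick the newest release that has an installable file."""
    for r in sorted(releases, key=lambda r: r.version, reverse=True):
        sdist_ok = r.has_sdist and (allow_sdist or not r.sdist_flagged)
        if r.has_compatible_wheel or sdist_ok:
            return r.version
    return None


# A platform with no wheels at all: only the newest sdist is flagged.
releases = [
    Release((1, 26), has_compatible_wheel=False, has_sdist=True),
    Release((2, 0), has_compatible_wheel=False, has_sdist=True, sdist_flagged=True),
]
picked = pick(releases)
```

Here `picked` is the older release, because its unflagged sdist is the newest file the resolver is allowed to use. Failing outright at 2.0 instead would also skip older releases that have compatible wheels.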

ntessore · May 31, 2024, 10:06pm

Would it make sense to complement the explicit flag with a simple heuristic in pip such as “a package with binary wheels will not install an sdist unless asked to”? I can imagine this being accompanied by a clear message such as

Package xyz does not provide a binary package for your platform. To try and install from its source distribution, run pip install --some-flag.
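A minimal sketch of that heuristic, assuming a hypothetical `choose` function and a caller-supplied `wheel_is_compatible` check. This is not how pip behaves; `--some-flag` is the placeholder from the message above.

```python
def choose(filenames, wheel_is_compatible, allow_sdist=False):
    """Return (chosen filename or None, message or None) for one release."""
    wheels = [f for f in filenames if f.endswith(".whl")]
    sdists = [f for f in filenames if not f.endswith(".whl")]
    for w in wheels:
        if wheel_is_compatible(w):
            return w, None
    # Heuristic: a project that publishes wheels does not fall back to
    # its sdist unless the user asked for that.
    if sdists and (allow_sdist or not wheels):
        return sdists[0], None
    if sdists:
        return None, (
            "Package does not provide a binary package for your platform. "
            "To try and install from its source distribution, "
            "run pip install --some-flag."
        )
    return None, "No installable files found."


files = ["xyz-1.0-cp312-cp312-manylinux_2_17_x86_64.whl", "xyz-1.0.tar.gz"]
choice, message = choose(files, wheel_is_compatible=lambda w: False)
```

Pure-Python projects that ship only an sdist are unaffected, since the heuristic applies only when at least one wheel exists.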

pf_moore · May 31, 2024, 10:10pm

That was suggested on the original pip issue. I don’t recall the conclusion we came to, but I suggest you read the discussions on there to get the background.

kknechtel · June 1, 2024, 2:23am

Discourage-Difficult-Build?

--attempt-difficult-builds?

I’m not sure I see why the reason would meaningfully differ from project to project. Or at least, if it does, the problem is more complex than ought to be explained in a terminal window, so I’d rather have a URL for a documentation sub-page.

I’m not really following. Why are users in the first group running into the problem at all? Isn’t Pip supposed to prefer the wheel for them by default anyway?

mdrissi · June 1, 2024, 3:49am

Because the reasons why a wheel isn’t chosen can be difficult to follow. One small example is the macOS version. How do macOS version numbers work? Which versions are compatible with each other? Did you know that the major version numbering for macOS changed a few years ago? And that an environment variable on macOS can change which version is reported? So two Macs may have the same OS version installed but report inconsistent things because of a relatively buried setting. Details like this matter for wheel selection, and I’ve helped debug environments that had obscure settings misconfigured.

Also, detection of some of these settings depends on the pip/packaging version. Because macOS changed its versioning scheme, older pips that weren’t aware of the new scheme’s rules will consider some things incompatible that are actually fine. This is very reasonable of pip. Basic facts such as what a version number means and what the OS version is have surprising footguns.

groodt · June 1, 2024, 5:23am

Suggestions for the simple Metadata field.

Name bike sheds:

Requires-Sdist-Optin
Enable-Sdist-Build

I wonder if we should consider marker / tags evaluation on the values as well as regular True/False? e.g. sys_platform == 'linux'

I think it can be rolled out to new front ends in a compatible way?
Add the behaviour with a default value of True, but print a clear warning that the behaviour will change.

Then after 6 months, flip the default to False in new frontend versions.

alex_Gaynor · June 1, 2024, 5:04pm

Running an old version of pip is by far the most common reason.

kknechtel · June 1, 2024, 5:12pm

… Just how old would Pip have to be for that to happen?

alex_Gaynor · June 1, 2024, 8:25pm

It depends. Support for different wheel tags has been added at different times – e.g., a musllinux wheel requires a relatively recent pip.

ncoghlan · June 28, 2024, 2:33am

This topic came up again in the thread discussing build variants, and was split out to a new thread here: Provide a way to signal an sdist isn't meant to be built?

barry · June 29, 2024, 12:30am

That’s a lot of broken packages with pip install --only-binary :all:. I wonder if they get many bug reports because of that. These numbers also have interesting relevance for that being the default.