Continuing the discussion from Provide a way to signal an sdist isn't meant to be built?:
I tried to “do the work” and came up with a proposal. This is going to be long, because I’m trying to provide PEP-like levels of detail.
1. Gradual rollout for PEP 725 implementation
(For reference, that’s “Specifying external dependencies in pyproject.toml”.)
My idea is that in theory, maintainers will be following PEP 725 anyway. Rather than having project authors and maintainers make the choice about whether a project’s sdist is “too hard” for users to build (without knowing anything about those users), the goal is to give users (by default, and when using Pip interactively) the information up front about why the sdist might be hard to build (before it has a chance to fail).
Rather than immediately trying to solve the problem of turning dependency specifications like "virtual:compiler/c" or "pkg:generic/freetype" into an automatic lookup process, we should plan on just being able to map those specifications into friendly, human-readable descriptions for use within Pip etc. output (like “a C compiler” or “the freetype library (try installing it with your system package manager)”).
That lays the groundwork for installers to present detailed prompts, so the user can check manually whether the requirements are met before attempting to build the sdist. (Users who don’t understand this information can be advised not to try.)
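To make the mapping idea concrete, here is a minimal sketch. The table contents, the function names `describe` and `describe_all`, and the fallback wording are all illustrative assumptions; nothing here is defined by PEP 725 or implemented in Pip.

```python
# Hypothetical lookup table from PEP 725-style specifiers to friendly text.
# The entries and their wording are illustrative only.
FRIENDLY_NAMES = {
    "virtual:compiler/c": "a C compiler",
    "pkg:generic/freetype": (
        "the freetype library "
        "(try installing it with your system package manager)"
    ),
}


def describe(specifier: str) -> str:
    """Return a human-readable description for one external dependency."""
    try:
        return FRIENDLY_NAMES[specifier]
    except KeyError:
        # Unknown specifiers are shown verbatim rather than guessed at.
        return f"an external dependency identified as {specifier!r}"


def describe_all(specifiers: list[str]) -> list[str]:
    """Describe every specifier, preserving order."""
    return [describe(s) for s in specifiers]
```

The point of the fallback is that an installer can still show something useful for specifiers it has no friendly text for, instead of hiding them.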
2. New options for Pip
The existing options for influencing the choice between wheels and sdists are IMO rather confusing. Options like --only-binary and --no-binary and --prefer-binary seem like part of an enumeration of possible approaches, rather than orthogonal binary flags. They also don’t represent all the approaches that make sense, especially if the user is allowed to respond to new information during the process.
I seek to add one new approach for now, without ruling out the possibility of more alternatives being proposed in the future. Thus, I propose to add the following command line flag syntaxes, with an eye towards deprecating the aforementioned options:
--build-sdists STRATEGY
--build-sdist-for PACKAGES STRATEGY
Here, PACKAGES is a comma-separated list of package names, as currently used with the --no-binary and --only-binary options. (The :all: and :none: syntaxes seem unnecessary here and I would suggest not supporting them.)
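As a rough illustration of the proposed syntax, the two flags could be parsed as follows. This is a standalone sketch using the standard library's `argparse`; it does not reflect how Pip's own option parsing is implemented. The function name `parse_build_options` and the program name are hypothetical, and the strategy names are the ones listed just below.

```python
import argparse

# Strategy names as proposed in this post.
STRATEGIES = (
    "never", "always", "when-newest", "when-needed",
    "ask-when-newest", "ask-when-needed",
)


def parse_build_options(argv):
    """Return (default_strategy, {package_name: strategy})."""
    parser = argparse.ArgumentParser(prog="pip-sketch")
    parser.add_argument(
        "--build-sdists", metavar="STRATEGY",
        choices=STRATEGIES, default="ask-when-needed",
    )
    # May be given several times; each use takes PACKAGES then STRATEGY.
    parser.add_argument(
        "--build-sdist-for", nargs=2, action="append",
        metavar=("PACKAGES", "STRATEGY"), default=[],
    )
    args = parser.parse_args(argv)

    per_package = {}
    for packages, strategy in args.build_sdist_for:
        if strategy not in STRATEGIES:
            parser.error(f"unknown strategy: {strategy}")
        for name in packages.split(","):
            # Simplification: real package-name normalization is not done here.
            per_package[name.strip()] = strategy
    return args.build_sdists, per_package
```

For example, `parse_build_options(["--build-sdist-for", "foo,bar", "never"])` keeps the default strategy for everything except `foo` and `bar`.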
For a given package, Pip would choose to use wheels or build sdists according to the STRATEGY:
* never - only use wheels, and fail if no wheel is available (should be equivalent to --only-binary)
* always - only use sdists, and fail if no sdist is available (should be equivalent to --no-binary)
* when-newest - try to build the newest suitable version if it’s sdist-only; otherwise use the newest suitable wheel (the current default, as I understand it)
* when-needed - use a wheel if possible, but build the newest sdist if there are no compatible wheels (should be equivalent to --prefer-binary)
* ask-when-newest - see below
* ask-when-needed - see below - new, default behaviour
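The four non-interactive strategies can be sketched as a selection over candidate versions. This is a simplified model under stated assumptions: `Candidate`, `choose`, and the boolean fields are hypothetical names, candidates are assumed to be pre-filtered for compatibility and sorted newest first, and the `ask-` variants are treated as picking the same first candidate as their `when-` counterparts (the prompt itself is not modelled here).

```python
from dataclasses import dataclass
from typing import Optional

STRATEGIES = {
    "never", "always", "when-newest", "when-needed",
    "ask-when-newest", "ask-when-needed",
}


@dataclass(frozen=True)
class Candidate:
    version: str
    has_wheel: bool  # a wheel compatible with this environment exists
    has_sdist: bool


def choose(candidates: list[Candidate], strategy: str) -> Optional[tuple[str, str]]:
    """Return (version, "wheel" | "sdist"), or None if nothing is usable."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy}")
    # ask-when-X starts from the same candidate as when-X.
    base = strategy[len("ask-"):] if strategy.startswith("ask-") else strategy
    wheels = [c for c in candidates if c.has_wheel]
    sdists = [c for c in candidates if c.has_sdist]

    if base == "never":
        return (wheels[0].version, "wheel") if wheels else None
    if base == "always":
        return (sdists[0].version, "sdist") if sdists else None
    if base == "when-needed":
        if wheels:
            return (wheels[0].version, "wheel")
        return (sdists[0].version, "sdist") if sdists else None
    # when-newest: take the newest version, preferring its wheel if it has one.
    for c in candidates:
        if c.has_wheel:
            return (c.version, "wheel")
        if c.has_sdist:
            return (c.version, "sdist")
    return None
```

With a newest release that is sdist-only and an older one that has a wheel, `when-newest` picks the sdist while `when-needed` picks the older wheel, which is the difference the two names are meant to capture.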
3. Prompting the user
When the ask-when-newest strategy is selected and Pip is being used normally (i.e., without --no-input), the user is prompted like:
The newest version of `foo` compatible with everything being installed is `1.2.3`.
There is no compatible wheel available for this version, but Pip could try to build it from source.
The package claims that building requires:
* a C compiler available as `gcc` on the command line
* the `freetype` library, which needs to be installed with
your system package manager or by following directions at
<url>
Please choose:
1. Try to build this package now
2. Stop all installation (and maybe try again later)
3. Look for an older version (including source distributions -
some of them could be easier to build locally)
4. Look for an older version, but only accept wheels
If the user chooses to look for an older version, including sdists, the prompt is repeated for each sdist found (until a wheel is found, building commences or the user cancels).
Similarly for ask-when-needed:
Pip can't find any compatible wheels for `foo`, but could try to build version `1.2.3` from source.
The package claims that building requires:
* a C compiler available as `gcc` on the command line
* the `freetype` library, which needs to be installed with
your system package manager or by following directions at
<url>
Please choose:
1. Try to build this package now
2. Stop all installation (and maybe try again later)
3. Check the next most recent version, in case it's easier to build
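The repeated prompting described for ask-when-newest (ask again for each older sdist until a wheel is found, building commences or the user cancels) could be modelled roughly like this. It is a sketch under simplifying assumptions: each release is represented as either a wheel or an sdist, `resolve_with_prompts` and the `ask` callback are hypothetical names, and the menu numbers follow the ask-when-newest prompt above.

```python
from typing import Callable, Optional

# Each release is (version, kind) with kind "wheel" or "sdist", newest first.
Release = tuple[str, str]


def resolve_with_prompts(
    releases: list[Release],
    ask: Callable[[str], int],
) -> Optional[Release]:
    """Walk releases newest-first, asking before any sdist is built.

    ask(version) returns the user's menu choice:
    1 = build now, 2 = stop, 3 = look for older (sdists allowed),
    4 = look for older (wheels only).
    Returns the selected release, or None if the user stopped or nothing fit.
    """
    wheels_only = False
    for version, kind in releases:
        if kind == "wheel":
            return (version, kind)  # a wheel ends the search without a prompt
        if wheels_only:
            continue  # the user already declined all remaining sdists
        choice = ask(version)
        if choice == 1:
            return (version, kind)
        if choice == 2:
            return None
        if choice == 4:
            wheels_only = True
        elif choice != 3:
            raise ValueError(f"unexpected choice: {choice}")
    return None
```

Passing the prompt in as a callback keeps the flow testable without a terminal; an interactive installer would supply a function that prints the message and reads the reply.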
4. For CI users
If the --no-input option is provided, such that prompting isn’t possible, corresponding information should be logged. Under these conditions, Pip will try to build the package; so ask-when-newest and ask-when-needed are equivalent to when-newest and when-needed respectively.
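In other words, the non-interactive case reduces to a small mapping. A sketch, with `effective_strategy` being a hypothetical helper name:

```python
def effective_strategy(strategy: str, no_input: bool) -> str:
    """Downgrade the ask-* strategies when prompting is impossible."""
    if no_input and strategy in ("ask-when-newest", "ask-when-needed"):
        # Drop the "ask-" prefix: the build is attempted and only logged.
        return strategy[len("ask-"):]
    return strategy
```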
A log message would be more technical - it could look something like:
Pip is attempting to build `foo==1.2.3` from source because a wheel couldn't be found that's compatible with both the platform and the arguments to Pip.
The package claims that building requires:
* a C compiler available as `gcc` on the command line
* the `freetype` library, which needs to be installed with
your system package manager or by following directions at
<url>
If building fails, please adjust your install scripts appropriately, e.g. by disallowing this version in your requirements specifiers.
To suppress this diagnostic in the future, please pass Pip an appropriate value for `--build-sdists STRATEGY` or `--build-sdist-for foo STRATEGY` as described in the Pip documentation.
5. Backwards compatibility concerns
For the purpose of prompting and logging, if no PEP 725 metadata is available, Pip should not assume that there are no special requirements to build the sdist, but instead that these requirements are unknown. (In a future where the specified dependencies can actually be fetched, of course, it would be reasonable for Pip to act as though there are no such dependencies, and allow the ancient legacy setup.py to deliver the bad news.) If the PEP 725 metadata explicitly has empty values for build-requires and host-requires, of course, then there really aren’t any special requirements.
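The distinction between "unknown" and "none" could be expressed as a three-way result. This sketch assumes the metadata arrives as an already-parsed pyproject.toml dictionary with an `external` table holding `build-requires` and `host-requires`; the function name is hypothetical, and treating a table that has neither key as "unknown" is my own assumption rather than anything stated above.

```python
from typing import Optional


def external_build_requirements(pyproject: dict) -> Optional[list[str]]:
    """Return None if requirements are unknown, else a (possibly empty) list."""
    external = pyproject.get("external")
    if external is None:
        return None  # no PEP 725 metadata at all: unknown, not "nothing"
    keys = ("build-requires", "host-requires")
    if not any(key in external for key in keys):
        # Assumption: a table that mentions neither key tells us nothing.
        return None
    return [spec for key in keys for spec in external.get(key, [])]
```

A caller would then show "unknown system requirements" for `None`, no warning for an empty list, and the described requirements otherwise.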
The practical, immediate effect is that if, say, foo is a legacy project that uses setup.py and only distributes an sdist despite being pure Python, the user will get a false-alarm warning that the project needs to be “built from source” with unknown system requirements. Users who disregard that warning will see the project install just as it did before, and that could well continue to be the case indefinitely.
If, on the other hand, foo is one of these new giant AI libraries that depend on Torch and a bunch of other things, the same warning would be real, and disregarding it would lead to the same “help, what is subprocess-exited-with-error” situation we have today - but at least the user was warned up front and got a decently clear explanation. And the packages would be able to add PEP 725 metadata to give a bit more clarity about what’s involved.
If it’s really desirable to let package authors give custom warnings here, I think that might be best implemented by extending PEP 725 to describe pseudo-dependencies (that are always considered “met” by whatever future resolver, but allow custom descriptions).