PEP 517 and projects that can't install via wheels

pf_moore · January 30, 2019, 1:10pm

PEP 517 has no support for direct installs from source, with everything being built as a wheel and installed from that. That’s a deliberate design decision. However, pip introduced the --no-binary option specifically to allow for packages that cannot be installed via wheel (see this comment and this issue).

I don’t know if such “known bad” packages still exist, and I’m pretty sure that in fact the majority of current use of --no-binary is focused on the “I don’t want to download prebuilt binaries, I want to install everything from source” use case. But we need to consider what we do in a post-PEP 517 world.

At the moment, --no-binary disables PEP 517, but that’s not a long-term solution as we intend (at some point) to remove the legacy "install via setup.py" code path from pip.

I personally think we should simply drop support for packages that won’t work when installed via wheels (specifically from pip, and from consideration in terms of current and future standards discussions) but we need to decide how to publicise that desupport, and how to identify and warn affected projects.

What do other people think?

More generally, should we have a “packaging standards scope” document that clearly states assumptions like this over how we expect supported projects to work?

dstufft · January 30, 2019, 1:32pm

I think the future is that we always install via wheel, but --no-binary controls whether or not we will download an existing wheel or not.

takluyver · January 30, 2019, 1:38pm

I think it’s important to make the message clear: pip will still be able to install from sdists, but it will do so by building a wheel locally and installing that. This has been the default way to handle sdists for some time, but the --no-binary option currently overrides it.

Does anyone recall any specific packages that had problems with building wheels when they were first introduced? It would be informative to check if those problems have since been fixed (either in the affected packages, or in packaging machinery). If there are some that still can’t build a wheel, pip might need some kind of legacy install mode to handle them.

pf_moore · January 30, 2019, 1:46pm

Agreed, but we have to publicise that better, as it’s a backward-incompatible change and I’m getting tired of dealing with (legitimate) complaints that we broke people’s packages without warning.

We have that legacy mode at the moment - setup.py installs are (at least in my mind) legacy since PEP 517 support was released. The problem is that we can’t support that (or any other) legacy mode indefinitely, so the question is when (and how) we remove it.

pganssle · January 30, 2019, 2:43pm

I took a quick look at those issues but I don’t see an explanation of where there are packages that cannot be installed via wheel. Certainly there will be cases where you must build from sdist (e.g. no wheel available for the relevant environment, etc), but what kinds of projects can be installed from an sdist but not build into a wheel?

If it’s just a matter of some projects having designed their installation workflow in such a way that it breaks if you install via wheel, then I think saying, “Everything must be wheel compatible after a certain date” is fair. If there are legitimate cases where something about the wheel format doesn’t capture someone’s use cases, that’s a harder problem.

dstufft · January 30, 2019, 2:51pm

My several year old memory at this point is that there were some projects that didn’t install correctly when they went through a wheel (typically because they had some sort of logic in their setup.py that did something wrong when you went through a wheel) and some of those projects changed themselves to raise an error if you tried to build a wheel from them. This would trigger pip to always fall back to the setup.py install case.
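As a rough illustration of that pattern (a hypothetical reconstruction, not code taken from any particular project), a setup.py could refuse the bdist_wheel command by inspecting its own arguments before calling setup(). The helper name refuse_wheel_build is made up for this sketch:

```python
def refuse_wheel_build(argv):
    """Raise if a setup.py invocation asks for a wheel build.

    Hypothetical helper: it mimics projects that deliberately failed
    `setup.py bdist_wheel` so that pip fell back to `setup.py install`.
    In a real setup.py it would be called with sys.argv before setup().
    """
    # argv[0] is the script name; the commands follow it.
    if "bdist_wheel" in argv[1:]:
        raise RuntimeError("this project does not support building wheels")
```

With a check like this, `setup.py install` proceeds normally while `setup.py bdist_wheel` fails immediately, which is exactly the situation that makes the fallback path necessary.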

As far as how do we get rid of it, I think the same way we get rid of anything. Deprecate it, raise a warning for a period of time, and then kill it. I don’t think there’s anything special here except figuring out the right time to deprecate it and how long it should remain deprecated until it’s killed. Certainly we should probably wait a few releases of PEP 517 capable pip to make sure that the issues and shortcomings of that get solved first.

ncoghlan · January 30, 2019, 2:54pm

The projects that don’t currently work right when installed via wheel file are the ones that have target-environment-dependent logic in their install phase.

Some projects get away with it because what matters is that the wheel build environment is sufficiently similar to the install environment (i.e. same Python version, CPU architecture, operating system, etc), so installing via a locally built wheel file still works.

Other projects are actually using setup.py to implement post-install hooks, and those can just flatout break (because the hooks either don’t run at all, or they run in the wrong environment).

pganssle · January 30, 2019, 2:56pm

Is it not the case that when you do pip install on an sdist, the wheel is built in the environment you’re about to install it in? If so, all the logic can be configured correctly at build time and the correct wheel will be produced.

dstufft · January 30, 2019, 3:00pm

A (crappy) example is that a lot of Twisted libraries have a post install hook that populates a per-environment cache. IIUC they’ve long since worked around that, but that sort of thing is something that just isn’t possible to deal with in the current setup.

To be clear, I suspect all of the projects that do this could work around it in some way, with some amount of effort, and likely a lot of them have at this point. So I don’t think it is a huge deal if we look towards only going via wheels.

ncoghlan · January 30, 2019, 3:11pm

While I can’t find a clear reference for it, one of the other cases I recall coming up is projects that were registering Windows COM objects in a post-install step, and various other venv-unfriendly things.

So it may be that our answer to those kinds of things is to suggest that folks look at the briefcase project (which creates native installers from setuptools based projects), since package level installation formats do support post-install hooks (at the expense of not typically supporting parallel installation of multiple versions of the same package).

pf_moore · January 30, 2019, 3:43pm

Absolutely. The original implementation of --no-binary was long enough ago that I suspect that most projects have fixed the motivating issues. The worst case now is likely to be the odd unmaintained project that someone else has as a dependency.

Deprecate then remove is fine with me, I just don’t know how we’d necessarily spot the cases which would fail after the removal, so that we can issue the warning (just warning on every use of --no-binary is too broad).

dstufft · January 30, 2019, 4:02pm

We could rework the --no-binary behavior to attempt to build an ephemeral wheel and install that, and failing that fall back to setup.py install (basically how the fallback behavior worked at first). We would still not use a binary wheel from PyPI, we’d just shuffle it through a wheel as part of the install process. We could then warn anytime we had to do the setup.py install fallback.
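In sketch form, the control flow would be something like the following. The three callables are hypothetical stand-ins passed in by the caller, not real pip APIs, and the choice of DeprecationWarning is an assumption for illustration:

```python
import warnings


def install_from_sdist(sdist, build_wheel, install_wheel, legacy_install):
    """Install via an ephemeral wheel, falling back to the legacy path.

    build_wheel, install_wheel and legacy_install are hypothetical
    callables supplied by the caller; none of them is a real pip API.
    Returns "wheel" or "legacy" to say which path was taken.
    """
    try:
        wheel = build_wheel(sdist)
    except Exception as exc:
        # Only the projects that actually need the fallback see a warning.
        warnings.warn(
            f"could not build a wheel for {sdist} ({exc}); "
            "falling back to setup.py install, which is deprecated",
            DeprecationWarning,
            stacklevel=2,
        )
        legacy_install(sdist)
        return "legacy"
    install_wheel(wheel)
    return "wheel"
```

The point of this shape is that the warning is tied to the fallback itself rather than to every use of --no-binary, so it only reaches users of packages that cannot be built as a wheel.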

gpshead · January 30, 2019, 7:08pm

Why should this be a warning? I maintain a package that intentionally does not have a wheel because any version of manylinux is so ancient that it would defeat the purpose of the package and deliver a poor user experience. There are other legitimate cases where this could come up as well (requiring a modern compiler toolchain and modern hardware interface libraries never make sense as part of a manylinux build environment for example).

Not having a wheel should never be flagged as a bad thing for a package. (ie: no package shaming)

takluyver · January 30, 2019, 7:30pm

Thanks @gpshead for proving my point about clear messaging.

The plan is absolutely not to complain about packages that don’t have public wheels on PyPI. You’ll still be free to publish packages with only an sdist. What we want to change is that installing those packages will work by building a wheel using compilers etc. on the user’s system, and then install from that. With setup.py packaging, that means running setup.py bdist_wheel in the install process. This is already what pip tries to do, but there’s a fallback to running setup.py install if it fails. The discussion is about deprecating and removing that fallback.

We believe that very few packages would be affected by this. If your package is a counterexample (setup.py install works but setup.py bdist_wheel doesn’t), please share details!

gpshead · January 30, 2019, 7:57pm

ah, yeah, a local setup.py bdist_wheel should be fine. I was conflating that with install.

barry · February 4, 2019, 9:55pm

Not just install phase. Some packages have environmental factors that are not captured in the build artifacts, so they are not possible to build centrally. We see this problem with numpy and friends.

pganssle · February 4, 2019, 10:52pm

Not sure I understand this: numpy and friends do build and ship wheels.

That said, this is not a question of projects that can’t ship wheels - it’s definitely true that those will exist even if only for weird platforms. The question is things that can’t build wheels as a part of the install.

Currently the pip workflow for installing an sdist goes (roughly) like this: download sdist, prepare build environment, create a wheel (python setup.py bdist_wheel or via the build_wheel hook), install the wheel.
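For the "create a wheel" step under PEP 517, the frontend loads the object named by build-backend in pyproject.toml and calls its build_wheel hook, which returns the basename of the wheel it wrote. A minimal sketch of just that call, assuming the backend is already importable; the function names load_backend and build_wheel_via_hook are made up here, and a real frontend would also set up the build environment and run the hook in a subprocess from the source tree:

```python
import importlib


def load_backend(build_backend):
    """Resolve a PEP 517 build-backend string such as "pkg.mod" or "pkg.mod:obj"."""
    module_path, _, object_path = build_backend.partition(":")
    backend = importlib.import_module(module_path)
    # Walk the optional "obj.attr" part after the colon.
    for attr in filter(None, object_path.split(".")):
        backend = getattr(backend, attr)
    return backend


def build_wheel_via_hook(build_backend, wheel_directory, config_settings=None):
    """Call the mandatory build_wheel hook and return the wheel's basename."""
    backend = load_backend(build_backend)
    return backend.build_wheel(wheel_directory, config_settings)
```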

I believe that --no-binary :all: skips the “create a wheel/install a wheel” step and instead invokes something like python setup.py install on the source distribution, installing the package directly without creating an intermediate wheel. The question is whether there’s some deficiency in the wheel format that would prevent creating a build artifact given that you know exactly what environment you are targeting.

barry · February 4, 2019, 11:23pm

Sorry, let me explain in a little more detail, and we may find that there is no problem. Corporate hat on here.

We have many internal “variants”, i.e. platform OS, and other factors. We cannot consume PyPI wheels for legal reasons (but that’s boring and not technical), but we also can’t trust non-pure Python wheels built externally because there are environmental leakages that aren’t captured in the wheel tags and which do not align with our internal infrastructure. Because we don’t know things like what environment variables or compiler options were used, externally built binary wheels aren’t useful to us.
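To make the limitation concrete: a wheel’s filename records only a Python tag, an ABI tag, and a platform tag (per PEP 427), and nothing about compiler flags or environment variables. A minimal sketch of pulling those fields apart; wheel_tags is a name made up here, and the filename in the usage note is only illustrative:

```python
def wheel_tags(filename):
    """Split a wheel filename into the fields defined by PEP 427.

    Expected layout:
    {distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    # The last three fields are always the tags; an optional build tag,
    # when present, sits between the version and the python tag.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "distribution": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }
```

For a name like numpy-1.16.1-cp37-cp37m-manylinux1_x86_64.whl this yields the tags cp37, cp37m and manylinux1_x86_64, which is all a consumer can learn about the build environment from the filename.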

Since we import our packages from PyPI to an internal mirror, it would be conceivable to build the wheels internally at the time we import them, but that’s problematic because of the above variant environmental factors. On developer machines, we do plan on building the wheels locally, which we know must match. Once we have better control over variant builds, we’ll be able to do this at PyPI import time and thus have a single cache supporting all machines.

What we really want is “never download a wheel” which doesn’t sound like it aligns with --no-binary :all:. I think it’s fine to build the wheel locally and install it instead of sdist installation. I just need to prevent consumption of externally built wheels.
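Put as code, the policy is a filter on which candidate files may be fetched, independent of how the chosen sdist is then installed. A sketch only: source_only is a hypothetical helper, not pip’s actual candidate selection, and the suffix list is an assumption about which sdist formats matter:

```python
def source_only(filenames):
    """Keep sdists and drop prebuilt wheels from a list of candidate files."""
    # Assumed sdist suffixes; wheels (.whl) and anything else are rejected.
    sdist_suffixes = (".tar.gz", ".zip")
    return [name for name in filenames if name.endswith(sdist_suffixes)]
```

Whatever survives the filter can still be built into a wheel locally and installed from that.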

dstufft · February 4, 2019, 11:58pm

Yea there’s just a bit of confusion. The question isn’t about producing shared wheels, but projects that cannot ever pass through a “built as a wheel” phase, even if that wheel is 100% ephemeral and lives only for the lifetime of the pip process.

Early on when pip started producing wheels by default and installing them instead of directly invoking setup.py install, some projects either ran into issues and subsequently blocked the bdist_wheel command or simply never supported it. That was many years ago at this point, and we’re trying to figure out if there are still projects like that, and if so why can’t they be installed via a wheel file.

takluyver · February 16, 2019, 1:11pm

For the record, I did recently come across one project (streamlink) that was talking about using --no-binary to install without ever going through a wheel. But the motivation was not any limitation of wheels; for complex reasons they wanted the installation on Windows to generate the foo-script.py launcher scripts, whereas pip now appends the script to the foo.exe launcher when you install from a wheel.

This is the kind of thing I meant by ‘to work around odd packaging bugs’, although this one is more exploiting differences in packaging implementation - the new behaviour is not a bug.

They can probably achieve their aims more easily by adding a gui_scripts entry point and allowing regular installation, so hopefully this case shouldn’t concern us.
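For reference, setuptools accepts gui_scripts alongside console_scripts in the entry_points argument to setup(). A sketch of what that could look like; the foo names and the foo.cli:main target are placeholders, not streamlink’s real configuration:

```python
# Hypothetical entry point table, to be passed as setup(entry_points=ENTRY_POINTS).
ENTRY_POINTS = {
    # Launcher attached to a console.
    "console_scripts": ["foo = foo.cli:main"],
    # Launcher built as a GUI executable on Windows (no console window).
    "gui_scripts": ["foow = foo.cli:main"],
}
```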