PEP 660: Editable installs for PEP-517 style build backends

pganssle · May 26, 2021, 4:54pm

I mean, both create an editable install, but “Backend gives a dist, like a wheel, and provides the list of files” means “backend tells the frontend where the files it wants installed are, on disk”. There’s another bullet that says “The front end will decide how to expose this - TBD, .pth, something else?”. That is the crux of the difference between the virtual wheel proposal and this one. In this one, setuptools could decide to give you a wheel full of symlinks, or a .pth file or an ad hoc proxy; the backend is deciding how the actual files get exposed to the frontend.

In the “virtual wheel” proposal, the important difference is that the backend gets the list of all the stuff that is supposed to go in an editable package and hands it to the front-end, which does some unspecified thing to expose it to the front-end. That could be a .pth file or creating a proxy module or a daemon that watches all the files and re-installs them automatically or a bunch of symlinks or some combination of these things.

If we were to make the requirement “backends SHOULD provide a full list of files to be installed, as identical as possible to the list of files to be installed in a wheel” rather than replacing that with a MUST, then an implementation of the setuptools backend that is equivalent to what PEP 660 would give us is probably no harder than an implementation of PEP 660 (maybe easier, actually), because setuptools can just return the list of things it would expose with its .pth file. If we’re slapping together hacks, we may as well just have the hack be that setuptools has a crappy implementation of the editable install protocol.

bernatgabor · May 26, 2021, 4:57pm

I don’t personally agree with this. IMHO we should not hold the entire packaging ecosystem hostage to setuptools’ technical debt. In my view, a good place to end up would be one where setuptools is used less and other backends can rise to slowly replace it. Realistically, only @jaraco is working on setuptools, and even he is doing so less and less, precisely because of its technical baggage; making any change in there is hard. Driving big changes through that codebase without breaking half the world is almost impossible.

pganssle · May 26, 2021, 4:57pm

I don’t think this is quite accurate, since at least @FFY00 seems to agree with me on this.

Also, would everyone still be OK with this “you can’t configure the install mode” idea if setuptools decides (well within its rights for this PEP) to go with the “strict” implementation, where you must re-run your pip install -e if new .py files are added to the project?

It’s not just people who want strictness who are going to get caught up in the problem of “your backend tool decides the behavior of editable installs for every user of the project”.

pganssle · May 26, 2021, 5:01pm

Yeah, this is not necessarily or entirely due to setuptools’ choices. setuptools is just the place where you go to get support for everything that a Python package can be. It’s deliberately taken on the distutils baggage. By saying, “The whole ecosystem is being held back by setuptools”, you’re saying, “We don’t want to implement these features for ‘edge cases’ like people with C extensions or any of the numerous situations where setuptools is the only reasonably well-supported thing.”

I’m saying that, realistically, the ecosystem won’t move with you right now unless you have features that work with setuptools, and a corollary of this is that you can introduce changes into wide swathes of the ecosystem just by implementing them in setuptools. It sucks that it’s difficult, but piling a bunch of path-dependent hacks on top of one another is kinda how we got into this mess in the first place.

steve.dower · May 26, 2021, 5:02pm

I haven’t gotten to implementing it yet, but one of my plans for pymsbuild's editable support was to automatically rebuild out-of-date extension modules on import.

Perhaps all the front-ends will figure out how to generate that importer for me? Seems like it’s better for my backend to provide all the logic needed here.

(Though at the same time, I’m not at all afraid of saying “use the backend directly during dev and support PEP 517 for your sdists”, so I’m not too worried about the potential outcome here. It would be nice if frontends are able to invoke it through a common command, but not critical from my POV.)

bernatgabor · May 26, 2021, 5:09pm

While setuptools may currently be the only backend supporting C extensions, this will not always be the case. Steve already indicated his backend is planning for this support (or already does?), and I’d imagine other backends would follow eventually. Or people who often work with C extensions would join forces and implement the accepted PEPs for setuptools (that project really needs more contributors).

pganssle · May 26, 2021, 7:35pm

The virtual wheel approach I’m suggesting would allow someone to build a front-end (not all of them need to do it), which would generate an importer for everything that runs build_editable_wheel immediately prior to importing anything that was included in the virtual wheel. If your backend is optimized for incremental builds, this might be more or less useful, and it could be that you create an msbuild frontend that does this and only works with msbuild as the backend, if you want to do this.
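As a rough illustration of the "importer that rebuilds immediately prior to importing" idea, here is a minimal sketch of a meta path finder. The class name, the `install_rebuild_hook` helper and the `rebuild` callback are all hypothetical; in particular, how `rebuild` would ask a front-end to re-run the backend's build step is left unspecified here, just as it is in the discussion.

```python
import importlib.abc
import sys


class RebuildOnImportFinder(importlib.abc.MetaPathFinder):
    """Trigger a rebuild before any watched top-level name is imported."""

    def __init__(self, watched_names, rebuild):
        # Top-level names that were listed in the (hypothetical) virtual wheel.
        self.watched_names = set(watched_names)
        # Hypothetical callback; a real front-end would re-run the build here.
        self.rebuild = rebuild

    def find_spec(self, fullname, path=None, target=None):
        if fullname.partition(".")[0] in self.watched_names:
            self.rebuild(fullname)
        # Returning None defers to the finders later on sys.meta_path,
        # so the module itself is still located by the normal machinery.
        return None


def install_rebuild_hook(watched_names, rebuild):
    """Put the finder first so it runs before the regular path-based finder."""
    finder = RebuildOnImportFinder(watched_names, rebuild)
    sys.meta_path.insert(0, finder)
    return finder
```

Note that finders are only consulted for modules that are not already in `sys.modules`, so a hook like this would fire on the first import in each process rather than on every `import` statement.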

It’s actually kinda tricky to do what you’re suggesting in PEP 660, because at import time, generally speaking, you wouldn’t be executing in the PEP 518 isolated build environment. You as the backend can’t even guarantee that that isolated environment still exists. You’d have to vendor your build backend into the application or make all your build-time dependencies also install-time dependencies. The front-end can make guarantees about the lifetime of the build environment and it works via subprocesses anyway, so it should actually be able to execute that importer more cleanly than the backend, and in a backend-independent way.

This is another case where the choice of how the files are exposed to the system is going to vary a lot per person more than per project (though in this case it’ll also vary per project). Most people won’t want a full build step as part of an import because of the latency, particularly if they are largely implementing changes in pure Python parts of the code. You’d very much want a mechanism to say, “I want a no-rebuild install this time” vs “I want a rebuild-on-import install this time”, rather than saying, “This is a rebuild-on-import project”.

That said, for incremental builds and such, it is the case that the backend has information about input files that won’t show up in a virtual wheel, so any sort of thing that says, “Rebuild only if X has changed”, or writing something that watches all relevant input files is not necessarily possible with a virtual wheel mechanism. We could always add hints about input files to later versions of the virtual wheel spec, though, and in the meantime there are ways to get something like this (e.g. watching all the files in the whole directory and triggering rebuilds, etc).
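To make "Rebuild only if X has changed" concrete, here is a minimal sketch of a timestamp-based check. It assumes the tool already knows which input files feed a given output, which is exactly the information a plain virtual wheel would not carry; the function name is made up for illustration.

```python
import os


def needs_rebuild(output, inputs):
    """Return True if output is missing or older than any of its inputs."""
    try:
        built_at = os.stat(output).st_mtime_ns
    except FileNotFoundError:
        # Nothing has been built yet.
        return True
    return any(os.stat(src).st_mtime_ns > built_at for src in inputs)
```

Modification times are only a heuristic (a content hash would be more robust), but they show why a list of input files is the missing piece.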

pf_moore · May 26, 2021, 8:03pm

I don’t think that’s a major problem, actually - because pip already has editable support for setuptools, PEP 660 is much more important for other backends in the short term. Yes, it’s important that setuptools implements the standard at some point, but more so that we can drop the legacy workaround than because users are suffering from the lack.

Oh I see. Thank you for being patient here, I’d not understood that difference at all. I do think that’s a reasonable counter-proposal, and something that should be written up. Clearly, it would be a “rejected proposal” in PEP 660, and it could be a separate PEP of its own as well.

I won’t comment right now on my preference, as I have a hard enough time keeping comments I make around standards processes separate from my personal opinions without mixing the two in the same message, but I will say that someone needs to have the time and enthusiasm to champion that model if it’s to go anywhere (and I totally understand that you don’t have the time to be that person).

steve.dower · May 27, 2021, 12:20am

That’s a good point, and one that is going to affect things far more widely.

Personally, I don’t want an isolated build environment for my editable packages, because I’m in my dev environment for that package. It’s only when installing stuff that I’m not trying to work on that I’d want its build dependencies kept separate.

However, if the frontends go ahead and use an isolated environment for this (which the PEP doesn’t appear to mention, but presumably the recommendation from PEP 517 still applies), then no incremental builds will be possible. That seems like a deal breaker (and there are plenty of relevant build steps besides native modules - even file copies are worth avoiding).

All that said, I don’t think the intent of the current proposal really breaks anything. It asks the backend to provide a wheel that could be installed as normal, and then the frontend is able to do whatever kind of install it likes. There seems to be one statement in the PEP that forbids them from doing this, but that’s really not enforceable anyway and should probably be changed:

Old text: “Frontends must install editable wheels in the same way as regular wheels. This also means uninstallation of editables does not require any special treatment.”

Proposal:

Frontends should be able to treat the editable wheel in the same way as they treat regular wheels. In particular, neither Python nor other frontends should need to distinguish the installed package from a regular one.

The rest of that section can stay binding if people want it that way, but I really don’t see any value in forcing all frontends to do a straight extract.

dholth · May 27, 2021, 12:58am

Could you possibly want build isolation for editable wheels? It may be unavoidable to address this in the PEP. The PEP says

May do an in-place build of the distribution as a side effect so that any extension modules or other built artifacts are ready to be used.

Allowing the hook to opt to extend the search path only. You could run the compiler yourself as often as you like during development without rerunning pip install -e .

With enscons this looks like pip install -e . then scons (like make).

Build isolation is different than unpacking the archive somewhere else and building it? (In which case the editable wheel would refer to a temporary directory?)

sbidoul · May 27, 2021, 7:23am

I don’t think this should be a should? The intent is to say that frontends must treat the wheel as a regular wheel, and must not attempt to infer anything about its content, beyond the fact that it follows the wheel specification.

Regarding build isolation, does PEP 517 actually mandate build isolation? pip, for instance, does have a mechanism to disable build isolation. Currently, PEP 660 (and PEP 517 AFAICT) says what must be available in the build environment, but does not say the environment must be isolated. So I tend to think build isolation is a UX question for frontends, and does not need to be mentioned in the PEP.

gbdlin · May 27, 2021, 2:54pm

I see an issue with the approach of passing an exact and finite list of files to the frontend/installer. When a user wants to add a new submodule to a package that is already installed in editable mode, this submodule won’t be picked up automatically if the build frontend or installer relies purely on a finite list of files passed from the build backend at install time. This will require the user to reinstall the package each time the file structure of the package changes.

I think the correct approach should require a reinstall only if the configuration of the build backend changes (like requirements, location of the root module, anything that ends up in package metadata). Instead of a simple list of files, either the location of the root module of the package should be provided by the build backend or something that can be dynamically recalculated on each import resolution/code execution (like a glob pattern).
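The difference between a finite list and something recalculated on each lookup can be sketched as follows. This is only an illustration: the function name and the default pattern are assumptions, not part of any proposal in this thread.

```python
from pathlib import Path


def exposed_now(root, pattern="**/*.py"):
    """Evaluate the pattern against the tree as it is at call time."""
    root = Path(root)
    return sorted(p.relative_to(root).as_posix() for p in root.glob(pattern))


def demo(root):
    """Contrast a list captured at install time with a re-evaluated pattern."""
    root = Path(root)
    (root / "mod").mkdir()
    (root / "mod" / "__init__.py").write_text("")

    # "Install time": the backend hands over a fixed snapshot.
    snapshot = exposed_now(root)

    # Later, the developer adds a submodule without reinstalling.
    (root / "mod" / "new.py").write_text("")

    return snapshot, exposed_now(root)
```

Calling `demo` on an empty directory returns a snapshot that lacks `mod/new.py` and a re-evaluated list that includes it, which is the behaviour difference described above.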

But in general, I agree with the approach. The build frontend or installer should be responsible for choosing the method to expose the module, not the backend.

pganssle · May 27, 2021, 3:38pm

The solution to this in the original discussion was that if your front-end wants to automatically pick up all new files in a directory, it can mimic the behavior of the current setup.py develop and add the root directories of each included module in the path (this is the .pth file approach).

It won’t be perfect, because it’s possible that you have some crazy backend where it’s configured to take modules from multiple directories and when you run pip install -e . one of the directories is already empty, but honestly, pip install -e is never going to be perfect and there’s already no good solution for when you have to rebuild extension modules, so it seems fine to ignore those rare edge cases.

I suppose? I don’t know that this is a generally good idea, but this is another reason to have the front-end control the installation mechanism. In PEP 660 as written, you’d either need to vendor msbuild, inject msbuild into the build dependencies only in the build_wheel_for_editable or require that any front-end conflate the build and development environments. With a “front-end chooses” mechanism, you can use a front-end (msbuild can provide that front-end!) that manages a long-lived build environment somewhere and always activates it automatically as part of the import, all within spec and without doing anything extra weird. You can also have a front-end that always combines the build and development environments (though this may lead to conflicts in some packages).

I don’t think that this is true. I think it may be conflating the idea of isolated builds with the idea of building in a temporary directory rather than “in tree”. Isolated builds just means that there’s a virtual environment with only your build dependencies in it somewhere and the frontend will execute builds in that environment. That shouldn’t preclude incremental builds because you are presumably not maintaining state in site-packages or anything specific to the Python environment.

pf_moore · May 27, 2021, 3:51pm

Thinking about this, I see a couple of issues:

If the backend says exactly what gets exposed, then any change (for example, adding a new .py file) needs a rebuild. I guess “exactly what is exposed” could mean something like a glob pattern, but at that point we’re getting quite far from a “virtual wheel”.
It would be very hard to do this in a way that doesn’t require the frontend to have an intimate knowledge of the project layout. Different approaches have different trade-offs (no solution that we’ve yet found is perfect) and you can’t know the best trade-off without looking at the project structure.

Of course, we don’t have to achieve perfection. The simple .pth approach setuptools currently uses has been good enough for years. But if we are willing to accept that, what’s so hard about setuptools just building a wheel with essentially the same .pth files it’s always installed? (I’m uncomfortable saying “surely it’s easy” about any project I’m not involved with, so my apologies if there genuinely is some significant complexity I’m missing here, but I still don’t see how it can be more complex than implementing the whole “virtual wheel” mechanism).
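For a sense of scale, here is a minimal sketch of a wheel whose only payload is a .pth file. This is not how setuptools does or would do it; the function name, the `<name>.pth` file name and the `Generator` value are invented for illustration, and the sketch assumes `name` and `version` are already normalized for use in a wheel filename.

```python
import base64
import hashlib
import zipfile
from pathlib import Path


def _record_line(arcname, data):
    """Format one RECORD entry: path, urlsafe-base64 sha256 (no padding), size."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{arcname},sha256={encoded},{len(data)}"


def build_pth_wheel(name, version, source_dirs, wheel_directory):
    """Write a pure-Python wheel containing a .pth file that lists source_dirs."""
    dist_info = f"{name}-{version}.dist-info"
    files = {
        # Each line of a .pth file is appended to sys.path by the site module.
        f"{name}.pth": "".join(f"{d}\n" for d in source_dirs).encode(),
        f"{dist_info}/METADATA": (
            f"Metadata-Version: 2.1\nName: {name}\nVersion: {version}\n"
        ).encode(),
        f"{dist_info}/WHEEL": (
            "Wheel-Version: 1.0\nGenerator: pth-sketch\n"
            "Root-Is-Purelib: true\nTag: py3-none-any\n"
        ).encode(),
    }
    record = [_record_line(arcname, data) for arcname, data in files.items()]
    record.append(f"{dist_info}/RECORD,,")  # RECORD lists itself without a hash
    files[f"{dist_info}/RECORD"] = ("\n".join(record) + "\n").encode()

    wheel_name = f"{name}-{version}-py3-none-any.whl"
    with zipfile.ZipFile(Path(wheel_directory) / wheel_name, "w") as zf:
        for arcname, data in files.items():
            zf.writestr(arcname, data)
    return wheel_name
```

The hard part for a real backend is deciding which directories to list, not packaging them.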

bernatgabor · May 27, 2021, 3:53pm

This assumes a daemon mode for the front-end, of which I’m personally not a big fan, so I’d advise against that. I’d prefer the backend to do the work, on-demand when someone loads a given module. Also, I don’t agree we can’t get a perfect solution. pth is flawed by design, but we can come up with a perfect solution, and I think the current PEP proposal achieves this by moving the heavy lifting onto the backend rather than onto the frontend. I don’t agree we should ignore these edge cases.

pganssle · May 27, 2021, 4:16pm

If the list of files to be included in the output changes, the project should require a rebuild. One of the big selling points of the src layout is that it makes testing your projects as installed much easier, and it’s harder to accidentally have stuff on your path that you don’t want. I was once preparing a talk about why you should prefer the src/ layout and a big selling point was that for all the people who like using the flat layout because their project is on the Python path, you can get the best of both worlds with pip install -e ., but was unpleasantly surprised by the fact that pip install -e . just basically creates an environment where you are back in the bad old days.

The current .pth logic basically just takes the list of files to install and adds the directory containing each module into sys.path. A front-end can recreate this easily for people who want new files to show up automagically, and for people who don’t mind executing pip install -e every once in a while but do want to get something very close to what they’re going to get in the final version, it can use something that only exposes the files given to it by the build backend.
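A front-end could recreate the "loose" behaviour from a file list along these lines. This is a sketch under stated assumptions: the function name is hypothetical, paths are POSIX-style, and packages are recognised only by an `__init__.py` in the list (namespace packages are not handled).

```python
from pathlib import PurePosixPath


def pth_entries(files):
    """Return the directories to put on sys.path for a list of source files."""
    files = [PurePosixPath(f) for f in files]
    package_dirs = {f.parent for f in files if f.name == "__init__.py"}
    entries = set()
    for f in files:
        top = f.parent
        # Walk up while the enclosing directory is itself a package, so we
        # end at the directory that contains the top-level module or package.
        while top in package_dirs:
            top = top.parent
        entries.add(str(top))
    return sorted(entries)


print(pth_entries([
    "/basepath/mod/__init__.py",
    "/basepath/mod/subpath.py",
    "/basepath/mod/subpath2/__init__.py",
]))  # → ['/basepath']
```

Anything else that happens to live in the returned directories becomes importable too, which is the trade-off of the loose mode.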

And again consider, this should be the front-end’s choice, because which behavior you want has nothing to do with the nature of the project or the backend, it’s entirely dependent on what the front-end user wants. At the beginning of a project, I may not care about strictness, but I’ll be adding lots of new files quickly, so I want the loose behavior. When the project matures, I’m basically never adding new modules, but I don’t want some stray .pyc file or something getting caught up in my sys.path causing me no end of trouble (or accidentally relying on something explicitly excluded by my build rules).

The virtual wheel approach allows for this flexibility, the PEP 660 model doesn’t.

I don’t see that this is true. Both the editables and .pth file approaches can be implemented just as easily by a front-end with a list of files as they can be by the backend. The only difference is in weird edge cases already not addressed by the setuptools’s .pth file approach, like a situation where someone has a build rule that would include files from a directory which currently includes no files. Almost no one uses multiple top-level input directories, and for the subset that do, adding new top-level input directories is exceedingly rare. Also, the failure mode is, “This failed to import” and “try re-installing because I did something weird” and whenever that happens I re-install to make sure I am not in some weird state anyway. It’s pretty discoverable what the problem is and easy to correct.

One of the major problems with this is that the “virtual wheel” approach is approximately as simple as the PEP 660 approach, but with much more room for improvement. PEP 660 codifies a very specific approach to editable installs and makes variation on that theme with different trade-offs very hard to achieve.

I am frankly shocked at the opposition to the “virtual wheel” idea, since it seems to me to be very close to strictly better than the current proposal, since it can achieve everything PEP 660 can achieve and it unlocks the ecosystem for a lot more UI improvements rather than locking in the old behavior.

The only thing that I recall being more difficult with the original “virtual wheel” approach, which mandated a full list of all files in the virtual wheel, was that for setuptools specifically, it’s very hard to get the actual list of input files to be included in a wheel (with all the exclusions done right). But if we relax this constraint to allow setuptools to return only the directories containing the files (which is the current status quo anyway), that difference disappears.

pganssle · May 27, 2021, 4:28pm

Why do you think it does? I think daemons are very complicated to do right and I don’t think anyone will be running them right away anyway, so none of my thinking assumes a daemon mode. Feel free to assume that unless otherwise specified, I have ignored the possibility of someone using a daemon.

Here’s how it looks:

User: Frontend, please give me a “loose” editable install of /basepath.
Frontend: Backend, please give me a virtual wheel.
Backend: Here is a virtual wheel, it contains “/basepath/mod/__init__.py, /basepath/mod/subpath.py and /basepath/mod/subpath2/__init__.py”
Frontend: Since this is a loose editable install, I will create a .pth file that adds /basepath (the directory containing mod) to sys.path
OR
Frontend: Since this is a loose editable install, I will install a proxy module that exposes the name mod and always looks in /basepath/mod for missing names at import time.
OR
Frontend: Since this is a loose editable install and I can detect that I am on a platform that allows symlinks, I will create a symlink to /basepath/mod in site-packages.

For the “strict” install:

User: Frontend, please give me a “strict” editable install of /basepath.
Frontend: Backend, please give me a virtual wheel.
Backend: Here is a virtual wheel, it contains “/basepath/mod/__init__.py, /basepath/mod/subpath.py and /basepath/mod/subpath2/__init__.py”
Frontend: Since this is a strict editable install, I will install a proxy module that exposes the name mod and always looks in /basepath/mod for the subset of names specified in the virtual wheel. I can throw a special ImportError if the file exists but wasn’t included in the original manifest.
OR
Frontend: Since this is a strict editable install and I can detect that I am on a platform that allows symlinks, I will create a directory symlink of symlinks in site-packages/mod-..., one for each of the original files.
OR
Frontend: I am on a platform that doesn’t support symlinks, but I only know how to do strict editable installs using symlinks, so I will throw an error suggesting that the user use a “loose” editable install.

The spec explicitly allows front-ends to make these and other choices.
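The “strict” proxy option above could look roughly like the following meta path finder. It is a sketch, not a proposed implementation: the class name is hypothetical, and the manifest is assumed to be a list of POSIX-style paths relative to the project root rather than the absolute paths used in the dialogue.

```python
import importlib.abc
import importlib.util
from pathlib import Path, PurePosixPath


class StrictEditableFinder(importlib.abc.MetaPathFinder):
    """Expose only the modules listed in the manifest of an editable install."""

    def __init__(self, root, manifest):
        self.root = Path(root)
        self.allowed = {}  # dotted module name -> source file
        for rel in manifest:
            rel = PurePosixPath(rel)
            parts = list(rel.with_suffix("").parts)
            if parts[-1] == "__init__":
                parts.pop()  # a package is named after its directory
            self.allowed[".".join(parts)] = self.root.joinpath(*rel.parts)
        self.top_level = {name.partition(".")[0] for name in self.allowed}

    def find_spec(self, fullname, path=None, target=None):
        if fullname in self.allowed:
            return importlib.util.spec_from_file_location(
                fullname, str(self.allowed[fullname])
            )
        if fullname.partition(".")[0] not in self.top_level:
            return None  # not part of this project; let other finders handle it
        base = self.root.joinpath(*fullname.split("."))
        if base.with_suffix(".py").is_file() or (base / "__init__.py").is_file():
            # The file exists but was not in the manifest: fail loudly.
            raise ImportError(
                f"{fullname} exists on disk but was not in the editable "
                "manifest; re-run the editable install",
                name=fullname,
            )
        return None
```

A front-end would register an instance with `sys.meta_path.insert(0, finder)` at interpreter start-up; how it arranges for that to happen is outside the scope of this sketch.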

Like I said, most of the interesting trade-offs here are things that vary by developer workflow and environment (is this being built into an IDE vs. in a terminal? Is this on a weird file system or platform), it makes the most sense to have your front-end(s) doing this work.

steve.dower · May 27, 2021, 4:31pm

Wait, where does it do this? I must have missed it.

My reading (and the reading of the discussion leading to it) is that the only thing codified is that the files have to be made available in a wheel file. It says nothing about the format of those files, what they contain, etc. They can all be symlinks (on OS’s that support them), or they can be proxy modules, or a .pth file, or import hooks, or anything at all provided it can go into a Python environment and has enough metadata for other frontends to recognise them.

But why? If it knows how it’s been built, why can’t it be more clever? It already has to avoid caching, which diverges from how some front ends behave (those that cache, which I assume is not a universal requirement), so this can’t be a strong requirement without a long tail of exceptions.

This document is about defining boundaries, not a design spec for implementers. Frontends and backends will still have to design their own implementations, and those that use designs that don’t work just won’t get users. We don’t have to prevent them by specification.

pganssle · May 27, 2021, 4:44pm

The approach to editable installs it codifies is the paradigm where all configuration of the installation scheme goes in the backend. It makes user-based configuration and variation in the front-end very hard to achieve.

Back-ends are free to have all kinds of weird installation modes, though practically speaking I imagine everyone will do some version of the “loose” installation (though I’ll fight for setuptools using “strict” if it comes down to that).

Consider that this introduces intra-project conflict where none needs to exist. “I like this backend” “But it uses symlinks in its approach to editable installs, and I develop on a platform where that doesn’t work well.” → This is a conflict that could be avoided if this choice were left to front-ends, since me using pip install --strict-editable . doesn’t prevent you from using pip install -e ..

I am hard pressed to come up with any downsides to this division of labor, just upsides. Theoretically it may actually even be strictly better if we allow for the possibility that some backend could simply choose to build whatever PEP 660 wheel it was going to include in its wheel in a directory somewhere and pass that as the path for the front-end (though I think this would be a very bad idea), unless we explicitly say it’s not supposed to do that.

bernatgabor · May 27, 2021, 5:14pm

One downside of this is that now the frontend needs to do the heavy lifting. Currently this implies pip. PEP 517 does not manage minimum requirements on the frontend, so IMHO it will be harder to automatically provision the frontend version required to support editable wheels. When the backend does the lifting, the user can just specify the backend’s minimum version in pyproject.toml and use the current frontend stack. The frontend just needs to install a wheel and not know about various forms of editable installs. I guess the question is whether pip maintainers would be interested in implementing all three of the modes you’ve described (pth, loose editable, symlinks). Pushing this onto the backend offers more flexibility in practice for users, because currently it’s harder to swap out your backend than frontend. The options on the front end side are very limited and provisioning a minimum version of the frontend is not yet a solved issue. Also, in your proposed solution, do we actually need three different entry point extensions?