Discuss PEP 662: Editable installs via virtual wheels

ofek · June 9, 2021, 5:40pm

Thanks for your effort!

I have a concern about the example:

import sysconfig

import frontend_editables

path_mapping = ...  # Will have been returned by the backend.
installed_files = frontend_editables.install(
    sysconfig.get_path("purelib"),
    path_mapping,
    frontend_editables.EditableStrategy.lax,
)
# Then append the ``installed_files`` to the distribution's ``RECORD``,
# optionally by passing ``append_to_record=<path to RECORD>`` to ``install``.

The sysconfig.get_path("purelib") bit is misleading as the frontend will most likely not be running with the Python of the intended environment so you’ll actually need a subprocess call.
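
To illustrate the subprocess point, here is a minimal sketch that asks another interpreter for its own install paths. The helper name `get_target_paths` is hypothetical, and `sys.executable` only stands in for the target environment's interpreter so that the snippet runs as-is.

```python
import json
import subprocess
import sys


def get_target_paths(python_executable):
    """Ask another interpreter for its own sysconfig install paths."""
    script = "import json, sysconfig; print(json.dumps(sysconfig.get_paths()))"
    result = subprocess.run(
        [python_executable, "-c", script],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


# Placeholder: a real frontend would pass the target environment's interpreter.
target_paths = get_target_paths(sys.executable)
purelib = target_paths["purelib"]
```

The frontend would then hand `purelib` to the editable helper instead of calling `sysconfig.get_path("purelib")` in its own process.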

layday · June 9, 2021, 6:17pm

Yeah, that’s just a placeholder. In actual practice the frontend (pip) will pass the output path to frontend_editables.

bernatgabor · June 9, 2021, 7:05pm

I’ve started on the pip+setuptools POC (which could use @layday’s library to do the path linking), but it’s not yet ready.

However, I plan to use that as the POC for the PEP.

layday · June 10, 2021, 11:46am

While Bernát works on setuptools and pip, I have added support for frontend editables in flit at flit@feat-frontend-editables, combining ideas that have been thrown around in this thread and in a way that strays (rather significantly) from the PEP. Specifically:

build_editable has been renamed build_wheel_for_editable.
build_wheel_for_editable builds an installable wheel.
The return value is the filename of the wheel. This wheel differs from
a regular wheel in two ways:
- It must not contain files and folders which it wishes to register
  as being editable in editable.json.
- In its metadata directory, it must contain one additional JSON file,
  editable.json, with the following schema:
```
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "properties": {
    "paths": {
      "type": "object",
      "additionalProperties": {"type": "string"}
    }
  },
  "required": ["paths"]
}
```
  paths maps the paths of files the backend has omitted from the wheel
  to their absolute path on disk.
build_wheel_for_editable does not make reference to scheme paths. These are the
responsibility of the frontend performing the installation.
build_wheel_for_editable takes two arguments: wheel_directory and config_settings.
These have the same meaning as they do in build_wheel.
get_requires_for_build_editable is not implemented.
The build requirements are the same as for build_wheel and frontends
must call get_requires_for_build_wheel prior to calling build_wheel_for_editable.
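
As a rough illustration of the editable.json schema above, the following sketch parses such a file and checks that "paths" maps strings to strings. The function name `load_editable_paths` and the example path are assumptions made for illustration; the check is also slightly stricter than the schema, since it insists that the top level is an object.

```python
import json


def load_editable_paths(text):
    """Parse editable.json content and return its "paths" mapping."""
    data = json.loads(text)
    if not isinstance(data, dict) or "paths" not in data:
        raise ValueError('editable.json must be an object with a "paths" key')
    paths = data["paths"]
    # JSON object keys are always strings, so only the values need checking.
    if not isinstance(paths, dict) or not all(
        isinstance(value, str) for value in paths.values()
    ):
        raise ValueError('"paths" must map strings to strings')
    return paths


# Hypothetical content: one omitted package mapped to its location on disk.
example = '{"paths": {"mylib": "/absolute/path/to/src/mylib"}}'
paths = load_editable_paths(example)
```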

This all gives us, in what you can experiment with today:

from pathlib import Path
import json
import sys
import sysconfig

import flit_core.buildapi
import frontend_editables
from installer import install as install_wheel
from installer.destinations import SchemeDictionaryDestination
from installer.sources import WheelFile


## BUILD AN EDITABLE WHEEL ##

editable_directory = Path() / "editable"
editable_directory.mkdir(exist_ok=True)
editable_wheel = flit_core.buildapi.build_wheel_for_editable(str(editable_directory))


## INSTALL THE WHEEL ##

destination = SchemeDictionaryDestination(
    sysconfig.get_paths(),
    interpreter=sys.executable,
    script_kind="posix",
)
with WheelFile.open(editable_directory / editable_wheel) as wheel:
    dist_info_dir = wheel.dist_info_dir
    install_wheel(source=wheel, destination=destination, additional_metadata={})


## INSTALL THE EDITABLE FILES AND UPDATE THE RECORD ##

# installer does not return the installation location, let's assume it's
# "purelib" for now.
root = Path(sysconfig.get_path("purelib"))
frontend_editables.install(
    root,
    json.loads((root / dist_info_dir / "editable.json").read_bytes()),
    frontend_editables.EditableStrategy.lax,
    append_to_record=root / dist_info_dir / "RECORD",
)

I’ve not addressed PEP 610, but assume that the frontend has to create a direct_url.json with a file URL and "dir_info": {"editable": true} at the end of all this.
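
A minimal sketch of what that last step might produce, assuming the project lives in a local directory. The helper name `make_direct_url` is hypothetical; the "url" and "dir_info" keys follow PEP 610.

```python
import json
from pathlib import Path


def make_direct_url(source_dir):
    """Build direct_url.json content for an editable local directory."""
    return json.dumps(
        {
            # as_uri() needs an absolute path, hence resolve().
            "url": Path(source_dir).resolve().as_uri(),
            "dir_info": {"editable": True},
        }
    )


direct_url = json.loads(make_direct_url("."))
```

The frontend would write this string to direct_url.json inside the installed dist-info directory and add the file to RECORD.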

bernatgabor · June 10, 2021, 11:54am

I’d strongly disagree with this approach (and my implementation also differs in this sense). The backend is not generating a JSON. It returns the content. The frontend might decide to use a JSON file to communicate with the backend, but there’s no reason to mandate that file, it’s fine to use any type of inter-process communication technique. This is in line with how get_requires_for_x works. Similarly, there’s no need for build_wheel_for_editable to take the wheel directory argument.

I again strongly disagree. To achieve an editable mode the backend might take additional dependencies, and as such we should not conflate wheel dependencies with editable dependencies. The backend can alias those to the same if it wishes, but should be allowed to differ.

layday · June 10, 2021, 12:12pm

My implementation rests on producing a PEP 660-like wheel for simplicity of installation and interoperability with existing tools. This is mainly to assess the viability of frontend_editables - not that of the PEP as a whole. The backend could return something like a two-tuple of (wheel_filename, extra_paths) rather than create an editable.json in the wheel; I don’t think it matters too much, but it means that the editable installation has to occur in the same execution cycle. Any kind of IPC with the backend other than to request an editable wheel has been omitted on purpose. What this comes down to is our previous disagreement over whether the backend should be able to influence the editable installation. The PEP can go in a different direction - please don’t take my implementation to be normative.

layday · June 10, 2021, 9:18pm

One complication that became apparent when I tried to interface with @pf_moore’s editables is that the frontend editable library might require additional dependencies to be installed in the target environment, at which point it stops being a simple post-installer kind of thing that the frontend can call and forget. Of course, the editable library could bundle its dependencies and e.g. make a copy on install, but it’s a limitation to consider.

pf_moore · June 10, 2021, 9:31pm

editables simply needs to be declared as a dependency of the editable wheel in a PEP 660 world, and the front end’s normal dependency resolution mechanisms will handle it. But with the virtual wheel approach, it seems like any additional dependencies in support of the mechanism being used will have to be installed by a separate dedicated mechanism (as the installation isn’t being done via the standard “install a wheel” route).

FWIW, I have no plans to bundle dependencies in editables, or make the runtime parts available as anything other than a standard wheel.

layday · June 10, 2021, 10:05pm

To clarify, I meant that my library could bundle editables, not the other way around. Sorry for the confusion.

bernatgabor · June 11, 2021, 11:07am

Incorporated some of the feedback in the pull request “TBD: Editable installs” (python/peps #1977); let me know if I might have missed some.

layday · June 11, 2021, 12:20pm

Thanks. I appreciate all the effort that’s gone into the PEP though I still feel like it tries to do too much in a way that’s a bit vague.

I am concerned with the complexity of installation of the virtual wheel. For this PEP, a parallel installation process will have to be developed. In my mind, it’d make more sense for PEP 662 to piggyback on the wheel standard to perform the initial installation and operate as a post-install kind of hook for the editable part of the installation (i.e. as demo’ed in frontend-editables). I would also like to see the scheme paths abolished and the editable installation restricted to purelib and platlib; the library location would be derived from the wheel as normal. In general, I would prefer that the PEP would focus on providing a seamless experience for your typical editable installation rather than try to make everything under the installation prefix “editable”; but my reservations might come from my own inexperience in this area and what might seem complex to me might actually be very simple in practice.

I look forward to the pip PoC.

PS. I’ve added support for Paul’s editables mechanism in frontend-editables, completing the “editable trifecta”.

bernatgabor · June 11, 2021, 1:11pm

Not really. pip already contains logic to install wheels. The only change needed is to read the distribution information not from a wheel file but instead straight from a dist-info. In practice, this results in only needing to skip the extract/read from the zip phase of the wheel. All other mechanisms can and should be reused.

In my mind, this is an implementation detail for the frontend and there’s no need to mandate it. Frontends are free to zip up the dist-info returned by the backend to create a wheel and feed that to their wheel installer. Or they can alter their wheel installation logic to accept not just a wheel file (and look into the dist-info folder inside it) but also a dist-info folder directly. They’ll need to alter their wheel installation either way to support the scheme mapping, so adding support for reading from a dist-info folder is not that hard.

You’re free to create a competing PEP that does so; however, this PEP does not and will not take that angle. I want to offer the option for frontends to do more than just purelib/platlib. Basically, anything that’s possible to install via a wheel should also be possible via a virtual wheel, as detailed in:

We refer to this set of information as the virtual wheel. This virtual wheel
should contain all information a wheel contains, however it's not zipped and
its installation will not be done by copying the files.

This is so that we don’t have to create another PEP in 6 months to support data files, and then include files and scripts and so on.

I don’t think we as a community are in agreement as to what’s typical. Is it typical to support Python files, or inline C extensions, or data files? Is it typical to auto-discover new files for the project? This PEP aims not to make that decision, and to leave it up to the frontend to decide how much it wants to support and how it achieves that. Granted, this might mean that different frontends on different platforms support a different subset of editables, or that frontends offer different variations of a typical editable installation and let users choose based on their needs and the cons they’re willing to live with. However, I think that’s fine. Editable is only meant to be used by the developers of the project, so its target user group is smaller. And I’ve added to the PEP that a frontend that cannot satisfy the requirements of the backend should raise an error to the user and explain why not, at which point the user might alter their project code or choose a different frontend.

layday · June 11, 2021, 1:38pm

No part of wheel installs can be reused; there is no structure which is returned by the backend which resembles a wheel. In any case, there’s very little for the frontend to install with PEP 662, if we assume that the editable installation is delegated to a helper library. That library will then have to replicate the entire installation process of a wheel, from a different source. This is obviously a not insignificant undertaking, and the output of the helper library might differ in subtle but significant ways from that of, say, pip.

How to go about installing a distribution is not an implementation detail.

I think we are all in perfect agreement that exposing data files and scripts as editable is novel. We’re also in agreement that it’s infeasible if done statically, unless through symlinking, which we are also in agreement is a platform-dependent solution.

bernatgabor · June 11, 2021, 2:01pm

I think you’re mistaken. They might be novel for setuptools, but with flit you can achieve this as it symlinks the root folder. Granted, it’s platform-dependent, but that’s an existing way that IMHO should still be possible post standardization. Also, even though they’re novel, I think they’re valid use cases that we should support, given we’re trying to introduce a new standard.

I strongly believe it’s not for us to determine. There are many ways to do it. The frontend through the end user should choose what’s best.

A wheel is made up of a .dist-info folder and a .data section + root; see PEP 427 (The Wheel Binary Package Format 1.0). In the case of a virtual wheel, this maps to the metadata_for_build_editable and the schemas key. The .dist-info part can be handled the same way for a normal and a virtual wheel, and as such can be reused. The .data + root part needs to move from a copy to something else (pth, import hook, symlinks, etc). So you can reuse the .dist-info part of the wheel install, and then only handle the .data + root part separately.

dholth · June 11, 2021, 3:32pm

Notionally the wheel spec is made up of the layers “pack”, “unpack”, and “spread”, but the installer is not supposed to take that literally. pip did actually unzip the wheel into a separate temporary directory for a long time before copying it into place. The abstract model gives each file a category + a path. In the model, the .data/purelib directory is the same as the root directory if root-is-purelib, for example.

When implementing an installer you should probably skip unpacking the zip to a temporary directory. Instead, check whether each path in the zip matches any expected prefix inside the .data directory (“does it start with package-1.0.data/purelib?”). If it matches none of these, it is in the root category. Then replace the prefix with the category’s installed path, e.g. site-packages, and extract the file directly to the target location.
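
The prefix check described above can be sketched roughly as follows. The function names and the `scheme` dictionary are illustrative assumptions, not part of any installer's API, and the .dist-info directory is simply treated as part of the root here.

```python
import posixpath


def classify(archive_path, data_dir, root_is_purelib=True):
    """Return (category, path relative to that category) for a wheel entry."""
    prefix = data_dir + "/"
    if archive_path.startswith(prefix):
        # e.g. "package-1.0.data/scripts/tool" -> ("scripts", "tool")
        category, _, relative = archive_path[len(prefix):].partition("/")
        return category, relative
    # Anything outside the .data directory belongs to the root category.
    return ("purelib" if root_is_purelib else "platlib"), archive_path


def target_path(archive_path, data_dir, scheme, root_is_purelib=True):
    """Swap the category prefix for that category's installed location."""
    category, relative = classify(archive_path, data_dir, root_is_purelib)
    return posixpath.join(scheme[category], relative)


# Hypothetical install locations for the example.
scheme = {"purelib": "/venv/lib/site-packages", "scripts": "/venv/bin"}
```

With this, each zip entry can be extracted straight to `target_path(...)` without an intermediate temporary directory.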

(Some people appreciate that wheel works this way instead of a more straightforward PREFIX/path for all files like the older bdist_wininst format. Most files will be under the empty ROOT prefix. This also makes the archive smaller.)

(Is the .dist-info directory in its own category, or is it in PURELIB or PLATLIB depending on the metadata?)

If we put relative symlinks in this model and they linked to other files in the wheel, then you would need to adjust the targets of symlinks between categories. A symlink between purelib and scripts would need to be rewritten in a similar way to how we determine the installed locations of each file in the wheel.
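
As a sketch of that rewriting step, assuming the installer already knows where both the link and its target end up after the "spread": the function name and the example paths are hypothetical.

```python
import posixpath


def rewrite_relative_link(installed_link, installed_target):
    """Recompute a relative symlink target from the two installed locations."""
    return posixpath.relpath(installed_target, posixpath.dirname(installed_link))


# A link in the scripts category pointing at a file in purelib.
new_target = rewrite_relative_link(
    "/venv/bin/tool",
    "/venv/lib/site-packages/pkg/tool.py",
)
```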

The more tricky part would be deciding whether it was a security problem to let the wheel link to any file on the filesystem. Of course you’d allow absolute symlinks to anywhere in a virtual wheel for editable since the point would be to symlink to your source directory.

Of course figuring out what to do with the symlink on Windows would be a big concern. For the Linux case where we currently copy shared libraries to give them a .1 and a .1.0 suffix the installer could correctly create a copy or a hard link instead of a symlink - but you wouldn’t be installing that kind of wheel on Windows. Windows users would worry about receiving needlessly incompatible code just because it happened to contain a link.

steve.dower · June 11, 2021, 9:56pm

The earlier point here (to me, at least) answers the later point. Symlinks would have to be metadata that are created by the installer as part of the “spread”, as normal unpacking is not going to allow for remapping the target.

Given that, we can have symlink_or_copy, symlink_or_fail and symlink_or_ignore metadata, which forces package developers to specify the fallback behaviour (and hopefully reconsider whether they need a symlink at all, now that they know they can’t rely on them).
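
One way an installer might act on such metadata is sketched below. Only the three mode names come from the post; the enum, the `place_link` helper, and the file names are assumptions, and the copy fallback handles regular files only.

```python
import os
import shutil
import tempfile
from enum import Enum


class SymlinkFallback(Enum):
    COPY = "symlink_or_copy"
    FAIL = "symlink_or_fail"
    IGNORE = "symlink_or_ignore"


def place_link(source, destination, fallback):
    """Try to symlink destination -> source; apply the fallback on failure."""
    try:
        os.symlink(source, destination)
        return "symlink"
    except (OSError, NotImplementedError):
        if fallback is SymlinkFallback.COPY:
            shutil.copy2(source, destination)  # regular files only in this sketch
            return "copy"
        if fallback is SymlinkFallback.IGNORE:
            return "ignored"
        raise  # symlink_or_fail: surface the error to the user


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "lib.so.1.0")
    with open(source, "w") as f:
        f.write("payload")
    link = os.path.join(tmp, "lib.so.1")
    outcome = place_link(source, link, SymlinkFallback.COPY)
    with open(link) as f:
        content = f.read()
```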

Assuming that the install is going to copy all the listed files into their usual location, sure, but that’s not the point here. The point is for the frontend to figure out how to expose these files in their current locations (or more precisely, how to ensure that changes to the original files are automatically reflected on next launch) - a straight copy would be the same as a normal, non-editable install.

The other proposal is the one where the frontend can just do a regular install, whether the wheel needs extracting or not, because the backend has decided how to handle the editable side of things.

layday · June 12, 2021, 9:41am

Can we retitle the thread and provide a link to the PEP at the top now that it’s been merged?

I’m not mistaken, flit doesn’t symlink data files or scripts. It symlinks the one package folder in site packages.

I’m not sure what to say to this. Some operations will be similar - that doesn’t mean you’re able to use the wheel infrastructure to perform them. There’s no wheel to extract files from, the structure is not that of a wheel and wheel-specific metadata don’t exist or are ignored. Even if cp routines existed in pip, you’d not be able to use them because you’re not copying files. Then there’s console scripts which are covered by a different PEP, and which you’ve said you’d like to provide a wrapper for; and possibly other things that I’m overlooking. You are going to have to devise a new installation process from the bottom up.

Returning to an earlier point, it’s still the case that it is difficult for the frontend to make a determination on the strictness of installation because the PEP does not require that the backend maps files to be installed and not folders. Does the frontend have to check each entry that it isn’t a folder, and if it is, refuse to perform a stricter/narrower installation? Or is it that if there’s a folder the frontend assumes that all of its contents as they appear on disk would’ve been included in the wheel? If it’s the latter, why not just have the backend find every file in the folder recursively and pass them on to the frontend? This was a concern for several people and the PEP should address it in some form; though I imagine it is difficult to do so with the amount of freedom that’s afforded to the backend.
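
For the last option, expanding folders into individual files is a small amount of code on either side. A minimal sketch, assuming a mapping of installed relative paths to source paths; the function name and the temporary demo project are hypothetical.

```python
import tempfile
from pathlib import Path


def expand_folders(mapping):
    """Replace every folder entry with one entry per file found inside it."""
    expanded = {}
    for target, source in mapping.items():
        source_path = Path(source)
        if source_path.is_dir():
            files = sorted(p for p in source_path.rglob("*") if p.is_file())
            for file in files:
                relative = file.relative_to(source_path)
                expanded[(Path(target) / relative).as_posix()] = str(file)
        else:
            expanded[target] = source
    return expanded


with tempfile.TemporaryDirectory() as tmp:
    package = Path(tmp) / "mylib"
    package.mkdir()
    (package / "__init__.py").write_text("")
    demo = expand_folders({"mylib": str(package)})
```

Note that this snapshots the folder at install time, so files added later would not be picked up, which is exactly the strict versus lax trade-off under discussion.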

dholth · June 16, 2021, 10:44pm

One way to reduce the freedom in this idea might be to include a mapping between categories and source directories, and then only relative paths to each file. This would preserve the spirit of limited rearrangement encoded in wheel.

platlib: { base: /absolute/path/to/src,
files: [ 'mylib/', 'mylib/__init__.py', 'mylib/extension.so' ] } …
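
A frontend could expand such a structure into a flat target-to-source mapping along these lines. This is only a sketch: the dictionary layout mirrors the example above, while the function name and the install location are assumptions.

```python
from pathlib import PurePosixPath


def expand_categories(mapping, scheme):
    """Map each installed path to its source, keeping relative paths intact."""
    result = {}
    for category, entry in mapping.items():
        base = PurePosixPath(entry["base"])
        install_root = PurePosixPath(scheme[category])
        for relative in entry["files"]:
            # The same relative path is used on both sides, so files cannot
            # be rearranged within a category.
            result[str(install_root / relative)] = str(base / relative)
    return result


flat = expand_categories(
    {"platlib": {"base": "/absolute/path/to/src", "files": ["mylib/__init__.py"]}},
    {"platlib": "/venv/lib/site-packages"},  # hypothetical install location
)
```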

layday · June 16, 2021, 11:13pm

By freedom I mean that the backend’s able to return a subset of the paths that’d go in the wheel, that it can mix files and folders, and that it can perform its own editable pre…installation and return a mapping which does not include any of the actual source files. Being asked to map a file or folder to a different name is not that big of a deal. It introduces very little complexity in comparison.

dholth · June 16, 2021, 11:26pm

I was concerned that it would be really easy to produce a mapping that could not be achieved with a simple .pth file, and that you would be reduced to creating a tree of symlinks or a fancy import hook to produce the desired shape. So you might always have to use the “difficult” implementation option.
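
To make the concern concrete, here is a rough check for whether a mapping keeps the on-disk layout, which is the minimum needed for a plain .pth file to work. The function name is hypothetical, and the check covers layout only: a .pth entry exposes everything in the directory, including files absent from the mapping.

```python
from pathlib import PurePosixPath


def pth_directories(mapping):
    """Return the directories a .pth file would need, or None if impossible.

    ``mapping`` maps installed relative paths to absolute source paths.
    """
    directories = set()
    for installed, source in mapping.items():
        installed_parts = PurePosixPath(installed).parts
        source_parts = PurePosixPath(source).parts
        count = len(installed_parts)
        if count == 0 or count >= len(source_parts):
            return None
        if source_parts[-count:] != installed_parts:
            # The file is renamed or moved, so sys.path alone cannot express it.
            return None
        directories.add(str(PurePosixPath(*source_parts[:-count])))
    return directories
```

A result of `None` means the frontend would have to fall back to symlinks or an import hook for that mapping.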