Third try on editable installs

@bernatgabor It’s as @uranusjr mentioned. Here are the notes on the future of editable installs from the Packaging Summit at PyCon North America 2019. They were previously in this Google Doc section and I moved them to that packaging-problems issue comment.

There were also discussions in Pip 19.1 and installing in editable mode with pyproject.toml (mostly before the summit) and in Specification of editable installation - #40 by pganssle – that is a direct link to a comment where @pganssle indicates why he thinks there ought to be a proof of concept before we work to standardize stuff with a PEP.

Am also pinging @techalchemy since he was planning to be part of the effort as well, and @takluyver since Daniel mentioned Flit.

Currently @uranusjr @pradyunsg and @pf_moore are committed to spending a fair amount of time on the pip resolver project, which I am the project manager on. Tzu-Ping and Paul will be done with that commitment in probably June, but Pradyun will still have some committed time on this project till the end of December (not going to go into complicated “we have 2 funders and different timelines for different parts of the project” details now). So that’s going to affect their ability to volunteer to work on editable installs. If there’s a specific deadline someone is trying to meet to ship better editable installs, then it would be good to know that.

If anyone knows a company who would like to sponsor this work, please speak up and talk with them (see the entry about editable installs on the Fundable Packaging Projects page).

And if anyone would like to help write some grant proposals to get funding through the Packaging Working Group for a proper push on this work, I’d love that! Message me - I can help you get started. In about two months, for instance, the Chan Zuckerberg Initiative’s next Essential Open Source Software for Science funding opens for applications. And applications for Mozilla Open Source Support funding can be submitted at any time.

Here is the current specification that I am suggesting we build a PoC for: Specification of editable installation

Presumably with Bernat’s additional suggestion of adding a version number.

It’s pretty dead-simple, and I don’t know that it’s worth going into much more detail until we know what’s feasible and what problems could come up, but I think it’s the minimal specification that meets the requirements of:

  1. Back-ends don’t need to know anything about how to install files.
  2. Front-ends handle entry point generation and installation.
  3. Front-ends have the option of “installing” only the files that will be visible in a non-editable install.

There will certainly need to be more details fleshed out, but I still think that the next step is a proof of concept in pip and setuptools, since those are the most widely used tools and the tools most likely to run into backwards compatibility problems that will need to be addressed in the PEP.

Thanks @pganssle. I have a somewhat-theoretical[1] interest in this subject, and might be willing to work on a pip POC. However, as @sumanah pointed out, I am committed to other work for a while yet, so it would have to wait until that’s done.

However, from my experience implementing PEP 517, I can say with some certainty that work like this goes far better if there’s a good backend implementation available before starting work on a frontend. So I think the initial priority would be a proof of concept in either setuptools, or another backend that’s simple enough to make adding tests into pip relatively straightforward.

[1] i.e., I don’t use editables much, but I find the mechanisms for implementing them interesting in the abstract.

The older thread concludes that it is not convenient to implement the new feature #3, “Front-ends have the option of ‘installing’ only the files that will be visible in a non-editable install.” Features #1 and #2 are less likely to restrict how build systems work; they just move some of what setup.py develop already does into pip.

I should be able to produce my own POC.

When I did take a crack at it before, the major issue was that distutils (and by extension setuptools) doesn’t have a clean separation between “figure out all the stuff that needs to go into the package” and “put the stuff into the package”, it just sort of assembles the package as it goes.

Do you have a link to this?

Can you elaborate on these?

Earlier in this thread

  1. Back-ends don’t need to know anything about how to install files.
  2. Front-ends handle entry point generation and installation.
  3. Front-ends have the option of “installing” only the files that will be visible in a non-editable install.

I meant why you think those statements apply to those points :smile:

I mean, it requires a refactoring, but one that I think is important anyway, because it would allow for the deprecation and removal of setup.py install. AFAIK the way wheel works is that it will just run setup.py install into a workspace location and then zip up the results.

If we refactor this now into a “build package manifest” step and an “install package from manifest” step, then both bdist_wheel and develop can invoke the “build package manifest” step.
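
To make the proposed split concrete, here is a rough sketch of the two steps. The function names (`build_manifest`, `install_from_manifest`) and the manifest shape are hypothetical and only illustrate the idea; this is not how distutils or setuptools is organised today.

```python
import shutil
from pathlib import Path
from typing import Dict, List


def build_manifest(src_root: Path, packages: List[str]) -> Dict[str, Path]:
    """Step 1: decide what goes into the package, without copying anything.

    Returns a mapping of installed (relative, POSIX-style) path -> source file.
    """
    manifest = {}
    for package in packages:
        for source in sorted((src_root / package).rglob("*.py")):
            manifest[source.relative_to(src_root).as_posix()] = source
    return manifest


def install_from_manifest(manifest: Dict[str, Path], target: Path) -> None:
    """Step 2: put the listed files into place.

    A wheel builder could zip the same entries, and an editable installer
    could link them, instead of copying as done here.
    """
    for installed_name, source in manifest.items():
        destination = target / installed_name
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, destination)
```

With this shape, anything that only needs to know which files belong to the package can stop after the first step.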

In any case, I think in this situation it’s more important to get the hook right than it is to get it right now. Paul Moore and I are both busy working on higher priority things right now, but if you look in the thread, I’ve been suggesting a timeline that involves me trying to take a crack at it after the Python 3.9 feature freeze for months now.

You mean a PoC for setuptools or a PoC for something else? A setuptools PoC would, I think, be very useful. Creating a PoC for something like flit would be pretty easy and would make a pip PoC easier to do and so might be worth it, but I think we definitely do need a setuptools PoC before we can move forward here.

I’ll try to do a setuptools + pip prototype of a “simple” proposal in the next week.

What does “a ‘simple’ proposal” mean in this case? The one outlined in my link, or one that does not meet all the requirements I mentioned?

You are welcome to do what you want, but I don’t want you to feel that I’ve misled you into thinking that I’m suggesting that all we need is a proof of concept for any proposal. I will definitely not be in favor of any proposal that does not include a mechanism for front-ends to include only the files that would be installed in a non-editable install (e.g. one where only folders are listed or something of that nature).

Of course I will implement one of my own proposals. This would solve the problem of “no hook for an editable install” without addressing the “editable installs aren’t 100% accurate” problem and without requiring a distutils refactor. Setuptools would be able to stop putting egg-links in site-packages because pip would take care of doing an equivalent job.

Flit’s implementation suggests that both kinds of editable installs can be useful. I would be happy to see a second or an extended hook for the “only the files that would be installed” feature as an option. I don’t think enscons or distutils will easily yield that feature. I understand if you feel that implementing the simpler option would kill the enhanced option.

From my POV: a .pth file is way more predictable and easier to clean up without destroying your source directory (plus you don’t suffer from arbitrary code execution if you don’t put any arbitrary code in there! :grin: I generally do editable installs manually with a .pth file and/or special modules with __path__ overrides)

And the way to detect whether symlinks will work on Windows is… call os.symlink and handle the OSError. Hardly worth a library. (Virtualenv is more complex because launching an executable via a symlink isn’t always the same as launching a copy would be, and that’s their primary need. For this, it ought to be fine, if a little more risky when it comes to deletion.)
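
A minimal sketch of that probe, assuming a scratch directory we are allowed to write to. The function name `symlinks_supported` is made up for illustration.

```python
import os


def symlinks_supported(directory: str) -> bool:
    """Return True if a symlink can be created inside `directory`."""
    target = os.path.join(directory, "symlink-probe-target")
    link = os.path.join(directory, "symlink-probe-link")
    with open(target, "w"):
        pass  # create an empty file to point the link at
    try:
        os.symlink(target, link)
    except OSError:
        # e.g. Windows without the privilege needed to create symlinks
        return False
    else:
        os.remove(link)
        return True
    finally:
        os.remove(target)  # always clean up the probe file
```

It only answers “can I create a link here, right now”, which is all an editable install would need to know before choosing between links and some other strategy.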

Prototype-quality code:

In setuptools we add a develop --no-install command that copies the existing develop command without the install bits. It returns { "src_root" : "." } (a relative path from pyproject.toml to where the .pth file should point. An absolute path would also work.)

Without the extra reformatting: https://github.com/pypa/setuptools/compare/master...dholth:redevelop?expand=1

In pip we go ahead and literally re-use the “install unpacked wheel” function. It already called the build backend’s “generate .dist-info metadata” hook. We call the new hook and put a .pth file.

The vendored pep517 is modified to add the new hook; it takes care of turning the relative path from the build backend into an absolute path.
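
Roughly, the front-end side of that flow might look like the sketch below. The helper names (`resolve_src_root`, `write_pth`) and the way the `.pth` file is named are assumptions for illustration; this is not the actual pip or pep517 code from the prototype.

```python
import os


def resolve_src_root(hook_result: dict, project_dir: str) -> str:
    """Turn the backend's (possibly relative) src_root into an absolute path.

    os.path.join leaves an already-absolute src_root untouched, so both
    forms mentioned above are accepted.
    """
    return os.path.abspath(os.path.join(project_dir, hook_result["src_root"]))


def write_pth(site_packages: str, dist_name: str, src_root: str) -> str:
    """Write a one-line .pth file pointing at src_root.

    At startup the site module appends each existing directory listed in a
    .pth file found in site-packages to sys.path.
    """
    pth_path = os.path.join(site_packages, dist_name + ".pth")
    with open(pth_path, "w") as f:
        f.write(src_root + "\n")
    return pth_path
```

For example, `write_pth(site_packages, "example", resolve_src_root({"src_root": "."}, project_dir))` would leave an `example.pth` in site-packages containing the absolute path of the project directory.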

On a second scan, this kinda does what we agreed on, with the difference that it only allows injecting a single folder as-is into the interpreter. There’s no way to filter, merge, or build on the fly the files that get exposed. Before we agree on this being a standard, I’d like to address allowing those.

Yeah, this seems like a good start. The obvious example of something that needs to inject more than one folder is setuptools itself (which installs both setuptools and pkg_resources).

I have a use case where, for reasons, the source layout cannot match the installed layout. Typically in the source I have src/myplugin, and in target I want it to be installed in a specific namespace package, such as mytool/plugins/myplugin. A custom build backend can do that at build time, and it would be nice to have such capability for editable installs too.
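
One way such a remapping could work for an editable install is a small generated tree that links the source folder in at its installed location. The sketch below assumes symlinks are available and that `mytool` and `mytool.plugins` are namespace packages without `__init__.py`; the function name `link_into_namespace` is hypothetical.

```python
import os


def link_into_namespace(source_pkg: str, tree_root: str, dotted_name: str) -> str:
    """Expose the directory `source_pkg` as `dotted_name` under `tree_root`.

    Intermediate directories are created without __init__.py, so they act
    as namespace package portions.
    """
    parts = dotted_name.split(".")
    parent = os.path.join(tree_root, *parts[:-1])
    os.makedirs(parent, exist_ok=True)
    link = os.path.join(parent, parts[-1])
    # Link the whole package directory, so edits and new files show up at once.
    os.symlink(os.path.abspath(source_pkg), link, target_is_directory=True)
    return link
```

Calling `link_into_namespace("src/myplugin", "build/editable", "mytool.plugins.myplugin")` and then putting `build/editable` on sys.path (for instance through a .pth file) would make `import mytool.plugins.myplugin` resolve to the checkout.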

Here’s the commit that added the develop command to setuptools back in July 2005. https://github.com/pypa/setuptools/commit/e5eac13db392f851f15e014a1c20debb22da89b2 . It works about the same today but it also supports 2to3, a compiler that tries to convert Python 2 to Python 3. If you are using 2to3 it would point the .pth file at a build directory and you would have to rebuild to see your changes. pip install -e works by calling setup.py develop.

The prototype hook does the same thing as setuptools’ own develop command and is built with the old develop code. The difference is that pip creates the .pth file and copies the metadata into site-packages (instead of linking to metadata in the source checkout) and there is no .egg-link file. This would be less convenient if you had develop-installed your source into more than one environment and needed the metadata to update.

If this hook were used in setuptools, it would return the parent directory of setuptools and pkg_resources. Both would be available after an editable install. Take a look in site-packages at the current develop-installed setuptools.egg-link and setuptools.pth, or those of any other develop-installed package, to see where it would point. Note that it’s a file that adds a directory to sys.path, not an OS symlink.

I’ve probably been using the setup.py develop feature since 2009. I can keep the one or two packages I’m actually developing in a checkout and update them that way. Including in production. Those packages may never be installed non-editable. We just want the dependencies to be installed and for our own package to be on sys.path. This is the same way that a Django project may be used without setup.py, with the difference that we have one.

If you use the develop command or pip install -e to prepare packages for PyPI, you can make the mistake of forgetting to test the installed package. You might depend on files in the root of your checkout, like setup.py (one reason a src/ directory is recommended), or on other files that might be left out of the install or distribution. I’ve made a couple of broken releases this way, making a second release to fix the problem.

I think this is the motivation for wanting a tree-of-symlinks feature as an option for a new editable install feature. You would still occasionally make mistakes related to the difference between an editable and regular install, but you would be more likely to catch problems with setup.py’s error-prone MANIFEST.in.

This is a different feature from the “add a .pth file” strategy that develop has used. The new feature could be useful if you intend to distribute your package and if you might forget to test the fully installed package. It is not needed if you just want your checkout to be on sys.path.

The 2to3 support in the current version of the develop command offers a hint. The prototype hook doesn’t have to return the root of the checkout or ./src and pip does not know whether it did. pip could pass a prefer_inplace=False flag to suggest the build system produce a tree of files or symlinks, say, in a ./build/ directory, that gets added to sys.path.

I’m happy to try to make a PoC in flit - this would be about the simplest case, so it’s a good starting point to look at the question. But I’m still a bit hazy as to what exactly I should start implementing.

The proposal from last year appears to list every file in the package individually, which seems to be an extra complication for all the practical ways I can think of to implement editable install, where we arrange to add an entire package to sys.path.

As a concrete example, I expect that if I add a new submodule in my package, that’s available in an editable install without needing to re-run an install step. That breaks if each file is individually symlinked into a new directory.
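
The difference can be shown with a small experiment. This is only a sketch under the assumption that symlinks are available; `build_symlink_tree` and `can_find` are names invented for the illustration, not part of any proposal.

```python
import importlib
import os
from importlib.machinery import PathFinder


def build_symlink_tree(package_dir: str, tree_root: str) -> None:
    """Mirror `package_dir` under `tree_root`, symlinking each file individually."""
    package_dir = os.path.abspath(package_dir)
    base = os.path.dirname(package_dir)
    for dirpath, _dirnames, filenames in os.walk(package_dir):
        mirrored = os.path.join(tree_root, os.path.relpath(dirpath, base))
        os.makedirs(mirrored, exist_ok=True)
        for name in filenames:
            os.symlink(os.path.join(dirpath, name), os.path.join(mirrored, name))


def can_find(package: str, submodule: str, search_path: str) -> bool:
    """Check whether package.submodule is importable when only search_path is searched."""
    importlib.invalidate_caches()  # make sure newly created files are noticed
    pkg_spec = PathFinder.find_spec(package, [search_path])
    if pkg_spec is None or pkg_spec.submodule_search_locations is None:
        return False
    locations = list(pkg_spec.submodule_search_locations)
    return PathFinder.find_spec(package + "." + submodule, locations) is not None
```

If a file `new_mod.py` is added to the checkout after `build_symlink_tree` has run, `can_find` reports it through the source directory but not through the linked tree, until the tree is rebuilt.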

Please enumerate when traditional setup.py develop falls short. This has been missing from the discussion. Are there some packages with a very strange layout that we should use as an example? When would using a src/ layout to separate setup.py / tests from the main files not be enough?

IMO the next relatively-convenient layer of complexity would be to (optionally?) include the equivalent of package_dir = { package_name : folder } and py_modules = [ … ] which is the one you always forget when you have a bare python file at the root. https://docs.python.org/3/distutils/setupscript.html#listing-whole-packages

How would the develop installer handle namespace packages? If there is only one ‘flit’ you could add a symlink to site-packages. But if someone comes along with a ‘flit.flap’ you would want to avoid having to revise the installation e.g. by putting a real ‘flit’ directory in site-packages and symlinking everything inside both develop-installed ‘flit’ directories to that directory. I would handle that by building a sensible per-develop-install tree somewhere and then putting a .pth file in site-packages.

If you want a real life project to try this on (which you probably don’t, at least not this one, but it is a real thing), the Azure SDK for Python and the Azure CLI projects use namespace packages extensively. And development is a pain as a result (you’ll see a mix of approaches due to people “fixing” problems and later realising their fix made it worse).

Built-in namespace packages should work fine with just sys.path additions, except for the times when they break completely. But that’s just how those things work - you (and everyone else) just have to be careful.
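
As a small illustration of the “just sys.path additions” case, the sketch below creates two separate checkouts that each contribute one module to the same namespace package and imports both. The names (`demo_ns`, `checkout_a`, `checkout_b`, `make_portion`, `demo`) are made up for the example.

```python
import importlib
import os
import sys
import tempfile


def make_portion(root: str, namespace: str, module: str, body: str) -> str:
    """Create root/<namespace>/<module>.py with no __init__.py next to it."""
    ns_dir = os.path.join(root, namespace)
    os.makedirs(ns_dir, exist_ok=True)
    with open(os.path.join(ns_dir, module + ".py"), "w") as f:
        f.write(body)
    return root


def demo() -> list:
    """Import modules from two portions of one namespace package."""
    with tempfile.TemporaryDirectory() as tmp:
        first = make_portion(os.path.join(tmp, "checkout_a"), "demo_ns", "alpha", "VALUE = 'a'\n")
        second = make_portion(os.path.join(tmp, "checkout_b"), "demo_ns", "beta", "VALUE = 'b'\n")
        sys.path[:0] = [first, second]  # what two .pth files would do
        try:
            importlib.invalidate_caches()
            alpha = importlib.import_module("demo_ns.alpha")
            beta = importlib.import_module("demo_ns.beta")
            return [alpha.VALUE, beta.VALUE]
        finally:
            sys.path.remove(first)
            sys.path.remove(second)
```

One of the ways this breaks: if either `demo_ns` directory gained an `__init__.py`, it would become a regular package and the other portion would no longer be found.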