PEP 735: Dependency Groups in pyproject.toml

jamestwebber · November 24, 2023, 7:15pm

Do you feel like this PEP would create a barrier to a proper solution in the future? I guess I’m wondering if you object to the proposal or to its application in that scenario.

It would seem like the two proposals could be compatible in a future implementation, but I’ll admit I didn’t read the linked issue very closely.

sirosen · November 24, 2023, 8:30pm

I’ve just finished a major chunk of revisions which reintroduce the object format for handling paths, and I’ve tried to be rather precise, including explicitly stating when behaviors are undefined or left to implementers’ discretion.
I still haven’t gotten to the appendices yet, although I did spec out a set of three with “TODO” for their content.

The current spec draft is probably easiest to quickly grasp by looking at the example config:

[dependency-groups]
test = ["pytest", "coverage", {path = ".", editable = true}]
docs = ["sphinx", "sphinx-rtd-theme"]
typing = ["mypy", "types-requests", {path = ".", extras = ["types"], only_deps = true}]
typing-test = [{include = "typing"}, {include = "test"}, "useful-types"]

[project.optional-dependencies]
types = ["typing-extensions"]

pf_moore · November 24, 2023, 8:38pm

I know that control of installing stuff in editable mode matters in some people’s workflows, but to me it feels very out of place here. It’s about how you install things, whereas what’s being specified here is what you install. I’d love it if someone could explain to me a scenario where it’s essential to be able to say “install <something> in editable mode” rather than just “install <something>”. And by “essential” I mean “it matters that everyone uses editable mode” not just “I personally always prefer editable mode here”.

I think the distinction between “what is needed for correctness” and “what I prefer for my personal workflow” is an important one, and standards should focus on the former, leaving the latter to be part of the UI of individual tools (as tools themselves are often a personal workflow choice).

sirosen · November 24, 2023, 9:28pm

I have a scenario I know of in which being able to specify editable is essential for efficiency, but not necessarily correctness. I think that counts.

Suppose I have a tox testing workflow which tests two related packages in tandem, from a monorepo.
Repo structure is

+ foo/
  - ...
  - pyproject.toml
  - tox.ini
+ bar/
  - ...
  - pyproject.toml

and foo depends on bar.

foo/tox.ini specifies some big matrix build, something like the equivalent of tox r -e py312,py311,py312-foo,py311-foo,py310,.... It’s runnable as tox r or tox p, and developers are doing that constantly as their testing workflow.

foo/pyproject.toml wants to specify (for this tox build to use/consume) that an editable install of bar should be used. That way, developers in this monorepo can iterate and run this matrix-style testsuite with the assurance that bar will only be installed once per tox environment.
({path = "../bar", editable = true})

Without using an editable install, every pull of the repo could make tox reinstall bar into the environments it uses for these matrix builds. Using editable for the dependency keeps things fast in the common case. Occasionally, bar’s metadata will change in a way which requires a fresh install, at which point tox may need to be instructed to recreate the environments, as in tox run --recreate.
(Aside, as of tox v4, a lot of cases for changes to package metadata forcing a rebuild are handled out of the box. I think there might still be some gaps? I’m not certain about when explicit recreation is necessary offhand.)

There’s a similar case for a library which wants an editable install of the current repo ({path = ".", editable = true}), but I think the monorepo case is what makes it most compelling that this should be a generic capability of Path Dependencies, rather than some tool-driven flag for running pip install -e . on the current directory.

mdrissi · November 24, 2023, 10:10pm

Currently my team works on a codebase that several other teams also commonly work on. Many of the people working on that codebase may only contribute occasionally. They want an easy onboarding experience for contributing.

To handle the dev environment, right now we have a small bash script. At its core the bash script is,

# check that a venv is activated
pip install -r requirements.txt
pip install -r local_requirements.txt
pre-commit install

The first requirements.txt is a lock file (a pip-compile one). The second, local requirements file installs each package in the repo (a monorepo) as editable. Not installing as editable is an easy footgun for a lot of the people I work with. It’s common for them to change some code in their own folder and expect that change to take effect.

A big part of having a standard setup script is to make contributing easier for people for whom Python is not their primary language, or who may just be less familiar with Python packaging. I’d be surprised if many of the people who work on my codebase, when asked, had a good sense of what editable even means. But it being there for them already means they can just edit files as they need and think of it as running a simple script. Similarly, the setup script forces a venv of some flavor, as I’ve seen the lack of one lead to confusion later too many times.

It is possible for someone to use my team’s codebase without running the setup script and instead create their dev environment as they like. In practice I’ve pretty much never seen this done across the several dozen contributors we have. The only feedback I recall asking to adjust the setup script came from a developer who preferred conda to venv environments, and I tweaked the environment check rule.

If context on the codebase itself helps, I mainly work on ML, and the codebase is divided into library/infra code and other teams’ application code. The other teams’ code defines their own models/features and is developed in a more experimental, data-science-like fashion, so making it easy for them to run the main top-level scripts and see their modeling code changes is valuable.

ofek · November 24, 2023, 11:05pm

Yeah what Mehdi mentions (monorepos) is the reason I keep saying that a new syntax to define editable installations is not enough and rather you need the concept of workspaces. The reason for this is that simply installing the target path as editable is insufficient because what you actually want is to track arbitrarily nested changes to dependencies and keep those in sync.

I would advise to not have the concept of editable here and if you really want that, then spec out workspaces first in a separate PEP.

Note that this is not just for large monorepos but for any number of local packages that are developed in tandem. For example, when I implement workspaces in Hatch that repo will use that feature immediately for dependence on Hatchling rather than the hacky/manual way I do that now.

brettcannon · November 24, 2023, 11:25pm

I would argue for Paul’s view. If your workflow is that specific, then it’s up to your development practices to standardize that and enforce it as Mehdi suggested. You already have other practices I’m sure which are not captured by metadata, and that should be okay. From my experience with VS Code, at some point you have to admit that trying to support every single potential workflow is never-ending and it can lead to a mediocre solution for everyone.

And just in case people don’t know, that’s probably workspaces like how Rust does it. And as Ofek mentions, our tricky bit is that we have to handle dependencies in a flat namespace unlike Rust, and that would need to be picked up appropriately rather than simply having files that get edited also show up when you run the interpreter.

EpicWink · November 25, 2023, 12:50am

I think it would be fine if the PEP suggested (but neither required nor recommended) that installers have a command-line flag or other configuration option which can install some or all path dependencies as editable.

This means those who don’t care about bringing their own workflow can just follow the command documented in the README to set up their environment, and that will install as editable.

ofek · November 25, 2023, 12:57am

I would be against such text for the current PEP in all (2) circumstances.

If by editable you mean simply pip install -e ... then I would also be against that because that is of course insufficient as Brett explained further. For example, say your dependent path adds a new dependency, you have now broken your setup.

If by editable you mean workspaces or some other word that has no precedent in the current ecosystem that implies recursively ensuring dependencies are met always, then I also would be against that because there is no such extant concept for Python.

sirosen · November 25, 2023, 1:42am

I’m not ready to drop editable right away. Poetry and PDM both support it in their dependency specifiers – perhaps because they also concern themselves with commands which manage virtualenvs.
I’m open to the idea of removing it, but don’t want to do so in haste. I’d like to make some more progress on the other missing pieces of this spec to give this particular item time to breathe.

It is true that editable installs make it possible, at least with currently available tools, to create a broken environment. When an editable-installed package has updates to its metadata, it often requires reinstallation. However, we also have some relatively new tools in Poetry and PDM which have both chosen to support this in package specifications, and tox allows it specifically for . by means of usedevelop=true.

This is a subject on which I believe that the Poetry and PDM devs’ input would be invaluable. I’m concerned that removing this capability will pose an issue for those projects, but perhaps they can handle control for editable installs via other means.

oscarbenjamin · November 26, 2023, 1:31pm

I’m not sure whether editable installs make sense in the context of this PEP but if not then I also am not sure that I understand how the tooling is supposed to handle editable installs. I use editable installs a lot but often when I use them I am intentionally creating an environment that is not really reproducible and that cannot be created using ordinary dependency specifiers and that might well be considered “broken” in other contexts.

Tools like poetry, nox, hatch etc work nicely in the common case when you are only working on a single downstream project and can treat all of your dependencies as just being upstream and immutable. Where I find that these tools break down is when I want to do something slightly more complicated like working on changes to two projects/codebases in parallel. At this point I find it is just much easier to break out from the environment managers and use pip and virtualenv directly. It is nicer then if the dependencies are in requirements files so that I can do e.g.:

pip install -r proj1/requirements-dev.txt
pip install -r proj2/requirements-dev.txt
pip install -e proj1
pip install -e proj2
pip install dependency==x.y # override the requirements files in some cases

Now I can make changes to both proj1 and proj2 and test them both together in a single environment. The thing that I like about pip in this situation is that it simply installs what I tell it to install regardless of whether that might be considered “broken” in other contexts. Maybe I know that this combination of packages is broken and maybe that is why I want to create this environment so that I can work on fixing it! The intention here might be that I end up submitting dependent pull requests to both proj1 and proj2 but in the early stages I need to make changes to both simultaneously to test that the end result will actually work.

I hope that the dependency groups here can be used for this sort of thing because currently if you use tools like poetry or hatch then they end up owning the dependency configuration making it inaccessible to other tools (unless they are also duplicated in requirements files). Is the expectation here that I should be able to do something like this:

pip install --dependency-group=dev ./proj1
pip install --dependency-group=dev ./proj2

Here I want to access the dependency groups from “outside” the project i.e. there is not a unique pyproject.toml in this situation. I guess this is a bit like the monorepo case except on a smaller scale where there are only two or maybe three projects that I want to work on but other dependencies can all just be considered upstream.

Another feature of requirements files that I like is being able to reference one requirements file from another like:

# requirements-dev.txt (also includes requirements-basic.txt)
-r requirements-basic.txt
pytest
...

Will it be possible to achieve this sort of effect with dependency groups in pyproject.toml or does it require duplicating any common dependency specifications across all groups?

pf_moore · November 26, 2023, 2:03pm

I wonder whether there’s a more fundamental issue here, actually. We only need editable installs in the first place in order to allow the developer to edit the code of a project while still having it “installed”. Do any other languages have this sort of concept? Clearly compiled languages like Rust and C don’t. I don’t really know how most other languages without an explicit compile step work - is an “editable install” a distinct concept in Javascript, Ruby, or Julia, for example?

Maybe we’d be better trying to find an alternative to the idea of editable installs, rather than continuing to layer workarounds on top of each other to keep making them work in contexts where they aren’t actually that appropriate?

Having said all of this, practicality may well beat purity here - we’re not going to replace editable installs any time soon, so maybe a slightly clumsy hack is sufficient here. As long as we acknowledge that this might be something we’d design differently in the “ultimate one packaging system to rule them all” long-term solution, maybe that’s enough for now?

oscarbenjamin · November 26, 2023, 5:46pm

What most languages have is tooling for fast incremental rebuilds. The question is how in Python you could rebuild a large environment with many packages efficiently each time you change one line of code in one single local project. With editable installs this is instantaneous.

pf_moore · November 26, 2023, 6:14pm

Thanks, that makes sense.

Of course, the issue is that this isn’t true if the metadata (or the source of a compiled extension) changes. Editable installs aren’t sophisticated enough to detect this and fix the dependencies. So I think my point still stands, maybe instead of editable installs on their own, what we need is some form of incremental install mode. That could work with editable mode, in the sense that an “editable install” creates some sort of metadata that allows the installer to do the incremental install, and non-editable installs do non-incremental installs.

As I say, this is way more than could be done in this PEP, so the key question for now is whether we’re comfortable with viewing editable installs as a “good enough for now” solution, and therefore having support for editable installs in dependency groups is OK from a “practicality vs purity” standpoint, because the “pure” solution would be very different from what we currently have in editable installs.

I should say that for my personal workflow, I very rarely use editable installs in practice. I mostly test using nox, which fully rebuilds my environments anyway, and for ad hoc debugging I use ad hoc approaches.

ofek · November 26, 2023, 9:23pm

I don’t want to cause a digression here but can you briefly explain how you cope with this during development? I couldn’t imagine having to wait for the entire wheel building process every time I changed a line of code. Isn’t this slow?

pf_moore · November 26, 2023, 9:33pm

I’m not sure I understand. When I make a change, I run nox -s test. It’s typically fast enough for me. How long does building and installing a wheel take for you? For me, it’s no more than a few seconds.

The exception being when I work on pip, where the 30+ minute test suite isn’t something I can realistically run frequently… On pip, I mostly try to keep my changes small enough to be as near to “self evidently correct” as I can manage, and I push to my github clone frequently and let CI do the heavy lifting for me. And in any case, the time it takes to reinstall pip is way less than the time the test suite takes to run, so an editable install wouldn’t gain me much anyway. (Disclaimer: I do actually have an editable install of pip in my working environment. But in practice, I very rarely use it and it’s not essential to my workflow in any real sense).

oscarbenjamin · November 26, 2023, 9:40pm

Note that meson-python has this and it is used by e.g. NumPy:
Editable installs - meson-python 0.16.0.dev0.

As noted there the implementation still does not handle project metadata like changed dependencies but the general principle of auto rebuild seems workable. It probably needs something on the frontend side like a rebuild-environment command that could check project metadata and recalculate dependency information. Also as noted there the frontend needs to provide some mechanism to preserve the build environment or otherwise it can only work with build isolation disabled.
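To make the “check project metadata” idea concrete, here is a minimal sketch of how a frontend might decide that an editable-installed project needs reinstalling. It is only an assumption-laden illustration: it treats the bytes of pyproject.toml as a stand-in for the project metadata, which misses dynamically computed metadata, and the function names and the stamp file are hypothetical rather than taken from any existing tool.

```python
import hashlib
from pathlib import Path


def metadata_fingerprint(project_dir):
    """Hash pyproject.toml as a crude stand-in for 'the project metadata'."""
    content = (Path(project_dir) / "pyproject.toml").read_bytes()
    return hashlib.sha256(content).hexdigest()


def needs_reinstall(project_dir, stamp_file):
    """True if no stamp exists yet or pyproject.toml changed since it was written."""
    stamp = Path(stamp_file)
    return not stamp.exists() or stamp.read_text() != metadata_fingerprint(project_dir)


def record_install(project_dir, stamp_file):
    """Remember the fingerprint of the metadata that was just installed."""
    Path(stamp_file).write_text(metadata_fingerprint(project_dir))
```

A rebuild-environment style command could call needs_reinstall for each editable path, reinstall only those that report a change, and then call record_install. Ordinary source edits never touch the stamp, so they stay as cheap as they are with a plain editable install.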

I am guessing that the environments you are using are quite small or you might find nox rebuilding them to be problematic. In a new project of mine I have been testing out poetry, nox, hatch etc. Having tried it for some time I decided not to use poetry and then switched to hatch (poetry just seemed to make everything harder rather than easier). I then tried hatch’s environment support and quickly decided not to use nox any more because hatch is so much faster. Previously nox would take 3 minutes and nox -r would take 1 minute (-r means to reuse the environments rather than rebuild them). Now I have a script that runs more or less the same commands in a few hatch environments and takes 10 seconds to run basically all checks (tests, coverage, doctests, build docs, type check, lint/format).

Still hatch checks whether the environments need to be updated but if they don’t need to be updated then it just reuses them and somehow it does this much faster than nox -r which reuses the environments unconditionally. The resources wasted by nox rebuilding all these environments dwarfs the actual time taken to do the things I need doing like actually running the tests. I also now have 600MB of hatch environments rather than 1.4GB of nox environments because with hatch I can reuse the same environments for multiple tasks. The only downside I see so far to using hatch here is that with nox I could use requirements files but with hatch I can’t. That is not a huge problem but it would be better if the dependencies were accessible to other tools without needing to be duplicated (hence dependency groups or requirements files).

Bringing this back on topic, though: as much as I do want editable installs with incremental rebuilds etc., I am not sure that it is something that belongs in this concept of dependency groups. I think that this is a decision that I would want to make when creating my environment without it being baked into vcs-tracked files like pyproject.toml. I probably want to say to my environment managing tool something like “make a temporary local environment and install the development dependencies but install X in editable mode from …/X”.

ofek · November 26, 2023, 10:12pm

That’s what I thought and yes a few seconds is actually too much for me personally.

The next release (within the next week or two) won’t do this when dependencies haven’t changed! This was an oversight on my part.

You can actually, see the hatch-requirements-txt plugin listed here: Reference - Hatch

sirosen · November 26, 2023, 10:39pm

How would you expect the interface to work in such a case, for an environment manager like nox, tox, or hatch?

Suppose you have some declared dependency group:

[dependency-groups]
test = ["pytest", {path = "../libfoo"}, {path = "../libbar"}]

And you want to build a test environment (using the test dependency group, of course!) with both libfoo and libbar installed in editable mode.

This is something that comes up in (usually closed-source) monorepos which ship two or more applications based on some shared “common core”.

If editable isn’t in the spec, there has to be a clear answer for how tools could handle these cases which we can articulate.
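One shape such an answer could take, sketched here purely as an assumption, is the one floated earlier in the thread: the group lists only paths, and the person creating the environment names the paths that should be editable. pip_install_args is a hypothetical helper, and the {"path": ...} tables mirror the draft object format, which is not settled. The only real interface relied on is pip's -e flag.

```python
def pip_install_args(entries, editable_paths=()):
    """Build arguments for `pip install` from a resolved dependency group.

    `entries` holds requirement strings and {"path": ...} tables; any path
    listed in `editable_paths` is installed with pip's -e flag.
    """
    args = []
    for entry in entries:
        if isinstance(entry, str):
            args.append(entry)
        elif entry["path"] in editable_paths:
            args.extend(["-e", entry["path"]])
        else:
            args.append(entry["path"])
    return args


# The test group from the example above, with only libfoo made editable.
group = ["pytest", {"path": "../libfoo"}, {"path": "../libbar"}]
print(pip_install_args(group, editable_paths={"../libfoo"}))
# ['pytest', '-e', '../libfoo', '../libbar']
```

Under this design pyproject.toml stays silent on how things are installed, at the cost that every tool has to expose its own way of passing editable_paths.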

EpicWink · November 27, 2023, 1:24am

Does editable declaration need to be standardised? Is there a need for interoperability between different tools regarding which projects need installing as editable? Do we want to enable mono-repos (and other projects with many Python projects) to be supported by Python packaging such that users can bring their own environment manager?