Manually adding dependencies to pyproject.toml

I’ve been working for some time on my workflow for packaging Python applications (not libraries, not published to PyPI). I currently use Poetry for development, but I don’t expect my users to have Poetry, so I provide a script that pip-installs the wheel into a virtual environment. I supply a requirements.txt along with the wheel to pin the dependencies (since pip can’t use poetry.lock).
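(The install script is nothing exotic; roughly this sort of thing, with the wheel name and venv location as placeholders:)

```sh
#!/bin/sh
# Create an isolated environment and install the wheel against the pinned deps.
python3 -m venv "$HOME/.local/share/myapp"
"$HOME/.local/share/myapp/bin/python" -m pip install \
    --requirement requirements.txt \
    ./myapp-1.0.0-py3-none-any.whl
```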

I dislike some aspects of Poetry’s dependency handling, such as the non-standard version specifiers and the problematic max-version capping. I was looking at moving to hatch and/or pip-tools, both of which require you to manually enter your primary dependencies into pyproject.toml.

I assume that if my users will only ever use the requirements.txt pinned dependencies, I could just specify version ‘*’ for everything in pyproject.toml, but this feels wrong (and is likely objectively wrong if the dependency has been around for a while). My proposed approach is to specify >= {whatever version I have in my build/test environment}. This is effectively what Poetry does, minus the maximum-version cap and the odd syntax. Is this what most people who use these tools do? If so, is there any more elegant way to do it than running ‘pip list’, picking out the packages I’ve directly imported, and typing them into pyproject.toml? This feels clunky compared to ‘poetry add’.
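Concretely, I mean something like this in pyproject.toml, with the lower bounds simply copied from whatever happens to be installed in my build environment (package names and versions here are just placeholders):

```toml
[project]
name = "myapp"
version = "1.0.0"
dependencies = [
    # lower bounds copied from the current build/test environment
    "requests>=2.28",
    "click>=8.1",
]
```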

Yes, >= is almost always the correct answer, if I understand your use case correctly.

The proper way is to go through your application and add to your pyproject.toml all the dependencies that provide the packages you explicitly import. For each package, you find the earliest version (ignoring patch versions) that supports the functionality you use (and likewise note any later version that breaks the API you rely on, if there is one), and set that in the specifier.

This would then be used as input when automatically updating the pins in the lock file (requirements.txt, e.g. using pip-tools). This proper way is more effort than you need just to have a working application, though, so it’s up to you.
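With pip-tools, for example, that update step is something like the following (the output file name is your choice):

```sh
# Resolve the loose specifiers in pyproject.toml into exact pins.
pip-compile --output-file requirements.txt pyproject.toml

# Later, bump the pins to the newest versions still allowed by the specifiers.
pip-compile --upgrade --output-file requirements.txt pyproject.toml
```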

First, you can use pip list --not-required to only list your top-level dependencies. However, my usual workflow is to always add a new dependency to pyproject.toml or requirements.in first and then install it from the file (typically python -m pip install -Ue . for pyproject.toml, or python -m pip install -Ur requirements.in). I usually have a Makefile so I can just type make update or something. I guess one could go a step further and automatically run these commands on file change with entr or something else, but I’ve never tried that personally.
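A minimal sketch of the kind of Makefile I mean (target names are arbitrary, and recipe lines need real tab indentation):

```make
# Re-resolve and reinstall after editing pyproject.toml / requirements.in.
.PHONY: update
update:
	python -m pip install -Ue .

.PHONY: update-reqs
update-reqs:
	python -m pip install -Ur requirements.in
```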

This makes perfect sense of course, but leaves me wondering how I would determine the earliest version that would work. Perhaps careful study of the documentation, or just creating automated tests against every version. Either way it’s a lot of work considering I’ll just be pinning the versions that pip gives me.

That certainly seems like the workflow pip-tools and hatch expect. But how do you decide what version to specify? Do you go and check PyPI manually for the latest version number?

Good question! When I add a new optional dependency during development I usually don’t directly specify a version. Adding version numbers happens before I publish a new version (or make a commit/merge a feature branch to main depending on the project). Then I add version specifiers to every dependency that doesn’t have one yet (usually >=).

Edit:

Just wanted to pick that up:

I haven’t used poetry, but I would assume that it isn’t doing any of this either. It probably just checks what’s the latest version compatible with all your other packages and your Python version and locks that (please correct me if I’m wrong here). My approach of not directly specifying versions is doing the same, just that I let pip/pip-tools resolve dependencies to give me the latest compatible version of a package.

That is indeed a good question. For a newer, smaller project where I’m just adding them, my approach is rather imprecise and informal; I generally skim the changelogs for changes that seem likely to be relevant to the functionality I’m using and lower-cap it at that, with consideration for the age of the capped version (the more likely a change is to break things, the further back I’m willing to look; the older the relevant versions, the less I worry). Failing that, the current major version serves as a decent first guess, provided it isn’t too new or too old.

However, those are all rough heuristics; for a major project (especially a library) the recommended approach would be to actually test your project with the deps installed at the earliest versions that satisfy your specifiers. That should catch most cases, but it is a non-trivial amount of work and probably not worth it, at least in your specific case.
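I’m not aware of a standard pip switch for resolving to the lowest allowed versions, but as a rough sketch (assuming Python 3.11+ for tomllib plus the packaging library; the file name is arbitrary), you can generate a minimum-versions requirements file from your declared lower bounds and test-install against it in a fresh environment:

```python
# min_pins.py -- rough sketch: turn ">=" (and "~=", "==") lower bounds from
# pyproject.toml into exact pins, so the project can be installed and tested
# against the oldest versions it claims to support.
# Environment markers and extras are ignored here for simplicity.
import tomllib
from packaging.requirements import Requirement

with open("pyproject.toml", "rb") as f:
    deps = tomllib.load(f)["project"]["dependencies"]

for dep in deps:
    req = Requirement(dep)
    lows = [s.version for s in req.specifier if s.operator in (">=", "~=", "==")]
    if lows:
        print(f"{req.name}=={lows[0]}")
    else:
        print(req.name)  # no lower bound declared, so leave it unpinned
```

Then something like python min_pins.py > requirements-min.txt, pip install -r requirements-min.txt -e . in a clean venv, and run the test suite.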

Even for a large, long-established Python application like Spyder, this is mostly driven by needing to rely on a newer feature, or by a user reporting a bug specific to an older dep version, rather than by full CI matrix coverage, and that has generally worked adequately. In practice we see far more breakage from newer versions of things than from older ones (which our CIs catch immediately). Nowadays we have things constrained fairly tightly (though not pinned) even in our packaging metadata and put a fair amount of work into maintaining those constraints, though our recommended install methods install into an isolated environment with pinned deps that are locked and continuously tested against.

It’s good to get some validation that this isn’t a completely trivial question. In the tooling world there seem to be two approaches.

Poetry and PDM both provide a command to install a dependency which adds it to pyproject.toml, but these are pretty heavyweight tools with their own dependency resolvers.

The lighter-weight tools like pip-tools, hatch, and pyflow have nothing in particular to say on the matter. The docs just say to add dependencies to pyproject.toml and don’t offer any opinion or advice on how (not a criticism, just an observation; there’s no reason they should).

It does feel like there’s room for a middle option which uses pip for resolution and installation but completes pyproject.toml automatically. Perhaps I’ll just write a script to do this…
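As a very rough sketch of what I mean (purely illustrative, not a real tool): let pip do the install, read the installed version back via importlib.metadata, and just print the specifier to paste into pyproject.toml, since rewriting the TOML safely is a separate problem:

```python
# Rough sketch of a "pip add"-style helper (hypothetical, not a real tool):
# install a requirement with pip, then report the ">=" specifier to copy
# into [project] dependencies by hand.
import subprocess
import sys
from importlib.metadata import version
from packaging.requirements import Requirement

def add(requirement: str) -> None:
    subprocess.run([sys.executable, "-m", "pip", "install", requirement], check=True)
    name = Requirement(requirement).name  # strip any extras/specifiers
    print(f'Add to [project] dependencies: "{name}>={version(name)}"')

if __name__ == "__main__":
    add(sys.argv[1])
```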

pip-tools ultimately relies on pip for its resolving, AFAIK, so it would depend on that feature in pip, presumably (unless it would do some pre-processing). There is an open feature request for this in pip, if that’s what you’re asking for.

Hatch might, once lock file support lands, if and only if tomlkit is no longer buggy for me.

That would be cool, hatch does seem like a good fit for this.

Would it be useful if I raised an issue for this? Depending on future workload and competence I would be happy to work on it and submit a PR.

I feel your pain around TOML; it’s great that we have tomllib in 3.11, but frustrating that we still have to manage a dependency in order to manage (read and write) the file that manages our dependencies! I considered just writing a script to manage pyproject.toml, but it was this element that deterred me (and convinced me that hatch would be a good home for it).
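That is, reading the file is now trivial with the standard library, but writing it back out isn’t:

```python
import tomllib  # stdlib since Python 3.11, but read-only

with open("pyproject.toml", "rb") as f:
    deps = tomllib.load(f)["project"]["dependencies"]

# There is no tomllib.dump(); writing the file back (without clobbering
# comments or formatting) still needs a third-party library such as tomlkit.
```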

Sure, but open a discussion instead.

Done