Should I be pinning my dependencies?

The recent threads on PEP 665 got me wondering whether I should revisit my decision not to pin dependencies when I distribute Python applications.

If you want a long read, I’ve documented my current workflow (which is in its infancy and still being refined) here: Snakes on a Plane: Getting Python to Users

TLDR: I use Poetry to maintain pyproject.toml and build a wheel, then I provide my users a script that pip installs the wheel into a venv. So pip resolves the dependencies according to the constraints in pyproject.toml but it is not deterministic.

The reason I did this is partly because at the time I wasn’t really sure how to pin dependencies when the user isn’t installing using Poetry.
I have subsequently realised I can export a requirements.txt and explicitly pip install that before installing my wheel. However I am still somewhat nervous of doing so because the pinned versions could conflict with the user’s Python version, OS, or even be unresolvable according to pip since they were generated with Poetry (my tox testing would hopefully show up at least two out of three, but still…)

There is a lot of material along the lines of ‘always pin everything otherwise your testing is worthless’ out there. But I also see people with issues trying to use lock files when the deployment environment isn’t guaranteed to be identical to yours. I guess I could have my install script attempt a pinned install then fall back to pyproject.toml constraints if it fails…
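To make the "attempt a pinned install, then fall back" idea concrete, here is a minimal sketch of what such an install script could look like. It assumes a requirements.txt has already been exported from the lock file and shipped alongside the wheel; the function names and the `run` parameter are hypothetical, added only so the logic can be exercised without touching the network. Note that a failed pinned install may leave some packages behind in the venv, so a real script might want to recreate the venv before falling back.

```python
import subprocess
import sys
import venv
from pathlib import Path


def venv_python(venv_dir: Path) -> Path:
    """Return the interpreter path inside a venv (the layout differs on Windows)."""
    if sys.platform == "win32":
        return venv_dir / "Scripts" / "python.exe"
    return venv_dir / "bin" / "python"


def install(venv_dir: Path, wheel: Path, pins: Path, run=subprocess.call) -> str:
    """Install the pinned requirements if possible, then the wheel.

    Returns "pinned" if the pinned requirements installed cleanly, or
    "unpinned" if pip had to resolve from the wheel's own metadata instead.
    """
    if not venv_dir.exists():
        venv.create(venv_dir, with_pip=True)
    pip_install = [str(venv_python(venv_dir)), "-m", "pip", "install"]

    # Step 1: try the exact versions exported from the lock file.
    mode = "pinned" if run(pip_install + ["-r", str(pins)]) == 0 else "unpinned"

    # Step 2: install the wheel itself. If step 1 worked, pip should keep the
    # already installed versions as long as they satisfy the wheel's constraints;
    # otherwise it resolves from the constraints in the wheel metadata.
    if run(pip_install + [str(wheel)]) != 0:
        raise RuntimeError("could not install the wheel, even without pins")
    return mode
```

Logging the returned mode somewhere visible would at least tell you which users ended up on an untested combination.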

I guess pinning also makes me a little nervous as it kind of obligates me to keep making releases with updated dependencies (although Poetry’s default max version capping has a similar effect)

Any thoughts, advice, criticism welcome.

Also I’d been considering ditching Poetry and using Flit instead for packaging so I could handle the dependencies more simply through pip. I think this is what Brett Cannon does. I’m also intrigued by PDM.

The rule I follow is: If you want users to be able to install your project in the same environment as other projects, you should not pin your project’s dependencies, as that just leads to inevitable conflicts in the event that someone tries to install your project alongside another project that’s also pinning its dependencies.

The reason I did this is partly because at the time I wasn’t really sure how to pin dependencies when the user isn’t installing using Poetry.

I’m not really familiar with Poetry, but can’t you just specify the exact desired version of each of your project’s dependencies in pyproject.toml? That would mean that the resulting wheel would list the given version for each dependency in its metadata, so pip would install only the version you specified.

Welcome!

Yes, exactly; it should be possible under tool.poetry with no need for a separate requirements.txt. And if you’re installing into a new clean venv (as your script enforces) and grabbing deps from PyPI, the only things that change with the user environment are the OS platforms and Python versions you support, which you should be (and it sounds like you are) testing in your tox/CI/etc. matrix (at least {py_lower, py_upper} * {win, mac, linux} should get you almost all the way there, and is only 6 different combinations). While theoretically Poetry’s and modern pip’s solvers could resolve to different combinations if given your abstract dependencies, if given your pinned deps you shouldn’t see a difference (outside of extreme corner cases).
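As a quick illustration of that matrix, the combinations can be enumerated like this. The two Python versions are placeholders for whatever your lowest and highest supported versions happen to be, not a recommendation.

```python
from itertools import product

# Hypothetical lower and upper supported Python versions.
python_versions = ["3.8", "3.11"]
platforms = ["win", "mac", "linux"]

# Every (Python version, platform) pair worth testing.
matrix = list(product(python_versions, platforms))

for py, platform in matrix:
    print(f"py{py} on {platform}")
print(len(matrix), "combinations")
```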

Won’t this just pin the primary dependencies, not the lower-level libraries upon which they depend? The poetry.lock file, or a requirements.txt generated from it, specifies every package anywhere in the dependency graph. I guess maybe you could somehow put all this in pyproject.toml, but I assume not, otherwise there would be no reason for PEP 665.

You’re conflating the concept of dependency specification for packages vs applications. See PEP 665 Terminology section.

The pinned dependency file you speak of (e.g., requirements.txt from pip freeze or poetry.lock) is known in PEP 665 as the lock file. This includes all dependencies, both high- and low-level. These specifications are meant for applications.

On the other hand, a package dependency file such as pyproject.toml (i.e., as used by Poetry or PEP 631) contains top-level dependencies (there are sometimes good reasons to constrain lower-level ones). These specifications are meant for packages.
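A toy example may help show the difference. The graph below is entirely made up (none of these are real packages); it just shows that pinning only the top-level dependencies leaves everything further down the graph for the resolver to decide, whereas a lock file lists the whole closure.

```python
# Hypothetical dependency graph: package -> packages it depends on directly.
GRAPH = {
    "myapp": ["http-lib", "cli-lib"],
    "http-lib": ["urls", "certs"],
    "cli-lib": ["colors"],
    "urls": [],
    "certs": [],
    "colors": [],
}


def closure(root, graph):
    """Return every package reachable from root, excluding root itself."""
    seen, stack = set(), list(graph[root])
    while stack:
        pkg = stack.pop()
        if pkg not in seen:
            seen.add(pkg)
            stack.extend(graph[pkg])
    return seen


top_level = set(GRAPH["myapp"])       # what pyproject.toml would list
everything = closure("myapp", GRAPH)  # what a lock file would list
not_covered = sorted(everything - top_level)
print(not_covered)  # ['certs', 'colors', 'urls']
```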

Back to Poetry: yes, you can pin dependencies. Theoretically, you can pin all low-level dependencies too. However, don’t do that; Poetry already does it for you with a lock file.

Absolutely, and that’s why I was questioning the previous recommendations to use pyproject.toml in that way. However poetry.lock is not a perfect solution as it isn’t part of the wheel metadata and cannot be used by pip. My target systems/users would all need Poetry as a prerequisite. I really want to distribute apps such that the only prerequisite is Python itself.

The way I see it (and others as well I think) most of the mainstream Python packaging tools work very well for libraries, but quite poorly for applications. I would say that a bit more research is needed to publish well packaged Python applications.

Some ideas:

Thanks, I found pex in my research and was initially quite excited about it, but I found it hard to pick up and was concerned about choosing a format with low adoption and possibly a high risk of becoming unmaintained. I’ll take a second look.

Yeah, and there’s a fundamental distinction between abstract requirements, i.e. the top-level packages your project actually uses and their min/max supported versions, and concrete requirements, i.e. the specific solved set of all dependencies, their exact versions and (often) their specific builds (which are often platform- and Python-version-specific), output by the dependency resolver, that satisfy the abstract requirements. The former you specify yourself, and they go in install_requires, while the latter are output by pip freeze, pip-compile from pip-tools, or more modern tools into a requirements.txt lock file or bespoke formats, e.g. pipenv lock, Poetry lock, etc. However, as you mentioned, the only requirements considered by pip are those in the package metadata (e.g. install_requires in Setuptools, the equivalent in Poetry, etc). You can tightly specify the top-level dependencies that you directly interact with, but this doesn’t necessarily rule out differences, and potential resulting issues, in your lower-level ones.
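If it helps to see the two kinds of requirement side by side, the standard library’s importlib.metadata can show both for whatever environment you run it in. This is only a rough sketch: the helper names are my own, and `concrete_requirements` is similar in spirit to pip freeze rather than a replacement for it.

```python
from importlib import metadata


def abstract_requirements(name):
    """Requirement strings a distribution declares in its own metadata.

    These are the ranges pip's resolver works from, not a solved set.
    """
    return metadata.requires(name) or []


def concrete_requirements():
    """Exact name==version pairs for everything installed in this environment."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with broken or missing metadata
            pins.add(f"{name}=={dist.version}")
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    for pin in concrete_requirements():
        name = pin.split("==")[0]
        print(pin, "declares", abstract_requirements(name))
```

Running it inside the venv your install script creates would show, per package, the exact version that was installed next to the looser requirements each package declares.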

For applications, aside from those more modern, far-reaching Python-oriented solutions, there’s also a number of ways to generate a regular OS-specific standalone installer that bundles Python and all the needed dependencies within the package.

I’m one of the developers of the Spyder scientific environment/IDE, and we had to grapple with this ourselves. See this issue for a discussion of some of the main alternatives. We ended up going with Pynsist for Windows, Py2App for macOS and falling back on just Anaconda, plus the distros and pip, on Linux; see our Install Guide for an overview of our (many) current install options, as a working example of some of the many ways you can deploy your application.

As a quick overview, popular options for that include:

  • PyInstaller: Most popular option, relatively well documented and maintained, and cross-platform.
  • Conda Constructor: Cross-platform, modern, well maintained and especially useful for scientific applications, but needs conda packages to work.
  • Py2App (Mac): Fairly well maintained. What we use for Spyder on macOS; see our repo here for more on that.
  • Pynsist (Windows): Slower development, but still quite workable and has some advantages over other options. What we use for Spyder on Windows.
  • Py2exe (Windows): Still seemingly somewhat maintained, but seems somewhat outdated and relies on the deprecated distutils. Would not recommend.

And of course for Linux, there are distro packages and Anaconda.

Best of luck!

Thanks, I’ve used PyInstaller a bit and it’s great when you can be pretty certain your target platform is the same as the build platform. It’s also super useful when the target platform doesn’t have PyPI access etc. I don’t think I’d want to use it for everything though.

It’s also good to hear that devs of established applications have the same issues as me (I’m a part-time developer, full-time product manager, mostly making utilities to help my team run our systems).