Help packaging optional application features, using extras?

Thanks for the detailed and well-written writeup; it’s very helpful in understanding your situation.

I’m not entirely clear if you mean “moving away from having a setup.py” in terms of making your existing config declarative, or switching to a build backend other than Setuptools.

If you want to move away from a dynamic setup.py and to a static, declarative setup.cfg, nearly everything in your existing setup.py can be ported directly. A tool like setup-py-upgrade will do most of the heavy lifting for you.
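To illustrate the kind of one-to-one mapping involved, here’s a minimal sketch of what the declarative equivalent looks like — the project name, version and dependencies are placeholders, not taken from your actual setup.py:

```ini
# Hypothetical setup.cfg — each section maps directly to a setup() keyword:
# [metadata] name= corresponds to setup(name=...), [options] install_requires=
# to setup(install_requires=[...]), and so on.
[metadata]
name = myproject
version = 1.0.0
description = Example port of a dynamic setup.py

[options]
packages = find:
python_requires = >=3.7
install_requires =
    requests
```

With this in place, setup.py can shrink to nothing (or a bare `setup()` shim for legacy tooling).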

You could also consider switching to another build backend, like Poetry, which makes much of the rest of this easier.

A few other things I noticed:

  • The pyproject.toml is missing the most important part, the build system config; see the appropriate section of the official packaging tutorial.
  • It looks like you’re using PEP 420 namespace packages? They aren’t that commonly used these days, can make things much more complex, and it isn’t clear to me why they’re needed within the same project.
  • Consider the PyPA-recommended src layout, which combined with the previous should help avoid some of the current complexities in your find_packages() config, as well as other issues.
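On the first bullet, the missing piece is just a short table in pyproject.toml; this is a generic sketch for a Setuptools project, and the version pin shown is illustrative rather than prescriptive:

```toml
# Tells build frontends (pip, build) what backend to use and what to
# install into the isolated build environment before invoking it.
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"
```

If you also adopt the src layout, you’d additionally point Setuptools at the src/ directory (e.g. via package_dir and find: configuration in setup.cfg).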

This indeed sounds like a pretty idiomatic application for Extras.

Yep, that’s exactly what extras are for.
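For reference, declaring an extra in declarative setup.cfg config looks like this — the extra name "cli" and its dependency are placeholders for your actual optional feature:

```ini
# Each key under [options.extras_require] names an extra; its value is the
# list of extra dependencies pulled in only when that extra is requested.
[options.extras_require]
cli =
    click
```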

Right now, extras don’t offer the ability to remove deps from the default set, as that gets rather complex if you specify multiple extras, some of which add deps and some of which remove them. However, there have been plenty of proposals and discussion on potentially adding a “default” extra that would be a minimal set of deps for the package, upon which the other extras could build.

Opinions differ as to how idiomatic this is. Some feel it isn’t really what extras are intended for, and that better ways of doing this are available when using more advanced build backends like Poetry. On the other hand, if the latter isn’t being used, it can be more convenient, accessible and less duplicative than manually maintaining and installing requirements-dev.txt, or can serve as an abstract equivalent to the frozen deps in that file.

Often you’ll see test, lint and other development-related dependencies (basically the contents of your requirements-dev.txt file) consolidated into a dev extra, which is easier now that, at least with current pip, extras can depend on other extras, reducing duplication. The intention is to allow developers/contributors, and potentially repackagers, who are e.g. installing from a GitHub clone, Git tarball, sdist, etc., to more easily install and run the tests without having to manually install (and maintain a duplicate list of) dependencies in requirements-dev.txt, which isn’t always shipped or easily available as a file.
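As a sketch of that consolidation pattern, an extra can reference the package itself with other extras selected, so the dev group doesn’t duplicate their contents — all names here are illustrative:

```ini
# "dev" pulls in the test and lint groups via a self-reference to the
# project’s own name, rather than repeating their dependency lists.
[options.extras_require]
test =
    pytest
lint =
    flake8
dev =
    myproject[test, lint]
```

A contributor can then just run `pip install -e .[dev]` from a clone.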

Yes; if you have hefty package data that you want to separate out along with them, you’ll want to move that to a separate package.

That’s right; extras are really just (more or less) groups of optional dependencies.
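Concretely, those groups are selected with square brackets at install time — "myproject", "cli" and "test" are placeholder names:

```shell
# Base dependencies plus the cli extra’s dependencies:
pip install "myproject[cli]"

# Multiple extras combine additively:
pip install "myproject[cli,test]"
```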

Yes. Per the current Entry Points spec:

Using extras for an entry point is no longer recommended. Consumers should support parsing them from existing distributions, but may then ignore them. New publishing tools need not support specifying extras. The functionality of handling extras was tied to setuptools’ model of managing ‘egg’ packages, but newer tools such as pip and virtualenv use a different model.

I’m a little confused by this, sorry. In general, your implementation of the extra is going to depend on the main package, creating a dependency graph as follows:

             cli_extra
             /       \
            /         \
           V           V
main_application    < cli_extra deps >
        |
        |
        V
< main app deps >

So I’m not sure how your case differs.

Yeah, this would be a good way to do it, as I understand. Keeping your package more modular is generally recommended, as it decreases coupling, aids reusability, replaceability and pluggability, and allows downstream users to be more specific about their own dependencies, but it does increase overhead to some degree.

What you’re talking about here is basically a “lite” version of the “monorepo” approach, which has both upsides and downsides. It isn’t very common in Python packaging, but it’s what Google and some others favor, particularly in other languages and large corporate environments. It makes it substantially easier and more atomic to make broader changes that affect multiple different, discrete parts of an application/system. On the other hand, there are some usability tradeoffs in your development workflow, as most tooling isn’t really designed around it and contributors won’t expect it. See Google for some more detailed perspectives on this.

While there are certainly reasons not to use a monorepo, it is perfectly possible to still install from the GitHub URL by using the subdirectory parameter. See the pip docs.
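Concretely, that looks like the following — the org/repo and path are placeholders for your actual layout:

```shell
# Install one package that lives in a subdirectory of the repo; the
# fragment after # tells pip where the project root (pyproject.toml /
# setup.cfg) is within the checkout.
pip install "git+https://github.com/org/repo.git#subdirectory=packages/mypkg"
```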

As I understand your needs, this is more or less supported; it just requires some restructuring of your project, as many other approaches do. I’d generally advise against rolling your own tooling unless you’ve exhausted the other options (of which there appear to be several), the benefits are clear (in this case, they don’t appear all that decisive), and you’re willing to accept the long-term maintenance risks and costs of doing so.