PEP 771: Default Extras for Python Software Packages

I am now working on implementing some changes to the PEP based on all the discussion above, and as part of this I would like to request some specific input/feedback.

One of the sections I am working on adding investigates the alternative approach, mentioned by several people above, of ‘splitting’ a package into two. This isn’t really splitting as such (because none of the Python code needs to be split); it is more like renaming package to package-core and then making package a metadata-only package that installs, for example, the recommended dependencies for the average user.

What would be helpful at this point is to understand why people who are keen to use default extras have not used this approach already. For instance it would be helpful to know whether this is due to limitations in a specific tool or set of tools, a cognitive/philosophical issue, or due to increased effort.

It would also be helpful to hear from people who have split their packages in this way and whether you have been happy with it or whether you have run into issues.

I’m not too keen at this point to get into discussions about whether the motivations for not doing it this way are right or wrong. I am more interested in the perceived issues. Once we have collected a few, we can look into whether they could be solved by, for example, better documentation or improvements to packaging tools, or whether there are more fundamental limitations.


I’m more “cautiously interested” than “keen”, but I can give my reasons for not splitting up check-jsonschema.

  1. There are multiple distribution channels, not only PyPI. The tool is available in Homebrew, the AUR, and via GitHub as a pre-commit hook package. If I split the package, I’m not sure how these various channels might diverge.
  2. The current package has a package name which matches the (only) CLI entry point it provides. I consider this helpful for users and the UX for uv and pipx clearly agrees. Changing the name of the “core” part – the CLI in this case – is therefore a non-starter.
  3. There are two optional supported JSON5 libraries. I’ve avoided picking one or the other in case they diverge in ways that users care about, but would like to provide the Cython one as a default. Will a wrapper package which hard requires that be as easy for users to understand as an optional default?
  4. The package currently depends on a YAML library because it’s needed for pre-commit hooks for various YAML files. I’d like to make that dependency soft, so you can install without YAML support, but don’t want to break pre-commit usage. If I make a mirror for pre-commit, I’m asking all users in that ecosystem to migrate their usage.
  5. The push and pull of these various cases sounds less like “one core package and one default-provider” and more like “a small ecosystem of packages to provide different dependency use cases”. That’s way too much effort for things which are each, individually, pretty marginal.
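
To illustrate point 3, here is a minimal sketch of how a tool might use whichever optional backend happens to be installed, with one listed first as the preferred default. The module names `pyjson5` and `json5` are assumptions for illustration (the post does not name the two libraries), and `pick_backend` is a hypothetical helper, not check-jsonschema's actual code.

```python
import importlib.util


def pick_backend(candidates):
    """Return the name of the first importable module in `candidates`, or None."""
    for name in candidates:
        # find_spec returns None for a top-level module that is not installed
        if importlib.util.find_spec(name) is not None:
            return name
    return None


# Assumed module names, listed in order of preference.
JSON5_BACKENDS = ["pyjson5", "json5"]

backend = pick_backend(JSON5_BACKENDS)
if backend is None:
    print("JSON5 support disabled: no optional backend installed")
else:
    print(f"JSON5 support enabled via {backend}")
```

With a default extra, the preferred backend would be installed unless the user opts out; with a wrapper package, it would be a hard requirement of the wrapper and this fallback would only matter to users of the core package.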

I split flit-core out of Flit, and I think there are several packaging tools with a similar split (e.g. poetry-core, hatchling). Flit is not just a metapackage, though: it contains additional code, and I think that’s similar for the other tools. I’ve been fairly happy with this approach.

There are other projects where I might have done this but haven’t. It’s both increased effort as a maintainer, and increased scope for user confusion. Especially around versions and upgrades: e.g. do you release foo 1.0 once which depends on foo-core and rely on it pulling in the latest? Or do you always release both packages with foo X.Y depending on foo-core==X.Y? (FWIW I do the latter with Flit)
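
The two release strategies in that question can be sketched as the dependency list the metadata-only package would declare. This is only an illustration: `metapackage_requires` and the `recommended-dep` requirement are hypothetical names, not part of any packaging tool.

```python
def metapackage_requires(core_name, version, recommended, pin=True):
    """Build the dependency list for a metadata-only `foo` release."""
    # Pinned: foo X.Y always installs exactly foo-core X.Y.
    # Unpinned: foo resolves to whatever foo-core is newest at install time.
    core = f"{core_name}=={version}" if pin else core_name
    return [core, *recommended]


# Option 1: release foo 1.0 once and let it pull in the latest foo-core.
print(metapackage_requires("foo-core", "1.0", ["recommended-dep"], pin=False))

# Option 2: release foo X.Y in lockstep, depending on foo-core==X.Y.
print(metapackage_requires("foo-core", "1.0", ["recommended-dep"], pin=True))
```

The pinned form means two uploads per release, but `pip install foo==X.Y` then says something definite about which core version ends up installed.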


Woo! This is so cool!

We really need this feature. We want to promote optional-but-recommended GPL extras without making our package uninstallable for companies with a (probably idiotic but still existing) anti-GPL policy.

This feature would make that so much more convenient for people!


I haven’t done it because there’s a big step in complexity from “repository publishes one package” to “repository publishes two packages” (lots of metadata files and build processes end up needing to be duplicated).

If I could set a field in pyproject.toml that said “publish a core package with no default extras and the main package that depends on the core package with the default extras enabled”, I’d be much more willing to consider the approach.


A build backend could implement something like that, without needing a new standard:

[tool.my-backend]
core-package = "mypkg-core"
include-extras = ["gui", "default_db"]

There would be some rough edges with the UI - the build_wheel hook only builds a single wheel, so you might need a config setting to build the core package - but it’s certainly possible.
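
As a rough sketch of what such a backend might do internally, the function below derives metadata for both packages from one parsed configuration. Everything here is an assumption for illustration: `split_project` and `select_variant` are hypothetical helpers, the `variant` config setting is invented, the dependency names are placeholders, and pinning the main package to the exact core version is just one possible choice.

```python
def split_project(pyproject):
    """Derive metadata for the core and main packages from one parsed config."""
    project = pyproject["project"]
    tool = pyproject["tool"]["my-backend"]

    # Core package: same code and extras, published under the core name.
    core = {
        "name": tool["core-package"],
        "version": project["version"],
        "dependencies": list(project.get("dependencies", [])),
        "optional-dependencies": dict(project.get("optional-dependencies", {})),
    }

    # Main package: metadata only, pinned to the matching core release
    # with the selected extras switched on.
    extras = ",".join(tool["include-extras"])
    main = {
        "name": project["name"],
        "version": project["version"],
        "dependencies": [f"{core['name']}[{extras}]=={core['version']}"],
    }
    return core, main


def select_variant(config_settings):
    """Pick which wheel to build; the `variant` key is hypothetical."""
    return (config_settings or {}).get("variant", "main")


# A parsed pyproject.toml, in the shape a TOML parser would return it.
pyproject = {
    "project": {
        "name": "mypkg",
        "version": "1.0",
        "dependencies": [],
        "optional-dependencies": {
            "gui": ["some-gui-lib"],
            "default_db": ["some-db-lib"],
        },
    },
    "tool": {
        "my-backend": {
            "core-package": "mypkg-core",
            "include-extras": ["gui", "default_db"],
        }
    },
}

core, main = split_project(pyproject)
print(core["name"], main["dependencies"])
```

A real backend would call something like `select_variant` from its `build_wheel` hook to decide which of the two wheels to write, which is where the rough edge mentioned above shows up: each invocation still produces only one wheel.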

My gut feeling here is that the problem is real, but we’ve still not found the “one, obvious” solution.

I think the problem is that default extras are usually the right thing to install when building an application, not when building a library. Because Python doesn’t have the distinction between an “app” package and a “lib” package that other ecosystems (such as Rust) do, we can’t cleanly distinguish the two use cases. (We also don’t have Rust’s support for having a project with both app and lib components, which is why the solution I described above has those rough edges…)


Hi everyone, just a heads-up: an updated version of PEP 771 has been published, and I have now opened a new thread:

Let’s use that thread to continue any discussion related to PEP 771.

Many of the issues raised in this thread are now addressed in the updated PEP. However, if you feel like any of the issues or ideas you raised here have not been addressed, please summarize them in the new thread for clarity and we can continue to discuss them there.

Closed per OP request.