Wanting a singular packaging tool/vision

A problem to be sure, but not intractable. It does get tricky because the mapping from import name to package name may be one-to-many.

In fact, years ago we started to work on something like this to bridge the gap between the Fedora and Debian ecosystems, but I don’t remember whatever happened to that, if anything.

Thanks for the suggestion Petr. I suspect this would get one started, but it won’t work well in general, because packages can come from many packaging systems (e.g., any Linux distro may package a Rust crate) and pkg-config tends to be just one of multiple supported dependency providers (e.g., the Fedora maintainer refuses to ship pkg-config files for OpenBLAS for no good reason, so we have to fall back to CMake or manual scanning of prefix/lib - and I wish I were making that one up, but it’s a real issue right now).

It should be at least one of the alternative design options discussed though.


Mapping import package names to distribution package names (and vice versa) will in fact be a many-to-many relationship, because some distribution packages provide multiple import packages, and there are also multiple distribution packages which provide identically named import packages.
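
To make the shape of that concrete, here is a toy sketch with a couple of real-world cases in each direction. This is only an illustration; an actual lookup table would be far larger and maintained per ecosystem (PyPI, Fedora, Debian, Conda, …).

```python
# Toy illustration of the many-to-many mapping; not a table any tool ships.

# One distribution can provide several import packages...
dist_to_imports = {
    "Pillow": ["PIL"],
    "setuptools": ["setuptools", "pkg_resources"],
}

# ...and one import name can be provided by several distributions.
import_to_dists = {
    "Crypto": ["pycrypto", "pycryptodome"],
    "PIL": ["Pillow"],
}

print(import_to_dists["Crypto"])  # no unique answer without more context
```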


(https://bugzilla.redhat.com/show_bug.cgi?id=1574517 for anyone else who was wondering about that—sounds incredibly frustrating to have to deal with.)

I maintain pysvn. On Fedora and Debian systems it’s supported natively.
The RPM or DEB build infra does the heavy lifting.

But on macOS and Windows I have to build a lot of dependencies myself:
subversion, openssl, apr for starters.

On Windows I need Visual C++ installed, which will not activate until I sign in
once with my Microsoft account.

The Windows install of pysvn needs to install the C/C++ runtime package, which requires admin access. (Maybe with Windows 11 I can drop this.)

On macOS I have to install Xcode and then install the command line tools via the Xcode menus.

The way I build them involves custom arguments to ./configure or its equivalent,
and sometimes patches to fix issues in the dependencies. I need to set CFLAGS
as well for the C++ code to work.

That would have to be expressed in the markup.

I would love to not have to maintain the pysvn build system!

This is a digression, so I suggest we don’t discuss this at length here but…

xcode-select --install can be run on the CLI to install the command line tools, without needing to download Xcode or interact with the UI, IIUC. :slight_smile:

I have hit issues with that command in the past. I’m not sure if that always works.
I may be remembering what it took to install the command line tools on older Xcode versions.

Actual build steps are best expressed in a CI system’s markup language, and chances are if you pick a common enough CI system, you’ll be able to assume a lot of the things you want. (The only way this goes away is if you lose the ability to choose your own CI system - as long as you’re free to choose, you have to choose.)

Listing the likely reasons why a build failed would be a huge improvement for pip. If they’re listed in a way that Conda (or a Linux distro) can translate into the packages required to be installed, even better.

Isn’t the idea that if you use the new super installer for a Python package it will build it from source on the user’s machine? How does that relate to using a CI pipeline?

Not 100% sure if it’s what @steve.dower means, but under normal circumstances with PyPI and distros, and always with Conda, your users won’t be building your packages from source themselves but rather installing the binary packages you build in CI (via cibuildwheel, Conda-Forge, or the distro’s infra).

Also, I’m not sure what is being discussed here is a “new super installer” as opposed to specifications that will provide the necessary metadata/hooks/interoperability, etc. needed for existing installers to interoperate in a standardized way, like build frontends and backends do via PEP 517/pyproject builds.


I am also unclear what the outcome of this discussion is going to be.
My purpose in posting how pysvn is built is to explore how the
ideas here would apply.

One of the desired outcomes is that users who are not packaging experts
can publish new packages. They are unlikely to have CI pipelines for
a wide range of OS targets, I would have thought?

Just to clarify, when I said “I’m not sure what is being discussed here” I was using it as a rhetorical device to point out what looked to me like a misunderstanding of the intent of this discussion in a more polite manner than bluntly stating so directly, as opposed to literally implying that I was personally unclear about that point. The precise outcome is still quite undecided—it’s just not likely to be a “super installer” per se, if that’s what you’re expecting. But tomorrow never knows…

Many/most non-trivial FOSS projects are already using CI anyway, and it only takes a couple of lines of CI config to have a matrix of jobs over the three OSes that matter, and (at least) the highest and lowest supported Python versions, usually for free (e.g. with the widely used GitHub Actions). Pure-Python projects can just have a single set of distribution artifacts built and uploaded locally, but the discussion was focused on projects with C extensions and non-Python/binary dependencies (including your particular project), in which case multi-platform CI is typically the only realistic/practical way to build your wheels.

Nowadays cibuildwheel makes it about as straightforward as it can be to not only build your binary artifacts on all the relevant platforms and Python versions, but also upload them to PyPI—you just drop in their GHA workflow and you’re done: it will automatically build and test your wheels on every push and PR, and make a release when you push a tag.

For Conda-Forge, everything is built for all compatible OSes, versions and arches automatically as a matter of course, with no need for additional setup. And for distros, they all have their own CI systems or equivalent that build, test and release their packages.


I did say you could get pretty far, not that you could solve the problem.

The closest I’ve seen to solving the problem, although not cross-platform, is RPM. So let me write a bit about prior art, for inspiration:

RPM packages have “virtual provides”, a full-fledged variant of Provides-Dist, which allow packages to claim multiple names. Traditionally, parentheses are used for alternate namespaces. For example, you can install pip with:

# dnf install python3-pip               # (actual package name)
# dnf install python3.10-pip            # (alias)
# sudo dnf install 'python3dist(pip)'   # (PyPI name)
# sudo dnf install /usr/bin/pip         # (Files are virtual provides too)

And you can also do:

# sudo dnf install 'pkgconfig(lapack)'  # (pkgconfig name for a native library)
# sudo dnf install 'crate(starship)'    # (Rust crate name)

Some of these aren’t human-friendly, but they do make automatic dependency generators for the various ecosystems relatively easy. At least for the ecosystems that can be automated easily :‍)

Well, there’s one other crucial feature: arbitrary expressions in requirements, to express that you need “this or that”. There’s and/or for that, but it turns out you also need if, with, unless: rpm.org - Boolean Dependencies
The resolver and the package indexes support that.
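
As a rough illustration of what that means for a resolver (a toy model, not how RPM actually implements rich dependencies, and the package names below are made up):

```python
# Toy evaluation of a "boolean dependency" against a set of provides.

installed_provides = {"openblas-devel", "cmake"}

# Nested tuples stand in for RPM's rich-dependency syntax,
# e.g. ("or", a, b) corresponds to "(a or b)".
requirement = ("or", "pkgconfig(lapack)", "openblas-devel")

def satisfied(req, provides):
    """Return True if the requirement tree is met by the given provides."""
    if isinstance(req, str):
        return req in provides
    op, *operands = req
    if op == "and":
        return all(satisfied(r, provides) for r in operands)
    if op == "or":
        return any(satisfied(r, provides) for r in operands)
    raise ValueError(f"unknown operator: {op}")

print(satisfied(requirement, installed_provides))  # True
```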

Add that to a cross-platform system, and, done!
Or start small and leave the hairy cases unsolved for now.


I didn’t know about this issue, but FWIW, Honza Horák happens to be my manager.


Does each Fedora package specfile define the namespaced virtual provides it maps to (besides just special-cased aliases)? Is this automated? Or somewhere in between?

It looks like DEB has virtual provides too, though perhaps not including things like namespaces and such?

That does seem quite useful, and it (OR dependencies in particular) has come up a number of times on other issues (including just in the past day or two). But in terms of standardizing something, that seems like a whole big project all its own, as it’s effectively a PEP 508 successor. And probably a more difficult one, since it would likely require more changes to existing tooling to support the substantially expanded syntax.

(Maybe worth someone giving it a nudge, since it seems Honza, Ralf and essentially everyone else on the issue agree on a path forward there, and the maintainer hasn’t responded?)

The Python ones are automated – taken from pyproject.toml and dist-info. Test deps are more manual, but can be read from tox config.

Not sure about DEB. But the namespaces above are just a naming convention, not really a feature of RPM.

Hence a subset that would get you “pretty far” ;‍)
Python tools don’t need to install the stuff, and can punt on hard cases (letting the build fail later) for now.
Edit: And non-Python tools (Conda/Fedora) need to have some kind of override mechanism anyway.


Another UI improvement would be to add a pyproject.toml field for specifying a URL to the install docs. Then when a package fails to build, pip can display an error message directing the user to those install docs.


Adding a whole new Pyproject metadata key / Core metadata field would of course be a lot more than just a UI improvement, as it requires a PEP, then modifying multiple standards, and then updating all the various build backends, build/integration frontends, and Warehouse (PyPI) to add support for it (most of which right now still don’t support PEP 643/Metadata 2.2, much less PEP 685/Metadata 2.3), and only then, after support has sufficiently percolated through the ecosystem, would authors be able to update their packages accordingly.

However, if something like this is desired, there is a much more viable approach—just use the existing urls Pyproject metadata field. Specifically, you could have pip show the URL(s?) with a name matching the pattern instal*, and document that users should name their installation guide URL “Installation” (or similar) for pip to display it automatically on build failure (or even offer to open it using webbrowser).

This is implementable by both authors and pip right now without any of the above; it will already work with existing packages that define such a URL, with no author-side changes, and it carries an immediate benefit to authors, since the URL is displayed on PyPI and already consumed by other tools.
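
A rough sketch of what the pip-side matching could look like (the instal* convention is only the proposal above, not anything pip does today, and in the real build-failure path the metadata would come from the sdist’s PKG-INFO or prepare_metadata_for_build_wheel rather than an installed distribution):

```python
import fnmatch
from importlib import metadata

def install_docs_urls(dist_name):
    """Return Project-URL entries whose label looks like an install guide."""
    msg = metadata.metadata(dist_name)  # stand-in for the sdist's metadata
    urls = []
    for entry in msg.get_all("Project-URL") or []:
        label, _, url = entry.partition(",")
        if fnmatch.fnmatch(label.strip().lower(), "instal*"):
            urls.append(url.strip())
    return urls

print(install_docs_urls("pip"))  # [] if no matching label is defined
```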


The build-time metadata could be a tool-specific option.
AFAIK, any tool can even declare that [tool.foo.native_deps.v1] has some well-specified semantics and other tools are welcome to use it (and collaborate on v2).
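
As a sketch of what consuming such a table might look like (the [tool.foo.native_deps.v1] name and the pkgconfig key below are invented purely for illustration, not an existing convention):

```python
import tomllib  # stdlib in Python 3.11+

with open("pyproject.toml", "rb") as f:
    pyproject = tomllib.load(f)

# Hypothetical shared table: [tool.foo.native_deps.v1] with e.g.
#   pkgconfig = ["lapack", "libssh2"]
native_deps = (
    pyproject.get("tool", {})
    .get("foo", {})
    .get("native_deps", {})
    .get("v1", {})
)

for name in native_deps.get("pkgconfig", []):
    print(f"build requires pkg-config package: {name}")
```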

The URL idea sounds good too. Most of it can also be done in build tools rather than pip, since pip will show the output for a build error. You probably don’t want to open a webbrowser from the build, but the message could better correspond to what exactly went wrong.

I quite like this approach overall, but I’m a bit disappointed with the idea of creating yet another tool.

That being said, you said that this pyup would be an equivalent (more or less) of pyenv - possible low-hanging fruit might be writing a plugin for pyenv which would support all the other functionalities described… :thinking:

Python reaches the import, sees the missing requests dependency, goes out to PyPI and installs requests

The problem here is that names of PyPI packages and the namespaces they install are not necessarily in alignment, which makes it difficult to automate this process.