Wanting a singular packaging tool/vision

hroncok · November 25, 2022, 11:56pm

Just here to actually answer a question I kinda think @encukou misunderstood.

Most of the namespaced virtual provides are generated - usually based on the packaged files: for example from pkgconfig files or egg-info/dist-info metadata. This happens automatically as long as a specific RPM “script” that generates the provides for the given namespace is installed when the package is built (and we make sure that e.g. when Python is used during the build the generator for dist-info metadata is also installed).

Some of the other virtual provides (such as the python3-x → python3.11-x alias) can also be generated based on the actual names. But virtual provides can always be added manually.
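To make the idea concrete, here is a rough sketch of deriving namespaced provides from dist-info metadata. It is not the actual RPM generator: the function name `provides_from_metadata`, the sample metadata, and the exact `python3dist(...)` output format are assumptions for illustration only.

```python
import re
from email.parser import Parser


def normalize(name: str) -> str:
    # PEP 503 style normalization: runs of "-", "_" and "." become one "-".
    return re.sub(r"[-_.]+", "-", name).lower()


def provides_from_metadata(metadata_text: str, python_version: str = "3.11") -> list[str]:
    """Build illustrative virtual provides from a METADATA file's text.

    The output format is an assumption modelled on a namespaced provide;
    a real generator may emit something different.
    """
    msg = Parser().parsestr(metadata_text)  # METADATA uses RFC 822 style headers
    name = normalize(msg["Name"])
    version = msg["Version"]
    major = python_version.split(".")[0]
    return [
        f"python{major}dist({name}) = {version}",
        f"python{python_version}dist({name}) = {version}",
    ]


# Hypothetical metadata, as it might appear in a dist-info/METADATA file.
SAMPLE = "Metadata-Version: 2.1\nName: Example_Pkg\nVersion: 1.2.3\n"

if __name__ == "__main__":
    for provide in provides_from_metadata(SAMPLE):
        print(provide)
```

The point is only that the provide is computed from files the package already ships, so nobody has to write it by hand.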

Since we are getting a bit off-topic, feel free to ping me elsewhere if you want to know more.

CAM-Gerlach · November 26, 2022, 1:44am

Yeah, it could start that way, which can often help prove the practicality and usefulness of the concept, produce one or more working implementations, and (perhaps most importantly) iterate on the details to

I suppose any tool could declare any semantics they want for their tool.<name> table, but at least IMO tools really shouldn’t rely on reading other tools’ sections; instead, other tools can adopt matching semantics for subsections of their own table, and then hopefully standardize them once they are tested, stable and useful. Otherwise, I worry we’re encouraging a regression to the days when the details of one tool’s implementation (Setuptools) defined the de-facto ad hoc “standard”.

barry · November 26, 2022, 9:59pm

I agree. Is there a process for standardizing such commonality? E.g. do we go the PEP route, are there standard sections for general functionality, etc.?

CAM-Gerlach · November 27, 2022, 4:20am

It would either need a new dedicated section, be a new pyproject metadata key/Core Metadata field (under Project), or be part of the yet-to-be-proposed section for extension building. In any case, the PEP process would definitely be the venue to standardize it.

justinhauer · December 6, 2022, 6:46pm

If I could heart this comment/post over and over again I would. As a developer it would make me so happy if something like pyup (like rustup) and cargo existed (like hatch/poetry) and was the default tool for use with python (like rust)

pradyunsg · December 16, 2022, 1:33pm

FWIW, a similar model was proposed for pip a while back, and is basically stalled at the moment: optimize package installation for space and speed by using copy-on-write file clones ("reflinks") and storing wheel cache unpacked · Issue #11092 · pypa/pip · GitHub

BrenBarn · December 29, 2022, 1:56am

Do you ever see conda being usable for python builds that are not themselves managed by conda?

I’m not the one you were asking this to, and I’m sort of necroposting here, but I see in this thread several comments by you basically to this effect. That is, you want whatever conda-like tool may exist to somehow support environments not managed by it, in particular the versions downloadable from python.org or the Windows store.

My question here is, why is the solution to that not to move toward making the versions on python.org and/or the Windows store use conda (or whatever conda-like thing evolves from this discussion)? Personally I have moved to universally recommending conda to anyone who asks me how to install Python, including when teaching classes to students with zero programming experience.

Or, put another way, what is it that you see as valuable about how the python.org/Windows store Pythons work other than the fact that they are the most visible and easiest ways for people to find Python? If the problem is that conda doesn’t work with them, could that be solved by boosting the visibility of conda rather than trying to make it interoperate with the most visible download source?

(Just to be clear about my vagueness, I’m blurring a line here between conda as it exists now and some future tool that might be managed by PyPA and become part of “official” Python installs. But hopefully the general gist of the question is clear: why not make the official Python downloads more like conda instead of the other way around?)

pf_moore · December 29, 2022, 10:25am

Nothing, in principle. You’d have to persuade core devs (and the Linux distros, who use build scripts supplied by core Python) that it’s a good idea, though.

But remember, I’m not the person arguing for “conda everywhere”. My position (not surprisingly, as PEP delegate for packaging interoperability standards!) is that having multiple tools is fine, but we should have standards allowing them to work together.

The “conda vs PyPA” issue is IMO mostly because we don’t have standards that allow the two groups to interoperate. And unfortunately, everyone (specifically the experts, of whom I am emphatically not one) seems to think there’s no practical solution here. So we’re stuck with two independent ecosystems - which (again, IMO) sucks, but doesn’t mean we should try to declare one of them “the winner” at the expense of the other.

rgommers · December 30, 2022, 8:59pm

Now available, xref Pypackaging-native: content about the state of using native code & Python packaging

pradyunsg · December 30, 2022, 10:49pm

What’s the best place to provide feedback on that site’s contents?

rgommers · December 30, 2022, 10:52pm

Either the GitHub repo for it, or the cross-linked Discourse thread.

mboisson · February 9, 2023, 2:46pm

Disclaimer: I did not have time to read the whole thread. I read a summary on https://lwn.net/Articles/921097

However, I think that the things conda “blames” pip for, are the same things that HPC cluster administrators blame conda for, if you replace “conda” by “system”, i.e.

don’t vendor things available as <conda=>system> packages
do include additional dependencies for those things
link against the import libraries/headers/options used for the matching <conda=>system> builds of dependencies

On cluster environments, we actually ask our users to not use conda, in large part due to these reasons.
https://docs.alliancecan.ca/wiki/Anaconda/en

conda is more like yum/apt than it is like pip, and that does not play well on a cluster.

pzwang · February 11, 2023, 3:20am

Hi, a little late to the discussion here, but one concrete example of this is around Graphviz, an extremely popular (non-Python) tool/library for visualizing graphs.
There are two different Python projects on PyPI: graphviz and pygraphviz. I believe one or both vendor in the actual Graphviz library.

In the Conda world, we actually package the underlying Graphviz library as its own package, and the pygraphviz and graphviz Python packages then refer to it. This is also useful for data scientists who may also use R and the R-related wrappers around Graphviz, and for other C/C++/etc. programs that depend on actual Graphviz.
The conda package for the graphviz Python library then gets renamed to python-graphviz as a disambiguation.

This is an example of how the scope of the problem is bigger than mere compatibility with PyPI. Conda has to live with full awareness of the fact that the Python library ecosystem exists in a much, much larger pre-existing world of non-Python projects…

fungi · February 11, 2023, 2:28pm

This is also precisely how GNU/Linux distributions and Unix
derivatives solve the same set of challenges, which makes it all the
harder for me to understand why people keep insisting the solutions
devised for compatibility between Pip and those other
distribution-specific package management systems aren’t good enough
for achieving compatibility with Conda’s package management.

pzwang · February 13, 2023, 3:44pm

I think part of the persistent confusion here is that Conda is oftentimes regarded as a competing Python package solution to pip, and that perspective is utterly understandable from within the center of the Python development world. (And, to be fair, it did get its start as an alternative to setuptools.) But over time, its user requirements have caused it to evolve into being a “portable mini-Apt/Yum/Homebrew/etc.”, which is a very strange platypus in the software world.

I guess it’s sort of like an off-road vehicle that has evolved into a four-wheeled helicopter, but because it can drive on roads, urban planners perceive it mostly as a car…

If there is one thing that I wish we could all get aligned on, it’s the fact that:

Python packaging is hard because of Python’s success at solving hard problems in the broader software world;
our exposure to those problems results from a 30+ yr legacy of being an excellent glue language that is accessible to LOTS of Subject-Matter Experts (SMEs) that would normally not touch any other languages outside of DSLs;
Thus, we have to make our own way through this instead of wistfully & enviously looking at other languages that don’t have our problems because they don’t have our success.

Python is the language of ML & AI. This is a stunning achievement for the (mostly) volunteer Python community, and it is due to us (accidentally?) solving hard problems in cross-language, cross-platform software compatibility that most other languages & communities can’t begin to approach. We should absolutely learn/take what we can from others, but I would like for folks to really manifest the realization that we are somewhat on our own because the language is faced with some unique challenges.

pf_moore · February 13, 2023, 3:53pm

Thanks. I try^[1] to always characterise conda like that, and I’d love to be better able to consider conda (from the PyPA perspective) as being more akin to a Linux distro than to a Python package manager. It’s not always possible to keep people focused on that. I’ll keep a link to this post handy to help us all to stay on track!

I might not always succeed, admittedly! ↩︎

pzwang · February 13, 2023, 3:57pm

I think that the things conda “blames” pip for, are the same things that HPC cluster administrators blame conda for
On cluster environments, we actually ask our users to not use conda, in large part due to these reasons.

These are legitimate technical points on what the Anaconda packages are good for, and what they’re not. After reading the wiki you linked, it seems that the crux of the issue is that HPC cluster sysadmins manage the building of a cluster-optimized software distribution, and those are the right packages for users to run.

It’s important that we disambiguate between “packages from Anaconda or conda-forge” and “the conda package management tool”. The conda packaging tool can still absolutely be used in the context of a cluster. However, since the cluster admins are generally just solving for package management for the particular flavor of Linux or OS running on the cluster, they should probably just stick with yum/apt/etc. or something cluster-specific like Spack.

If, however, they wanted to provide a compatible set of Python packages that would run on end-users’ personal workstations and Macbook/Win laptops, then conda could actually be a very useful choice. This would simplify the workflow of getting user code running on the cluster, and by reducing the tech skills necessary to use the cluster, it would allow many more people to use the cluster. (I bring this up b/c in previous conversations with HPC cluster admins, they sometimes griped about under-utilization.)

rgommers · February 13, 2023, 5:06pm

I’ll have to disagree with both of you a bit here. Conda really isn’t that similar to a Linux distro or Homebrew, and it’s unhelpful to reduce it to that^[1]. It is similar in the kinds of packages it contains, but it isn’t similar at all in the versions of those packages it gives you, or user-level environments. Homebrew or Linux distros give you only one fixed version of each package - typically fine for an OS, CLI/GUI tools, etc., but not for Python development work.

So, please think of it as something like a portable Linux distro plus pip-like multi-version support plus pyenv-like capabilities (which can be applied to both Python and distro packages). For more detail on that, see Build & package management concepts and terminology - pypackaging-native.

The comparisons to apt and “if it works for Debian, it should work for Conda” that came up a couple of times recently tend to be incomplete, so it’s not just an academic point. ↩︎

pf_moore · February 13, 2023, 5:29pm

First of all, I think this is a really important discussion, so thanks for commenting. I appreciate the link, and I do see how conda is something of a special case in that classification, but I think the important point is to have an intuition that casual users can relate to which lets them discuss solutions without always needing to special-case conda - because doing so just enforces the divide we currently have, and entrenches unhelpful views.

The recent discussion on making pip interact comfortably with conda is a case in point. As an outsider, I don’t really understand what’s unsuitable about the existing solutions by which conda can say “hands off my stuff”. I understand that they were designed with Unix distributions in mind - although as I understand it that was mainly because only the Unix distributions really participated in the discussions, not because we were trying to limit the scope. And if they don’t work for conda, that’s a shame. But why don’t they work? And how, short of us all becoming more expert in conda, or somehow trying to get more conda experts to participate in such discussions, do we catch such problems in advance? Because as I said on that thread, we already have multiple ways of ensuring pip interacts cleanly with “other” package managers, and it’s a bit unlikely we’ll add yet another one, specifically for conda. So we’re a bit stuck, and I’d like (as PEP-delegate) to have better ways to sanity-check proposals for this sort of issue without everything being gated behind “does conda say it’s OK?”
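For readers who haven't seen one of those "hands off my stuff" mechanisms: one of them, assuming it is among the solutions meant here, is the `EXTERNALLY-MANAGED` marker file from PEP 668, which a distributor places in the interpreter's stdlib directory. A minimal sketch of checking for it (the helper name and its optional argument are invented for the example):

```python
import sysconfig
from pathlib import Path


def externally_managed_marker(stdlib_dir=None):
    """Return the path of the EXTERNALLY-MANAGED marker, or None if absent.

    stdlib_dir is an illustrative override so the check can be pointed at
    any directory; by default the running interpreter's stdlib is used.
    """
    if stdlib_dir is None:
        stdlib_dir = sysconfig.get_path("stdlib", sysconfig.get_default_scheme())
    marker = Path(stdlib_dir) / "EXTERNALLY-MANAGED"
    return marker if marker.is_file() else None


if __name__ == "__main__":
    found = externally_managed_marker()
    if found is None:
        print("No marker: an installer may treat this environment as its own.")
    else:
        print(f"Marker found at {found}: the environment is managed by something else.")
```

Whether a per-interpreter marker like this fits an environment that both conda and pip are expected to write into is exactly the open question.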

At the moment, the only two “broad viewpoints” I can see are:

conda is sort-of like pip, but it also manages non-pip dependencies
conda is sort-of like apt/dnf, but it has environments so you can have different versions of stuff installed (and its repositories therefore have multiple versions, “just like PyPI does”)

Neither is the full story, but which is best for a non-expert? Or if neither, what is a usable analogy?

Edit: I quoted your “portable Linux plus…” comment and then forgot to make the specific point I wanted to. The problem with that characterisation is that it gives me nothing to work with in terms of “assumed behaviours”. On the contrary, it’s essentially saying “but it’s different” for every analogy I try to construct. So it’s actively blocking me from having any useful insight about “how would this proposal affect conda?”

steve.dower · February 13, 2023, 6:03pm

I think the best way to characterise conda to be helpful for packaging (or possibly just Paul) is that each Conda environment is like a complete install of Python on Windows:

independent of any “system” install
has its own Lib\site-packages
has its own python.exe (a real one, not a redirector)
doesn’t rely on any other installs or preexisting environments
includes its own native dependencies (Tcl/Tk, OpenSSL, LibFFI, to name those that the CPython installer includes)

The main difference is that (from pip’s POV) it may come with packages already installed in site-packages that pip didn’t put there. And the biggest complexity (unlike, say, WinPython, which does exactly the same thing) is that Conda might put more packages there later, even after its initial install.
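You can see the "packages pip didn't put there" situation from inside any environment by looking at the `INSTALLER` file that installers may record in each package's dist-info directory. A rough sketch follows; the helper name is made up, and what a non-pip tool writes there (if anything) is an assumption, so treat missing or unusual values as "unknown":

```python
from importlib import metadata


def packages_by_installer() -> dict:
    """Group installed distributions by the content of their INSTALLER file."""
    groups = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name") or "<unnamed>"
        # read_text returns None when the file is not present.
        installer = (dist.read_text("INSTALLER") or "unknown").strip() or "unknown"
        groups.setdefault(installer, []).append(name)
    return groups


if __name__ == "__main__":
    for installer, names in packages_by_installer().items():
        print(f"{installer}: {len(names)} package(s)")
```

In an environment touched by more than one tool, this typically shows more than one group, which is the mixed ownership of site-packages described above.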

Everything else is really just the mechanics to make stuff work in a cross-platform manner. Even Windows would likely need a command-line tool if the python.org distro supported installing the same version multiple times (which it currently doesn’t). And the fact that it uses its own package format isn’t really any different than using MSI packages.

So the conflict arises because Conda and pip are both trying to manage site-packages post-“creation”, and a secondary problem is that many users don’t know how (or aren’t aware that they need to) choose which tool to use for a particular task.

Fundamentally there’s no problem here, except users have expectations that are only met when they use conda (specifically, “my packages will be fully compatible”), and expectations that are only met when they use pip (“I can install the latest releases of any package that exists”), and the path to handle both is really complex (e.g. upgrade an existing package? or constrain new installs to what’s already there?).

These are the same concerns you’d have with any distro that preinstalls packages,^[1] it just hasn’t really come up for anything else, probably because those tend not to be manageable anyway.

It’s not totally unlike the discussion about whether venv should use ensurepip to install the version of pip that was part of the distribution, or should reach out to the internet to get the latest. But I’m thinking more of something like WinPython. ↩︎