Yeah, Rust is a compiled language like C and C++, not an interpreted language, so the compiled program is executable on its own and has no dependency on the presence of a runtime environment.
Disagree with that - PyPI existed first, and was the place to upload sdists and other formats.
easy_install was created as a client, then pip replaced it. conda was created after that with different goals.
PyPI really is the standard service used by Python programmers to host installable packages.
I don’t want us to go too off piste here, because this is about Posy, not Python vs Conda, but I recognise the question is relevant.
What you are essentially saying is that there is no technical difference between conda and this new posy; it’s just that posy integrates with a system that has a certain social status (PyPI).
I don’t think that’s the case. Conda deals with far more than this proposal, including virtual packages (for the system), and other binaries.
I think in general we’re all guilty of trying to overly-reduce the problem into its “eigenvectors” and then find a solution. That invariably makes everything far harder, because there are so many competing needs and interests.
Yes, Posy (as I understand it) overlaps with some of Conda, but ultimately we are already of the consensus that whatever Python does to move this problem forward, it won’t be directly adopting Conda. At least, that’s been my read of the messaging. So, any incremental progress towards the bootstrapping problem & the context overhead required for new users to learn Python is a benefit.
Not true. For a start, PyPI is under the PSF umbrella and its interfaces are defined by PEPs, so it is the official package repository in far more ways than just perception. And as a technical difference, files on PyPI are provided by the package author, so they are as up to date as it’s possible to get (short of building your own). Conda builds are (at least sometimes) done by a 3rd party and as such can “lag”. I’ve seen this happen in reality, so it’s not a theoretical concern for me.
It’s entirely a personal question how important this is, but claiming that there’s “no technical difference” is wrong - even if the difference doesn’t matter to you, it’s there.
But I don’t 100% agree with that. (Note this is a personal opinion, not a technical argument in either direction). For me, conda shows that if that model ties you into having to wait for the external distributor to publish “their version” of Python, that’s a frustrating limitation. Yes, I acknowledge that’s only important to “early adopters” like me, but as I say, this is a personal view. One further advantage of posy (again, maybe only to me) is that the Windows and MacOS builds it uses are repacked versions of (essentially) the python.org distributions. The linux build is a custom build, but there are no python.org builds for Linux, so I view that as inevitable.
So yes, conda and posy are similar models, but they make very different trade-offs. And I don’t think the value of the idea is independent of those trade-offs.
Finally, I’ll point out that if this model is successful, it will only become the “unified solution” that everyone is talking about if it gets distributed from python.org, as the official distribution of Python, and it uses PyPI (the official package repository), as the source of its packages. As far as I’m aware, conda is never going to be that. I don’t know if posy will, but I view it as an experiment into whether that is a possibility.
The Windows build comes from the nuget distribution, but that’s “official” in the sense that it’s in the docs and it’s released by the CPython Windows release team.
I understand what you’re saying, but I think you’re missing my point. (Or, perhaps we disagree about the value of the idea independent of those tradeoffs.) I’m not talking about conda vs. posy. I’m talking about what I said: “It would be a good idea if the default way that people think about Python is that you do not install Python; instead you install a program that manages Python (along with managing libraries used by it)”. And the tradeoff you describe about external distributors may be a tradeoff that conda makes, but it’s not a tradeoff that inherently has to be made in order to adopt that idea, because. . .
Precisely! In other words, the way (or a way) to apply the idea I said without the tradeoff you describe is for the official Python.org distributions to shift to a “manager-first” model. Then there is no waiting for any external distributor to do anything.
Great initiative! Looks very promising @njs I really like that you’ve taken a holistic view of a larger scope of the problem, but not too large to make it impractical to solve.
Fwiw, I think it’s probably best to keep the discussion focussed on pybi and posy. The project is new, it’s explicitly stated in the OP that external pythons (such as a conda python) could be explored, but that it’s not a focus to get started. Let’s celebrate the project, not derail it into the same circular discussions we continue to have. conda is solving a different (bigger) problem than only python packaging. It’s really a better comparison to compare conda to nixpkgs (ie they are both “universal” package managers) than it is to compare it to any tool that only manages Python interpreter distributions and Python package distributions.
Indeed. There is a “bootstrapping” issue I can think of, though: Rust relies on LLVM, so it doesn’t support some of the old/exotic architectures that CPython currently supports (cf. the now infamous Dependency on rust removes support for a number of platforms · Issue #5771 · pyca/cryptography · GitHub).
OTOH, CPython already doesn’t provide binaries for Linux, so I guess this is just fine and people should just compile Python themselves on such platforms (where “people” includes the distro packagers)?
This new tool sounds fantastic
I didn’t want to comment on this aspect to not derail the thread into another pip/PyPI-vs-conda discussion, but since that happened anyway, I just wanted to note the following:
If I read the `Pybi-Path:` in the posy-spec correctly, this goes like 90% of the way towards having all the necessary pieces to do what conda does (i.e. putting relocatable packages into a common place that has
I’m not saying this was @njs’s goal, but with posy + a PEP704-alike, PyPI could realistically start distributing binary artefacts rather than vendoring them into every wheel (this would need some metadata handling, obviously, but that’s minor in comparison).
In other words, it’s by far the most concrete path I’ve seen so far for eventually having conda come back in the fold (e.g. initially posy keeps installing wheels as usual, but conda could conceptually install into the same prefix, and over time, wheels containing native code could start distributing their underlying libraries separately).
should be: conda was created to solve very painful short-comings in the python packaging model at the time. If those short-comings are solved, there’s no really good reason to keep the schism.
I’m really glad to see some innovation in this direction. Currently working with and distributing python code can be really convoluted and confusing (not only for new users). This is both due to the plethora of tools (that each only do part of what is needed) and the broad spectrum of things python can be used for. My team uses python for:
- backend applications (run as containers)
- internal tools, mostly for DevOps and automation
- data science and scientific computing
The tools we currently use are:
- `pyenv` to install python and to create some long lived, manually activated python environments for quick and dirty experimentation
- `pdm` to manage python projects (libraries and applications)
- `pipx` to install some python apps we use (`copier` etc.). Some of these require the use of `pipx embed <some_package>` to install additional packages alongside the application e.g. jinja2 extensions for
- `pyoxidizer` to distribute internal applications to non-developers
- `requirements.txt` with hashes in docker containers
- `anaconda` for some data science projects that were just easier to get started with (`tensorflow` etc.). Also Azure Machine Learning IIRC uses `conda` behind the scenes and exposes this via the SDK to the user.
Problems we have encountered along the way:
- getting new developers (e.g. interns, students or apprentices) started. Setting up python to write their first library or application can be a real challenge. Even if you use `pyenv` to install python, how do you install `pipx` maybe? How do you install `pipx`? This feels like going in circles. Surprisingly it is far easier in my experience to get people without any prior python knowledge started. People that had some prior exposure to a mix of `virtualenv` and horrible PATH manipulation are even more difficult to get started. In addition to just showing them (what we believe is) a sane way to set up your workspace you have to actively unlearn bad habits with them.
- distributing applications to non-developers. Same as the above but even worse. Using `pyenv` to install python and then `pipx` to install an application is already too much for most non-developers. We have had mixed success with `pyoxidizer` mostly due to dependencies that didn’t want to work properly when bundled this way…
- switching between “regular” python and `conda` especially if you only use `conda` to create new environments every once in a while.
- dependency conflicts when developing an application, where two dependencies couldn’t be installed due to (unnecessary) upper version bounds on their dependencies. Haven’t had this problem since we switched from `pdm` but what I remember is that you basically had to either wait for the maintainers to release a new version with updated dependencies or (temporarily) fork the dependency and bump the sub-dependency yourself. As an application developer I’d really appreciate an escape hatch that let me override whatever the dependency solver thought was correct and just let me use the version that I tell it to.
I think that’s only because (for now) posy isn’t provided as a (per-platform) binary. If you compile it before distribution, you’d no longer need the Rust env?
I think some of the ideas here are interesting, though there are a few things that I’m not particularly enthused about:
- The project is written in Rust. I personally enjoy Rust, but I think it speaks to a serious shortcoming in the idea that it relies on being written in an external language to make it viable. There’s nothing inherently Rust shaped about this project, it’s only using Rust because Rust’s packaging story can provide a workflow for projects that are targeting end user applications, instead of developer tooling.
- The fact that everything starts from an invocation of `posy` means that your deployment tooling has to speak posy or it needs to give you the ability to execute an arbitrary.
- Overall, the output of posy seems like it would be fine for hacking away at something on a developer machine, but the moment you want to deploy that code somewhere else, the model breaks down.
- PyBI seems interesting, but it feels like the wrong direction to take things? It seems particularly silly to have two binary artifact types, one for Python things and one for Python itself. I understand that it may have been easier to just add an extra artifact type so you can just continue to support wheels, but PyBI feels like a weird hack.
- Does the `sitecustomize` trick leak into virtual environments created by the posy environment? I think it will, but I’m not sure offhand.
Au contraire, having an installer that’s not dependent on the underlying python, which can thus change that python install independently, solves a large class of problems that were so far unsolvable. I remember halfway-serious jokes about rewriting pip in rust popping up regularly over the last few years (and that from people deeply involved in Python packaging), for very similar reasons.
While the approach doesn’t crucially depend on Rust, it does make a lot of sense for speed, stability, depth of ecosystem, etc., and well, the results kinda speak for themselves:
So while there’s a lot to unpack here (and many aspects that deserve in-depth discussion), the language choice being a serious shortcoming is just like… your opinion, man.
which would i.a. allow a future where python does not need to bundle pip, because the relationship is reversed: first download the installer tool (and only that), then let it handle your python version and packages, including most lifetime workflows like upgrading the underlying python version.
who’d want to write / maintain something like that in C/C++, and get a free bouquet of CVEs in their package installer?! Java flavours are not very interesting because then you’ve just replaced requiring one runtime with another. And there’s not a lot of languages remaining that would be a serious contender after that.
Interesting approach. This corresponds with similar work:
- Node.js: nvm-windows; the Go rewrite of nvm;
Some of your critics point to the irrelevance of Rust to Python. I understand your argument of choosing a cross-platform Python-independent system. Your implementation is also decently well thought out, and contains many features I hadn’t considered for my own solutions.
Not to be that guy, but are you married to Rust for this project? - Have you considered developing—or at least exposing an interface to—C?
That would match the language of the official Python implementation, and enable the greatest level of cross-platform cross-language distribution.
Not to toot my own [very unfinished] project, but I was working on a [header-only C89] library specifically for building `rvm` style package managers: GitHub - offscale/libacquire: The core for your package manager, minus the dependency graph components. Features: download, verify, and extract.
The idea is to support all the common SSL, checksumming, HTTPS, unarchiving interfaces. That way, depending on build flags, security flaws in your HTTPS | SSL implementation can be resolved through Windows Update or a `sudo apt upgrade`.
In Rust, when I last looked, Cargo couldn’t inherit flags from dependent crates, so your crate `crate0` can’t depend on dep0→dep1→dep2 and specify at build time that dep2 should be built with the OpenSSL variant. This lack of flexibility opens up your solution to certain attacks, and your dependence on Rust increases the chance of requiring frequent updates.
Whereas a package manager written in 30-year old C [optionally] dependent on system libraries could—once testing and assuming AoT spec is sufficient—never require an update. Sure, you could update to a new version and get HTTP/3 or whatever, but the old version should be usable for a decade or more into the future as its only job is managing the [Python binary] archives of others.
Sorry, but that’s not true.
The only property of Rust that is being used here is that it’s possible to run a rust binary on a computer that doesn’t already have Python installed on it. Which of course isn’t a property of Rust but compiled languages in general.
However, claiming that these problems are unsolvable doesn’t mesh with the reality today. For instance, posy could have just as easily been written in Python, and used one of the various strategies that exist to create a single file executable out of Python. Something like pyoxidizer, which even uses Rust itself to bootstrap the Python interpreter.
The language choice is a shortcoming, because it has the implication that the packaging tool isn’t capable enough to produce real world software that is meant to be deployed to end user machines, machines where you can’t rely on the system Python. After all, there’s nothing inherently special about posy here, it’s just an application that wants to run without depending on an existing Python install.
Mostly I just think it’s hard for a single tool to provide a great UX for installing standalone tools for someone who doesn’t care about Python, and also a great UX for working on Python code, at the same time. You’re right though that a lot of the underlying machinery could be shared. It’s not my focus currently, but if someone wants to figure that out then it’s cool with me :-).
Sure, I get that.
For the Pybi format: I’m not worried about this at all, because there are already literally dozens of ways that people get Python. Python.org download, every Linux package manager, Windows Store, built-in to macOS, homebrew, conda, docker registries, nuget, pyenv and its competitors, Heroku buildpack, preinstalled on GitHub Actions’ VMs, self-built, … and it’s fine. I don’t think adding another option is going to cause any problems, and actually a bunch of those could be simplified by having pybis on pypi.
For workflow management tools: yeah, this is a more direct competitor to those, and the confusion in the community is real. But like, the approach/architecture is different enough that idk how you’d patch it into any of those tools, and this is the tool I want to use so… not sure how to avoid it, really.
I am less worried about the proliferation of workflow tools than some. It’s definitely painful right now, but this is just kind of the natural lifecycle of these things, as people explore the different design tradeoffs and figure out what works? Back in the 2000s, the Python community spent years agonizing and soul-searching over why we had so many web frameworks, and how we needed to get people to stop writing them or else we’d never be able to compete with Ruby on Rails. But telling people to stop writing the frameworks they wanted to use never really accomplished much, and in the end the worries just kind of faded out as mindshare coalesced around a handful of options like Django. A similar thing happened with version control systems around the same time.
I don’t want to derail the thread into conda/pypi discussions, but there’s a common misconception here I want to address.
The packaging ecosystem of PyPI/wheels/sdists/setuptools/etc. is really more of a “meta-packaging” system. Python gets used in tons of different contexts, and packages people upload to PyPI are generally useful in multiple contexts – so e.g. when I upload a Trio release to PyPI, I assume it might later end up in a Debian package, internal tooling inside macOS, a RenPy game, a conda package, a Blender plugin, Google’s monorepo, a Nix snapshot, etc etc. All of these environments make specific choices about how their Python environments work, how to manage dependencies, and so on – and the PyPI/wheel/etc. ecosystem is designed to be flexible enough that all these different downstream ecosystems can consume our packages and adapt them to their situation. If I release a conda package, then only conda can use my code; if I release a PyPI package, everyone can use it (including conda).
So I’m going to upload Trio to PyPI. But that means Trio needs to use PyPI-style metadata, and in particular PyPI-style dependency declarations for the packages I want to use (since this is the dependency metadata that all those downstream ecosystems know how to consume and map to their own framework). And that means I need a way to take my PyPI dependencies, and run my tests against them, which means I need a way to build Python environments out of PyPI packages. (And also I really want direct access to the packages that other people are uploading to PyPI, so I can do stuff like push a feature to one package, and then immediately use that feature in another package. The downstream ecosystems will catch up on their own time, but first I need to do the work they want to catch up with!)
So for lots of end-users, conda and PyPI are more-or-less interchangeable – they can pick whichever they prefer based on whatever tradeoffs are important to them. But package maintainers need to work with PyPI, and since they’re the foundation of our whole ecosystem, they deserve good tools!
Currently Posy only supports Windows/macOS/Linux, so not an issue :-). Or more precisely: it only supports platforms that have standardized wheel tags. So if FreeBSD or whoever wants support, there’s a clear way. But from my experience with wheel tag standardization, I doubt it’s going to get ahead of LLVM/Rust.
I mean… you absolutely could make it work in Python. Conda manages it. But:
When I started, pip was still totally wedded to running inside the environment it was managing
A tool whose most important functionality is:
- Running before every Python execution, so startup speed is absolutely critical
- Chewing through exponential-time resolution algorithms
- Moving bulk data around
… is just never going to be Python’s sweet spot. Horses for courses and all that.
Then I wouldn’t have an excuse to muck about with Rust, so it wouldn’t exist
Sure, but that’s just out of scope? Use posy to invoke your favorite deployment tool. Or if your deployment procedure is “take your dev environment and stick it in a tarball/docker container”, then posy could do that pretty trivially – all you need to do is unpack the same pybi/wheels into an alternative (simpler!) filesystem layout.
The pybi format is closely modelled on wheels; in posy right now they actually share a lot of code. (There’s even a `BinaryArtifact` trait to write code that’s generic over both.)
The difference is just, pybis have metadata that tells you how to install wheels into them; and wheels have metadata that tells you how to install them into a Python environment. (Remember that one of the mandatory inputs to the wheel unpacking operation is the paths to the Python environment’s purelib/platlib/headers/scripts/data directories… that doesn’t make sense if the thing you’re unpacking is a Python environment!)
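To make the "mandatory inputs" point a bit more concrete, here is a minimal sketch that uses the standard library's `sysconfig` module to list the directories a wheel installer has to be told about. The key names are `sysconfig`'s own (`include` is what it calls the headers directory). It only inspects the interpreter running the snippet and says nothing about how posy itself reads the equivalent information out of pybi metadata.

```python
import sysconfig

# Install locations for the interpreter running this snippet.
paths = sysconfig.get_paths()

# The categories a wheel installer maps files into.
# ("include" is sysconfig's name for the headers directory.)
WHEEL_TARGETS = ("purelib", "platlib", "include", "scripts", "data")

for key in WHEEL_TARGETS:
    print(f"{key:8} -> {paths[key]}")
```

A wheel on its own has no idea what these paths are; the environment has to supply them, which is the role pybi metadata plays when the thing being unpacked is the environment itself.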
Oh and pybi filenames just have platform tags like `win32`, not full wheel tags like `cp310-abi3-win32`, for obvious reasons.
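As a rough illustration of the filename difference, the sketch below splits the trailing tags off each kind of filename. The wheel layout (python tag, abi tag, platform tag at the end) is the standard one; the pybi layout is assumed here to simply end in a single platform tag, as described above. Both example filenames and both helper names are made up for the illustration.

```python
def wheel_tags(filename: str) -> dict:
    """Split the trailing python/abi/platform tags off a wheel filename."""
    stem = filename.removesuffix(".whl")
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return {"python": python_tag, "abi": abi_tag, "platform": platform_tag}


def pybi_tags(filename: str) -> dict:
    """Assumed pybi layout: the filename ends in a platform tag only."""
    stem = filename.removesuffix(".pybi")
    return {"platform": stem.split("-")[-1]}


# Hypothetical filenames, for illustration only.
print(wheel_tags("example-1.0-cp310-abi3-win32.whl"))
print(pybi_tags("cpython-3.10.8-win32.pybi"))
```

The python and abi tags describe which interpreter a wheel is compatible with; a pybi is the interpreter, so only the platform is left to describe.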
I guess we could declare pybis to be a special variant of wheels instead if we wanted, but I think it would be more confusing than helpful?
Honestly, I have no idea, I’ve spent zero thought on what happens if you enter a posy environment and then try to make a venv from it :-). I’m not sure what you’d even want to happen? It’s a weird thing to do.
You can certainly use a pybi as the basis for venvs, though – they’re fully-capable Python installs.
I’m not going to rewrite it in C, no. And FWIW the download/verify/extract part is pretty small compared to all the other stuff posy does currently, like parsing package metadata and resolving dependencies.
Is your concern about rust being a bad choice technically, or about having bad optics/marketing?
This doesn’t feel particularly relevant, given you’re not using pip from posy anyways?
Eh, I’m not sure that I agree, and it’s a bit weird to me to be worried about startup speed, then create an env forest construction that has to happen on every invocation, which AIUI is going to explode the length of `sys.path`, which is going to create the same perception of slow commands on its own.
Should it be out of scope? If we’re trying to step back from the Elephant, shouldn’t we be looking at the total scope? It feels like the up front idea, or at least the genesis of the idea was how the various packaging tools seeing “one part of the problem” instead of the whole problem. Declaring that out of scope feels like it’s just solving “two parts of the problem” instead.
What I was thinking of to be honest is that there’s no reason that Wheels have to be quite so specialized to installing only Python packages, and could be evolved such that the same packaging format is useful both for installing Python packages, and not Python packages. After all, there’s nothing inherently different about a pybi and a wheel, they’re both just zip files with some files in them that get unpacked onto the filesystem.
It doesn’t feel like a weird thing to do? A number of tools in Python assume the ability to create virtual environments as part of their workflow. Heck you just told me that in posy, I should use posy to run my deployment tool, what if that uses a virtual environment to create isolated build environments for instance?
It’s not exactly technical, but I’m not sure that optics/marketing is the right word.
I mentioned it above, but one of the initial ideas you mentioned in your post was that all of the existing packaging tools solve one piece of the problem, which means that there’s no coherent answer to an overall workflow. What I see with posy is that it is repeating that same mistake, except maybe it’s solving two pieces of the problem, instead of one.
So while Rust itself is a perfectly fine language to write a tool in, one of the things it’s achieving is papering over the fact that posy still is making the same fundamental mistake that the other tools do, it’s just maybe done a nicer job at it. That papering over makes it easier to ignore the fact that it’s still missing important parts of the packaging story, which I think is a serious drawback to the approach.
But of course, there’s nothing actively preventing you from writing something that actually solves the whole problem, and not just pieces of the problem, in Rust or any other non-Python language.
There is also of course an optics side of things, by having tools that work on Python packaging written in something that isn’t Python, it’s an implicit statement that Python isn’t an acceptable language for CLI programs targeting end users.
Maybe we’re looking at different parts of another elephant, because I agree with the quoted paragraph. Within compiled languages, Rust stands out for several reasons though, which I already mentioned in a footnote, but here again:
The list goes on of course:
- having a native package manager (non-existent in C/C++), so plugging in third-party code like pubgrub is trivial
- having no garbage collector and getting as close-to-the-metal as C/C++ without the horrible string and memory handling
So again, I acknowledge that the same thing could conceivably be done in another compiled language, but Rust is a very good choice from my POV.
What are you referring to here by “the packaging tool”?
Even aside the start-up question, the other two points are nowhere near Python’s strengths. And it’s completely fine IMO if parts of the Python packaging story need things that aren’t Python’s strength (as does Python itself…), but then we shouldn’t shoehorn Python into filling roles it’s just not very good at.
E.g. conda’s resolver is written in Python, and very frequently criticized for its speed. Mamba is faster because it uses a resolver written in C, but even that one is far behind (in terms of UX, e.g. error messages, or maintainability) compared to pubgrub.
I think the “same fundamental mistake” and “papering over” are both
 – the OP sets a clear scope, and admits the tool is not finished, but it does break new and very interesting ground.
I think we’re talking past one another here. You don’t need to write a program in a compiled language to have an artifact that you ship to end users that doesn’t depend on an already installed Python. Saying or implying that you do is simply incorrect.
It’s funny that you’re calling out pubgrub for maintainability and error messages, when the pubgrub library in rust hasn’t had a release in almost 2 years or a commit to their dev or release branch in almost a year, and which I think can’t even fully implement PEP 440 version specifiers.
This statement doesn’t make any sense. Whether or not you set a clear scope doesn’t impact whether or not that scope solves the whole problem or not. The original distutils-sig post that Nathaniel made years ago and referenced again spoke to the fact that part of the problem with the slew of existing tools was that they defined their scope too narrowly, and in doing that failed to make a complete, coherent solution from beginner to advanced.
My assertion is that packaging things for distribution to end users is also part of the packaging story, because well it is, and it’s one of the most chronically underserved parts of our packaging story, and if you want to design “the whole elephant” as it were, you have to also design for that.
I certainly agree that posy has a lot of new, very interesting aspects to it, and by tackling a larger share of the problem (albeit not the entire problem), it has the ability to create a more coherent solution. I just think it’s stopped short.
Sure, you can create an executable out of pretty much anything. But (aside from performance & binary footprint), then we’re back to suitability of a given tool for a given problem.
Now I understand your original statement even less. What do you mean by posy “not being capable enough to produce real world software”?
Looking at the tracker, there are issues being responded to and PRs being worked on, which is a start. Granted, resolvers are not particularly popular anywhere. I know that people are working on that for the mamba side of things; it also hasn’t stopped ruby from migrating there ~3 months ago.
I agree that packaging for distribution is part of the story (and triply so about it being underserved), just not that it’s necessarily unsolvable with the bits and pieces that we now have in front of us.
How about taking into account the new ideas and good bits this brings to the table in the wider discussion happening already, rather than declaring it dead on arrival?
and people get afraid to touch anything about them once things are running, for fear of regressions.
Oh, I just meant, one obvious reason to use Python would be to reuse code from pip etc., but if I’m not doing that then it removes that reason.
I’m not sure if that’s true – it’s just one `readdir` per package? But in any case, since the environment mechanism is an implementation detail, we can always change it later. (E.g., inject some code into the import system so it “just knows” where to find all the packages without having to scan for them.) And anyway, if we wrote the same thing in Python then we’d have whatever penalty from `sys.path` plus the overhead from the tool startup :-).
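For what "inject some code into the import system" could look like, here is a minimal sketch of a meta path finder that resolves top-level imports from a precomputed name-to-directory map instead of scanning every `sys.path` entry. This is not how posy works today; `KnownLocationFinder` and `install_finder` are hypothetical names, and the mapping is assumed to have been built by the tool ahead of time.

```python
import importlib.abc
import importlib.machinery
import sys


class KnownLocationFinder(importlib.abc.MetaPathFinder):
    """Resolve top-level imports from a precomputed name -> directory map."""

    def __init__(self, locations):
        self.locations = dict(locations)

    def find_spec(self, fullname, path=None, target=None):
        directory = self.locations.get(fullname)
        if path is not None or directory is None:
            # Submodule or unknown name: defer to the other finders.
            return None
        # Look in exactly one directory instead of walking sys.path.
        return importlib.machinery.PathFinder.find_spec(fullname, [directory])


def install_finder(locations):
    """Put the finder ahead of the default path-based finder."""
    finder = KnownLocationFinder(locations)
    sys.meta_path.insert(0, finder)
    return finder
```

With something like this in place, `sys.path` would not need one entry per package, at the cost of keeping the map in sync with what is actually installed.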
I guess for me, the elephant metaphor is more about target audience: each person only sees part of the elephant because each person sees their own experience. If everyone who has the problem of running their code in a dev env can use the same tool, then that’s a big win – and you can’t get there without being beginner friendly, without support for multiple environments and python versions, etc.
But no single tool can handle every single aspect of developing and distributing python code. My guess is that users will be pretty happy if we can tell them “here’s the way you install and run tools, if you want a tool for formatting use black, if you want to lint use pyflakes, if you want to create a standalone redistributable app then use pyinstaller or pyoxidizer, … but no matter what you’re doing, you can install and invoke them and share them with your collaborators through the same consistent framework”. I could be wrong though, idk.
Yeah, it’s not super active, and I don’t think they have any real production users yet, so there’s some inevitable immaturity. It worked great for MVP purposes, but I expect we’ll either need to work with them to get some tweaks upstream or end up forking it so we can customize it more. The underlying pubgrub algorithm is really nice though, and having an implementation to start from likewise.
The only issue I know of with PEP 440 specifiers is the thing where if `bar 1.0` depends on `foo >= 1.0a1`, then we’re supposed to automatically toggle on `--allow-pre foo`, if and only if `bar 1.0` ends up in the final set. This is an inherent limitation of the pubgrub algorithm, because what makes the algorithm fast and able to explain its decisions is that it builds up higher-level inferences about your dependency graph as it goes. But if you’ve already explored possible versions of `foo` before you discover the `bar 1.0` dependency, then you might have already learned that it’s impossible for `foo 1.0a1` to be part of your final solution and made further inferences based on it, but then discovering `bar 1.0` invalidates those…
So far posy just says, if you want `foo 1.0a1` to be considered then do the equivalent of `--allow-pre foo` (and we can hand-wavily justify it by saying hey, you probably want to know if one of your transitive dependencies is pulling in some alpha release). idk if that will be good enough or not though; just have to wait and see I guess!
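A toy model of the ordering problem, in case it helps. This is not pubgrub and not posy's code; it only shows that the set of `foo` versions a resolver considers depends on whether the pre-release toggle was on at the time it looked. The version list and the `candidates` helper are invented for the illustration.

```python
# Toy model: each version is (label, is_prerelease).
FOO_VERSIONS = [("0.9", False), ("1.0a1", True)]


def candidates(versions, allow_pre):
    """Versions a resolver would consider, given the pre-release toggle."""
    return [label for label, is_pre in versions if allow_pre or not is_pre]


# Explored before bar 1.0 is seen: pre-releases are off, so the resolver
# may conclude "foo 1.0a1 is never part of the solution".
early = candidates(FOO_VERSIONS, allow_pre=False)

# bar 1.0 requires foo >= 1.0a1, which should switch pre-releases on for
# foo, contradicting the conclusion drawn above.
late = candidates(FOO_VERSIONS, allow_pre=True)

print(early)  # ['0.9']
print(late)   # ['0.9', '1.0a1']
```

Asking the user to pass the toggle up front means the candidate set for `foo` is fixed before resolution starts, so any inference drawn from it stays valid.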