Thinking about “Pip plans to introduce an alternative (zipapp) deployment method” and how that may lead to “Provide a `py run`/`py z` sub-command that searches for `.pyz` files” (Discussion #221, brettcannon/python-launcher) and “Provide git-style sub-command support” (Discussion #222, brettcannon/python-launcher), I was wondering if there was any appetite/interest in letting projects ship `.pyz` files on PyPI? I’m not sure if it’s enough of a benefit compared to `pipx install`/`pipx run`, but it would be one way for projects that are meant to be used as tools to distribute themselves in a way that is discoverable and self-contained (when there are no extension module dependencies). Otherwise how many times do we all need to run `pip install black` and burn the CPU cycles in resolving the dependencies?
Could this help programs meant for people who are not Python developers? (meaning not familiar with venv/pip/PATH, just need to download and install a tool or app for some task that happens to be written in Python)
IMO the hard part for end users is getting Python itself set up, not the tools themselves.
@ofek Sure, but also users may find it difficult or not want to manage multiple environments.
Zipped packages help since they don’t install/corrupt the user’s environment.
This is especially useful for users who only have the system Python and don’t want to set up a separate one. However, this use case breaks down when the zipped package depends on a package that needs extensions (most notably numpy), as Brett mentioned.
I’m not sure how this would help. Would you expect `.pyz` files to show up in the simple index? Would you want to standardise filename format (to allow auto-updating)? Do you imagine pip (or some other tool, which would be my preference) supporting installing/upgrading `.pyz` files?
If all this would offer is that people could go to a project’s “files” page in PyPI and manually download a `.pyz` file, I’m not sure it’s worth it.
Weren’t you the person arguing in the past that people want black and similar tools in their project venv, so they can pin the version per project? FWIW, I agree that people should install development tools like black, flake8, mypy, tox, … centrally. My go-to approach for that is pipx.
I’d argue that for most use cases, `pipx` is a better solution. Shipping a zipapp is IMO a fairly specialised approach, and should only be the choice when the trade-offs have been considered.
Don’t `.pyz` files need to bundle all their dependencies? Package installers like pip can’t avoid doing that, but for anything else it’s an antipattern.
In other words, if you’re already able to install from PyPI, `.pyz` isn’t much of a win.
I like zipapps, I have used them quite a lot. Having zipapps/`.pyz` on PyPI feels like it would improve in principle the story of applications distribution (as opposed to libraries).
Some random thoughts:
- pipx seems to have taken a prominent and growing role in this story (handling of applications). So: if something is done, it has to work well with pipx.
- In any case (pipx or not, zipapp or not), it seems like having some kind of lock-file is a necessary preliminary step, without a lock-file format it feels to me like it would be an incomplete story. The content of a zipapp has to be defined by some kind of a lock-file format, and I bet pipx would be happy to consume such lock-files as well. (I know it is being worked on.)
- There is the issue of having a working Python interpreter in the first place, but I feel like things are getting better: Python launcher(s), Python in the Microsoft Store, ongoing work to fix the sysconfig installation prefixes on Linux, (not sure about this one but maybe) Linux packagers are getting better at packaging Python (broken Python on Debian/Ubuntu for example).
- What is the right way to build a zipapp? What are the rules a zipapp must follow to be allowed to be uploaded on PyPI? Is shiv the right way, or is it pex? Should there be one and only one zipapp that works on all environments (combination of Python minor version, OS, CPU bitness, ABI), or should it be more like wheels and their tags?
I guess what I want to say is that it is a good idea for the distribution of Python applications but only if there is a complete story (from user’s and developer’s point of view).
And it seems to me like there is definitely a demand in the community for such a thing. For example, I have seen countless times the question of how to build a wheel with pinned dependencies (which is quite a misguided question in my opinion, but it reveals a clear intention).
I’ll add to this list (which I agree with, in general!) having a standard, supported way to make a zipapp into an exe on Windows. Struggling with `PATHEXT` and knowing how to handle the edge cases where `.pyz` files are not supported by the OS APIs is a big barrier here. Having a standard launcher that can be prepended (via a tool) to a `.pyz` file to make an `.exe` would be a significant help. (All of the pieces are available, I’ve even written versions of many of them myself, but the key is having a standard, well-known toolset.)
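To illustrate why prepending a launcher works at all, here is a minimal sketch, assuming nothing beyond the stdlib `zipfile` module. The names `FAKE_STUB`, `make_archive` and `read_back` are made up for this example, and the stub is just placeholder bytes rather than a real Windows launcher. The point it shows is that a zip archive stays readable when arbitrary data is placed in front of it, which is the same property the zipapp shebang line relies on.

```python
import io
import zipfile

# Placeholder bytes standing in for a compiled launcher executable.
# This is NOT a working launcher; it only occupies space before the zip data.
FAKE_STUB = b"MZ" + b"\x00" * 62


def make_archive() -> bytes:
    """Build a tiny in-memory zip holding a __main__.py, like a zipapp body."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("__main__.py", "print('hi')\n")
    return buf.getvalue()


def read_back(combined: bytes):
    """Open the stub+zip blob as a zip and return (member names, main source)."""
    with zipfile.ZipFile(io.BytesIO(combined)) as zf:
        return zf.namelist(), zf.read("__main__.py").decode("utf-8")


# Plain concatenation: stub first, archive second.
names, source = read_back(FAKE_STUB + make_archive())
print(names, repr(source))
```

A real tool would swap the placeholder for an actual launcher binary that locates a Python interpreter and hands it the path of its own file; that part is the missing “standard, well-known toolset”.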
However, even having said this, I still think that zipapps will remain niche. Lack of support for binary extensions[1] is a big issue, and lack of tools to manage installation/upgrades is another.
[1] Shiv partially covers this via auto-unpacking, but that has its own issues with housekeeping the unpacked files.
Publishing to PyPI doesn’t necessarily mean installable by pip (unless someone made an arbitrary rule about it, in which case we can modify the rule).
For example, you could publish `black` as a wheel with loose dependencies (for pip), and also as a `pyz` with tested and locked dependencies (for pipx). Or make `.pyz` the extension for “it’s a wheel but you can run it directly”, which wheel doesn’t officially support even though it kind of works.
I’m pretty sure the existing wheel spec would work for it, though we’d likely have to define a default entry point or something (possibly also user-friendly GUI elements). Tools (e.g. pipx) that prefer to work with `.pyz` can choose to extract or use directly or add an executable stub or add a Start Menu item or generate a complete installer or whatever they need to do in order to make it work.
So I think there’s value in a variant of a wheel that:
- (optionally) pins its dependencies (but we’d recommend this)
- always has a default/recommended user entry point (or maybe a new app-like set?[1])
- provides user-friendly GUI elements (icon, display name, support URLs)
- is available from a PSF controlled repository
[1] Prior art here are all the phone app models, which have system-defined entry points for things like background and maintenance tasks, deep linking from other apps, etc. We could also define “on install” or “on first use” and “on uninstall” entry points.
Sorry if I gave the impression that I thought it did. What I mean is that publishing to PyPI doesn’t seem to gain much unless we add infrastructure and tooling to consume such zipapps. Expecting people to manually download them isn’t very helpful.
Yes, someone could write that tooling (you cover the infrastructure in your comments). But I’m not sure that the benefit would be sufficient to justify doing so. In spite of tools like the stdlib zipapp module, and 3rd party tools like shiv, zipapps have remained stubbornly low-key, never really taking off as a distribution format.
Maybe all that’s needed is the tools and infrastructure, and we’ll suddenly see a huge increase in demand. Honestly, I rather hope that’s the case. But we should probably work from known use cases if we plan on putting in that sort of effort. And I didn’t get the impression from @brettcannon’s post that he was thinking of something that ambitious. But I may be wrong.
I don’t think they need to, but they can.
I double-checked the zipapp module in the stdlib and its docs: a zipapp is a zipped directory (including a main module) with a shebang. It could be a few modules that depend on the stdlib, or that depend on external packages being installed, or that includes its dependencies by following the pip recipe in the docs.
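As a concrete illustration of that description, here is a minimal sketch using the stdlib `zipapp.create_archive()` function. The directory name `demo_app`, the file contents and the helper name `build_and_run_demo_zipapp` are assumptions made up for the example; it covers only the simplest case, a single main module that depends on nothing outside the stdlib.

```python
import subprocess
import sys
import tempfile
import zipapp
from pathlib import Path


def build_and_run_demo_zipapp() -> str:
    """Build a one-module zipapp in a temp dir, run it, and return its stdout."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "demo_app"  # hypothetical application directory
        src.mkdir()
        # The "main module": the archive root must contain __main__.py.
        (src / "__main__.py").write_text("print('hello from a zipapp')\n")

        target = Path(tmp) / "demo_app.pyz"
        # interpreter= makes zipapp write a shebang line before the zip data.
        zipapp.create_archive(src, target, interpreter="/usr/bin/env python3")

        # The file really does start with the shebang...
        assert target.read_bytes().startswith(b"#!/usr/bin/env python3\n")

        # ...and can be passed straight to an interpreter.
        result = subprocess.run(
            [sys.executable, str(target)],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout


print(build_and_run_demo_zipapp())
```

Bundling dependencies would mean installing them into the source directory before calling `create_archive()`, as in the recipe from the docs mentioned above; this sketch leaves that out.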
This is right. They can bundle as much as any other directory that ends up on `sys.path`.
I like zipapps, and we use them extensively at work (in fact, one of my colleagues wrote shiv). I also don’t really see much value in putting zipapps in PyPI, and `pipx` pretty much fills that niche for me.
The downside of zipapps, as others have pointed out, is that you need a working Python interpreter to be preinstalled. Maybe that’s getting easier, maybe there are still glitches in that workflow. What I’d really like is a better self-contained app format a la PyOxidizer.
Yes.
Probably.
Some other tool.
Yes, but I also have enough people getting after me for that suggestion that I know it’s somewhat of a losing battle. Enough people I know do `pipx run black` or the equivalent at this point that I’m letting it go and suggesting it’s an option to install locally in the environment if you need a specific version of Black. (In VS Code we are actually just shipping the latest release of a tool in the tool-specific extension to solve the installation issue and use the version installed if a setting is specified, but there are perf reasons for all of that).
I’ve also been known to change my mind.
Need is a strong word, but typically yes.
… until a library dependency the tool depends on updates and thus subsequently breaks you. Since this is meant for apps and not libraries, having a well-tested, locked-down distribution for the app may be useful. But perhaps apps could also distribute lock files when we finally get such a format and that would be another way to deal with version update breakages.
I don’t think that’s necessarily true if you view the `.pyz` file itself almost like a lock file, but a lock file would definitely help.
It has to work when it’s passed to a Python interpreter of the appropriate version. The question of more metadata is an open question.
I wasn’t specifically, but ambitious projects haven’t stopped me before either. You can consider this a question to spark a conversation to see if there’s a direction we would all like to go that makes distributing self-contained apps more appealing.
But it seems `.pyz` apps on their own aren’t enough for folks to want to make them more of a thing on PyPI, which I totally understand.
My personal opinion is that bundling `pipx` with the standard Python distribution would be a better way to solve this problem. This helps solve the bigger bootstrapping problem Python has without the need for users to upload a new format to PyPI.
Other language distributions do this:
- Node bundles `npx` by default
- Rust bundles `cargo install` by default
This would also remove the need for projects like Poetry to publish their own out-of-band installer, as they could rely on all Python developers having access to `pipx`.
Or, perhaps parts of `pipx` could simply be upstreamed into `pip` itself.
I don’t see that happening, as CPython historically keeps packaging separate. It would also make that the “blessed” approach, which will be its own argument as not everyone uses pipx, virtual environments, etc.
I think this is the key reason we keep coming round to the zipapp question. Without a “language standard” solution, the various individual solutions need a way to bootstrap themselves. And the only language feature that allows this is zipapps.
So I see zipapps as a great way for tools like pip that want to bootstrap themselves to do so[1]. And I’d definitely be interested in seeing the stdlib zipapp module gaining a means of building, discovering and managing zipapps from PyPI. But I don’t see that as meaning that zipapps are the “blessed” solution, simply that they are a starting point for the actual tools that we expect people to use.
[1] I think it would be great if pipx provided a zipapp, for example.
Having thought about this topic extensively as part of creating PyOxidizer and related technology, I have a lot of opinions on this topic. I apologize for the long post.
Let’s start by considering why zipapps aren’t more popular. Reasonable people can disagree on the comprehensiveness and relative priority of this list, but here are my reasons (in no particular order):
- a) They aren’t self-contained. You still have the problem of how to download, install, and run them. As others on this thread have already mentioned, at the very least you need some kind of Python distribution / app runner to make this sufficiently turnkey.
- b) There are still a lot of cases where they just don’t work. (Notably extension modules and custom shared library dependencies. Throw in some `__file__` issues as well.)
- c) There’s no obvious distribution channel / you are competing with other distribution channels, like package managers and app stores.
Allowing zipapps on PyPI kinda solves `c`. But if that’s all you do, I’m skeptical of the benefit. There is, however, a chance that adding zipapp support on PyPI effectively nudges zipapp into the limelight and creates a renewed interest in this distribution format. That might be enough to help popularize it and inspire others to build out requisite tooling and enhancements to the format to really make it shine. Considering how many people and tools have tilted at the zipapp windmill, I’m skeptical this occurs.
At a technical level, I find `.pyz`/zipapp kind of underwhelming and deficient and part of me would be disappointed if it were blessed as an app distribution format on PyPI. Don’t get me wrong, the format is simple and elegant and can solve very real problems. It is a very practical and working solution for a lot of people and I mean no disrespect to its creators, maintainers, or users. But a general purpose app distribution format that scales to more demanding Python applications it is not. There’s a reason that some modern Python app distribution tools avoid using zipapp.
Significant limitations in zipapp include (again unsorted):
- d) Inability to handle many extension module use cases. The scenario where an extension module loads a non-extension module `.so`/`.dll`/`.dylib` comes to mind.
- e) A lot of code in the wild relies on `__file__`. Until everyone uses APIs like `importlib.resources`, there’s a lot of code that flat out won’t work due to `__file__` dependence.
- f) There’s a lot of performance left on the table. zlib isn’t environmentally defensible in 2022 (but the best suited compression format currently in the stdlib). `zipimporter` itself is also relatively slow compared to native code (although I haven’t benched with 3.11 yet).
- g) The current format doesn’t really offer you any compelling features. More on this below.
To handle `d`, there needs to be code somewhere (either at zipapp build time or run-time) that is able to peek inside ELF/Mach-O/PE/COFF data structures to tease out shared library dependencies so the run-time importer is able to make shared library dependencies loadable when importing an extension module that depends on them. And this will need to be supplemented with user-defined configs at build time because there are scenarios where an extension module calls `dlopen()` (or equivalent). These manual library loads are impossible to infer via static analysis (though you could probably be tipped off to their existence by the presence of e.g. `dlopen` in a symbol table). So no matter what you need some kind of metadata in the app descriptor conveying shared library dependencies. No such metadata format is formally defined today. Until it is, `zipimporter` uses it, and `.pyz` producing tools emit it, there is a significant barrier to adoption of zipapp for sufficiently complex Python applications.
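For a rough feel of the “be tipped off” idea, here is a deliberately crude sketch. It does not parse ELF/Mach-O/PE structures or real symbol tables; it only recognizes a file’s format from its leading magic bytes and does a raw byte search for loader function names. The function names and the watch list `_LOADER_CALLS` are assumptions invented for this example, and a real tool would need a proper binary parser.

```python
from typing import Optional

# Leading "magic" bytes of common native binary formats.
_MAGIC_PREFIXES = [
    (b"\x7fELF", "ELF"),
    (b"MZ", "PE"),
    (b"\xfe\xed\xfa\xce", "Mach-O"),  # 32-bit, big-endian byte order
    (b"\xce\xfa\xed\xfe", "Mach-O"),  # 32-bit, little-endian byte order
    (b"\xfe\xed\xfa\xcf", "Mach-O"),  # 64-bit, big-endian byte order
    (b"\xcf\xfa\xed\xfe", "Mach-O"),  # 64-bit, little-endian byte order
    (b"\xca\xfe\xba\xbe", "Mach-O (universal)"),  # same magic as Java class files
]

# Hypothetical watch list: names whose presence hints at manual library loads.
_LOADER_CALLS = (b"dlopen", b"LoadLibrary")


def sniff_binary_format(data: bytes) -> Optional[str]:
    """Guess the container format from the first bytes, or None if unknown."""
    for prefix, name in _MAGIC_PREFIXES:
        if data.startswith(prefix):
            return name
    return None


def mentions_runtime_loading(data: bytes) -> bool:
    """Crude hint: does a loader function name appear anywhere in the bytes?"""
    return any(name in data for name in _LOADER_CALLS)


sample = b"\x7fELF\x02\x01\x01" + b"\x00dlopen\x00dlsym\x00"
print(sniff_binary_format(sample), mentions_runtime_loading(sample))
```

Even a positive hit only says that a library might be loaded manually, not which one, which is why the explicit metadata discussed above would still be needed.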
`e` should hopefully be resolved in time. But it requires working code utilizing `__file__` to be ported to a different method for loading resource files. One idea to help lessen the pain would be to allow hooking `io.open()` so path-like values could be routed to non-filesystem I/O. For example, `zipimporter` today exposes `__file__` as e.g. `/path/to/app.zip/foo/bar.txt`. Obviously this isn’t a real filesystem path and `open()` on it will fail. But if we extended the PEP 578 `PyFile_SetOpenCodeHook()` + `io.open_code()` mechanism to all I/O, [meta] path importers like `zipimporter` could define virtual filesystems and allow `__file__` to work, even if actual filesystem files don’t exist.
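The `__file__` problem is easy to reproduce. The sketch below builds a throwaway zip containing a package with one data file (the names `zipdemo_pkg` and `bar.txt` are made up for the example), imports it through `zipimporter` by putting the archive on `sys.path`, and then tries both approaches: `open()` on a `__file__`-derived path, and `importlib.resources.files()` (Python 3.9+).

```python
import importlib
import importlib.resources
import os
import shutil
import sys
import tempfile
import zipfile


def demo_resource_access():
    """Return (file_open_worked, text_read_via_importlib_resources)."""
    tmp = tempfile.mkdtemp()
    archive = os.path.join(tmp, "app.zip")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("zipdemo_pkg/__init__.py", "# demo package\n")
        zf.writestr("zipdemo_pkg/bar.txt", "payload")

    sys.path.insert(0, archive)
    try:
        pkg = importlib.import_module("zipdemo_pkg")
        # __file__ looks like <tmp>/app.zip/zipdemo_pkg/__init__.py, so a
        # path built from it points *inside* the zip file.
        fake_path = os.path.join(os.path.dirname(pkg.__file__), "bar.txt")
        try:
            with open(fake_path, encoding="utf-8") as fh:
                fh.read()
            file_open_worked = True
        except OSError:
            # No such path exists on the real filesystem.
            file_open_worked = False

        # The importer-aware API reads the same resource from the archive.
        resource = importlib.resources.files("zipdemo_pkg").joinpath("bar.txt")
        text = resource.read_text(encoding="utf-8")
    finally:
        sys.path.remove(archive)
        sys.modules.pop("zipdemo_pkg", None)
        shutil.rmtree(tmp, ignore_errors=True)
    return file_open_worked, text


print(demo_resource_access())
```

Code written the first way is what breaks inside a zipapp; porting it to the second form is the migration described above.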
`f` matters if you care about distribution size or milliseconds at interpreter startup time. Startup time matters a lot for frequently executed [CLI] tools. (See my Python startup time post from 2018 where I measured that Mercurial’s test suite spends 10-30% of its wall time just launching Python interpreters, depending on where you measure from.)
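If you want a rough number for your own machine, a minimal sketch like the following times how long it takes to launch an interpreter that does nothing. The helper name `time_interpreter_startup` is made up for this example, and the figure includes process-spawn overhead, so treat it as a ballpark rather than a benchmark.

```python
import subprocess
import sys
import time


def time_interpreter_startup(runs: int = 5) -> float:
    """Return the best wall-clock time (seconds) to start Python and exit."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # "-c pass" runs an empty program, so this is almost pure startup cost.
        subprocess.run([sys.executable, "-c", "pass"], check=True)
        best = min(best, time.perf_counter() - start)
    return best


print(f"best of 5 launches: {time_interpreter_startup() * 1000:.1f} ms")
```

Replacing the `-c pass` arguments with the path to a `.pyz` file would give a comparable figure for a zipapp, including the cost of importing from the archive.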
`g` is a nuanced take. We already established that zipapps by themselves aren’t useful: you need something to run them, and maybe download them. What are zipapps? They are zip files consisting of Python resources (notably `.py`/`.pyc` files). What are wheels? They are zip files consisting of Python resources (notably `.py`/`.pyc` files). So, uh, what’s the difference between a zipapp and a wheel? The answer is… not much.
This leads me to my next point: zipapps today - being fundamentally a zip file of Python resources - are essentially the same thing as wheels. (Actually, they are more crude than wheels: wheels have well-defined metadata files, which zipapp lack.) It is even possible to run a wheel through `zipimporter`! So part of me questions why we’re talking about shoring up `.pyz`/zipapp support. Wouldn’t it be better to make wheels a viable replacement for `.pyz` and have N-1 solutions?
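Here is a small sketch of the “wheel through `zipimporter`” observation. The archive is assembled by hand with `zipfile` and is not a valid wheel (it has no `WHEEL` or `RECORD` file); the project name `wheeldemo` and the helper name are made up for the example. It only demonstrates that a `.whl`-named zip with a module at its root can be imported once it is on `sys.path`, which the wheel format does not officially promise.

```python
import importlib
import os
import shutil
import sys
import tempfile
import zipfile


def import_from_wheel_like_archive() -> str:
    """Import a module straight out of a hand-made, wheel-shaped zip file."""
    tmp = tempfile.mkdtemp()
    wheel = os.path.join(tmp, "wheeldemo-0.1-py3-none-any.whl")
    with zipfile.ZipFile(wheel, "w") as zf:
        # Pure-Python module at the archive root, as in a real wheel.
        zf.writestr(
            "wheeldemo.py",
            "def greet():\n    return 'imported from a wheel'\n",
        )
        # Minimal stand-in for the metadata directory a real wheel carries.
        zf.writestr(
            "wheeldemo-0.1.dist-info/METADATA",
            "Metadata-Version: 2.1\nName: wheeldemo\nVersion: 0.1\n",
        )

    # zipimporter does not care about the file extension.
    sys.path.insert(0, wheel)
    try:
        module = importlib.import_module("wheeldemo")
        return module.greet()
    finally:
        sys.path.remove(wheel)
        sys.modules.pop("wheeldemo", None)
        shutil.rmtree(tmp, ignore_errors=True)


print(import_from_wheel_like_archive())
```

The same limitations listed for zipapps apply here: extension modules and `__file__`-based resource access would not work from inside the archive.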
On that note, I strongly agree with @steve.dower’s assessment that there’s value in a wheel variant optimized for defining applications. If this existed, tools (like my PyOxidizer project) could compete on the best method to install and build that application. Perhaps the tool used to materialize an application could be defined via a mechanism similar to `pyproject.toml`’s `[build-system]`. And a globally installed app launcher would be able to fetch the app wheel from PyPI, download the tool for turning it into an app (maybe the `.pyz`/`zipapp` tool is part of the stdlib), and produce a working application.
And since build systems are just glorified caching systems, maybe PyPI could some day operate a build farm to automatically turn uploaded app wheels into MSIs, DMGs, etc. This is a future for Python application distribution that would actually excite me. I find it infinitely more exciting than `.pyz`/zipapp.
(While I’m here, I want to quickly plug my project oxidized_importer, a pure Rust meta path importer that has a partial (faster!) implementation of `zipimporter` and a custom binary serialization format (python-packed-resources) that is conceptually similar to what `.pyz` does except a bit faster and smarter. I mention this mainly so people are aware that there are attempts to move beyond zip files for Python resource distribution and importing. Thank you for continuing to not give `zipapp` any special APIs that would prohibit people like me from experimenting with new and novel importing techniques!)
Another problem is that to use a zipapp, the consumer has to know that it’s a Python application. A native self-contained format means that the binary is “just” an application. The fact that it’s written in Python is no more important (theoretically) than if it were written in C, Rust, or whatever. Practically speaking, this eliminates the sometimes tricky step of getting a Python installed in the execution environment that is appropriate to run the zipapp.