Standalone app deployment story

Continuing the discussion from Structured, Exchangeable lock file format (requirements.txt 2.0?):

I wrote a summary of the situation on my Mac, but Discourse crashed when I tried to submit, and now my Safari tab hangs whenever I open discuss.python.org :man_facepalming:

My conclusion was that pipsi and pipx are the most popular tools, but they are more workarounds than solutions, and don't really solve the root problem of standalone apps being deployed like libraries.

Should we just start throwing effort at Briefcase? (beeware/briefcase on GitHub: tools to support converting a Python project into a standalone native application.)

For me, I think the ultimate question would be

"How much effort would be involved in convincing the authors of <insert list of standalone tools here> to make this their official distribution method?"

Any solution that doesn't get widely adopted is probably useless, IMO. There's a definite snowball effect involved. We should also look at why tools like py2exe, cx_Freeze and pyinstaller aren't more popular.

Agreed, pipsi and pipx are workarounds. Managing the virtualenvs, particularly when you dump and reinstall your base Python as often as I do, makes them a PITA in practice.

My personal list of tools that I'd like to have available as standalone includes: black, flake8, invoke, devpi-server, pew, pipenv, pylint, pytest, rwt, sqlite-utils, tox, virtualenv, youtube-dl and cowsay (gotta have some fun ones in there :wink:). I have all of them bundled up using shiv at the moment, so (a) I know they work standalone, and (b) I know they are useful to have as standalone apps.
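
For anyone who hasn't tried this kind of bundling, here is a minimal sketch of the underlying idea using the standard-library zipapp module. It is not shiv itself: shiv also bundles dependencies, while this only packs a tiny made-up app (hello_app and both helper names are hypothetical) into one .pyz file that still needs a Python on the target machine.

```python
import pathlib
import subprocess
import sys
import zipapp


def build_hello_pyz(workdir: pathlib.Path) -> pathlib.Path:
    """Bundle a tiny hypothetical app into a single runnable .pyz file."""
    src = workdir / "hello_app"
    src.mkdir()
    # zipapp runs __main__.py when the archive is executed.
    (src / "__main__.py").write_text(
        'print("hello from a zipapp")\n', encoding="utf-8"
    )

    target = workdir / "hello.pyz"
    # The interpreter argument only writes a shebang line into the archive;
    # it does not bundle an interpreter.
    zipapp.create_archive(src, target, interpreter="/usr/bin/env python3")
    return target


def run_pyz(pyz: pathlib.Path) -> str:
    """Run the archive with the current interpreter and return its output."""
    result = subprocess.run(
        [sys.executable, str(pyz)], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

Calling build_hello_pyz on an empty scratch directory and then run_pyz on the result should print the greeting, which is about as far as the standard library goes on its own.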

I've heard talk about Briefcase, but never tried it (or an app bundled using it). So, maybe? Having to alter setup.py means it's hard to experiment with it "from the outside" - you need to check out and modify the project if you want to build a test executable. And if I, as a 3rd party, can make my own standalone app just by saying something like make_standalone pipenv, then it's a lot easier for early adopters to try out a tool like this and offer feedback.

But from a glance at the quickstart, it looks like the setup.py changes for a project are simple enough, so maybe I'll give it a try.

Oh, one other thought. The converse of this is "how much effort would be involved in making this solution so popular/easy to use, that vendors of IDEs like VS Code and PyCharm don't even consider expecting you to install your linter/formatter/whatever in your Python environment, but just assume you'll have it as a standalone app?"

I don't think PyPI distribution is going away in the foreseeable future (if ever at all) unless we outright ban it. It is, however, very possible to convince people to distribute standalones alongside PyPI. People already request alternative packaging for most of these projects (Homebrew is a popular one), so I think it is viable for maintainers to accept as long as we can get a moderate user base and a somewhat "official" status.

Some ideas:

  • A central place to put packages, like PyPI, and an "official" way to install, like pip. Maintainers don't want to host releases, and users want to be told exactly what to do.
  • It needs to be dead simple. Little or no extra configuration. For a setuptools project, just use whatever is available from setup.py and make everything else optional.
  • Dead simple to release, like Twine to PyPI.

I don't know exactly how much effort that would be, but assuming the packaging part can be worked out (that would be the most problematic, I imagine), tooling and infrastructure might be easier since there is already related work to take ideas from.

The big advantage of a standalone app deployment is properly integrating your dependencies. Things that I normally do in mine include flattening the directory hierarchy, zipping as many dependencies as possible, combining DLLs into a single directory, and replacing python.exe with my own (or adding mine alongside, for the sake of libraries that require starting new processes). Capturing this with existing tooling is really quite difficult. However, it's also the best way to deal with DLL Hell (on Windows, maybe others?) as experienced by scipy et al.

I've seen varying experiences with Briefcase. Some have been excellent, others have hit their own edge cases. I can't decide whether it's best to get behind it or to learn from it and push in a new direction (similarly for zipapp and the wild variety of packing tools).

I've got a few teams at work that I'm working with who need this, though, so I expect I'll come up with a concrete set of requirements sooner or later. It will be Windows-centric, unfortunately, but I suspect the required approaches will also have to be fairly platform specific and we'll mainly want to align on configuration rather than implementation.

What is the advantage of zipping dependencies? I do most of the things mentioned here (to deploy our products at work), except the zipping part.

This gets a bit tricky as that gets us into the binary executable distribution game which is historically the apt/Homebrew/etc. type realm. I'm not saying we couldn't do it specifically for Python apps, but it is a new arena.

I'm starting to think that something along these lines and/or moving towards static compilation with C extensions + freezing is the way to go (but maybe the single binary solution is something to target after since we all know that compilation of some popular libraries is not exactly easy :wink: ). But coming up with tooling that would allow sending someone a zip where you say "unzip this and then run/double-click this thing" would be fantastic and also help in the educational world where they have been dying for a way to have e.g. a fifth grader share his game with his friends easily.

Performance. Fewer files to decompress on install and fewer files to index, which can matter on some file systems/platforms.

Nice! We are migrating all of our pexes to shivs. Zipapps are great in environments where you know you have the Python executable, stdlib, etc. and can install all your dependencies. But there's also a use case for distribution of everything, including the interpreter, in a single file. There are snaps and xars, freezing of various flavors, etc. There will always be tradeoffs in size, complexity, duplication of dependencies (sort of the anti-distro problem). I'm not sure any particular solution will be general enough to include with Python. There are just too many competing requirements for a single solution to win, IMHO.

The biggest problem with shivs is the usual one - not being "proper" exes means that they don't work seamlessly on Windows. See my essay here on why exes are the only real solution.

I guess life is just easier on *nix :wink:

Say that again when you're deciding what to call your Python interpreter in the shebang line so it works on both Python 2 and 3 :stuck_out_tongue:

Speaking of *nix, is there an established way to build a "relocatable" CPython like Windows's embedded distribution? It'd be extremely helpful if we're going the flat directory direction, but I have yet to find a definitive solution.

The embedded distribution is a great way to package standalone apps. Having it be cross-platform would be a huge help.

What's Python 2? <wink>

You win :-).

From a Unix perspective, why are pipsi and pipx considered 'workarounds'? I haven't looked at their implementations, but the basic idea of installing an app and its dependencies into a separate directory somewhere, and then linking its launcher script to a directory on $PATH seems perfectly legitimate to me.
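
That basic idea can be sketched with nothing but the standard library. This is an assumption about the general approach, not a description of how pipsi or pipx are actually implemented; the directory layout and all function names are made up for illustration.

```python
import os
import pathlib
import venv


def create_app_env(app_name: str, home: pathlib.Path) -> pathlib.Path:
    """Create an isolated environment for one app under home/venvs/<app_name>."""
    env_dir = home / "venvs" / app_name
    # with_pip=False keeps the sketch fast; a real tool would install pip
    # and then the application itself into this environment.
    venv.create(env_dir, with_pip=False, symlinks=(os.name != "nt"))
    return env_dir


def scripts_dir(env_dir: pathlib.Path) -> pathlib.Path:
    """Directory where an environment keeps its launcher scripts."""
    return env_dir / ("Scripts" if os.name == "nt" else "bin")


def expose_launcher(
    env_dir: pathlib.Path, launcher_name: str, bin_dir: pathlib.Path
) -> pathlib.Path:
    """Symlink one launcher from the environment into a directory on $PATH."""
    bin_dir.mkdir(parents=True, exist_ok=True)
    link = bin_dir / launcher_name
    link.symlink_to(scripts_dir(env_dir) / launcher_name)
    return link
```

Note that the environment still points back at whichever base Python created it, so removing or replacing that Python breaks every app installed this way.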

I think the idea of standalone applications, with dependencies isolated from any other software which might be installed on the system, often gets mixed up with the idea of single file applications, with everything bundled into one binary. There are other reasons you might want a single file application, but you don't need one to achieve a standalone application. E.g. Pynsist produces installers which aim for a standalone application as far as practical, but it's still many files in a directory.

Technically, I think the biggest limitation for standalone applications is how you include the interpreter. Tools like pyenv and conda provide ways to programmatically get a specified interpreter version in a specified path, but they're not (or not yet?) the kind of de-facto standard people build higher level tools around. Python.org provides the 'embeddable' builds for Windows, but they don't include the MSVC runtime DLLs, so they don't work out of the box on all systems.

For me:

  1. There's no way to install multiple versions of the same application. This is why I always go back to managing my own virtual environments.
  2. No real way to ensure two installations are the same. Dependencies of Python packages have open specifiers like >=3.0 (or they should), so in (admittedly rare) situations you have the privilege of hunting down weird bugs caused by subtle package combinations.
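
One way to check whether two installations really match is to compare the exact versions each one ended up with. A rough sketch using importlib.metadata follows; snapshot and diff_snapshots are hypothetical helper names, and the paths would be the site-packages directories of the two environments.

```python
from importlib import metadata


def snapshot(paths=None):
    """Map each installed distribution name to its exact version.

    paths is a list of site-packages directories; None means sys.path.
    """
    kwargs = {} if paths is None else {"path": [str(p) for p in paths]}
    versions = {}
    for dist in metadata.distributions(**kwargs):
        name = dist.metadata.get("Name")
        if name:  # skip broken metadata directories without a name
            versions[name.lower()] = dist.version
    return versions


def diff_snapshots(a, b):
    """Return {name: (version_in_a, version_in_b)} for every mismatch."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}
```

An empty result from diff_snapshots means both environments resolved to the same pins; anything else is a candidate for the kind of bug described in point 2.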

Essentially all problems come from the fact that these tools are built on PyPI, and PyPI (at least as it currently stands) is fundamentally built for installing packages to work together, and is an inappropriate platform for deploying applications. This is my definition of a workaround: It looks the part, but is built on sand, and eventually breaks down if you poke hard enough.

Note that a standalone application does not mean a single executable (although the converse is true). Most Windows and Mac applications consist of multiple files extracted into a directory, but they all qualify as standalone. The fundamental attribute is whether those files can work on their own, as an entity.

Specifically regarding the package combination problem, various package managers solve this in different ways.

  • Curation (e.g. APT, RPM). A group of people hand-pick a set and promise the combination works, and you trust them.
  • Locking. The maintainer supplies their preferred versions to the index server. I believe this is how Ruby solves this? A poor analogy for Python is to supply both pyproject.toml and Pipfile.lock in your package, with pyproject.toml consulted by pip install and Pipfile.lock by pipx install.
  • Distinct packaging formats. You'd submit sdist and wheel to be used as a library, and another package with dependencies vendored to be used as a standalone application.

And all of them (maybe more) have their tradeoffs.

I'm the author of PyOxidizer (https://github.com/indygreg/PyOxidizer), a project that aims to make it easy to embed [C]Python in Rust applications. I introduced that project and described Python packaging problems at https://gregoryszorc.com/blog/2018/12/18/distributing-standalone-python-applications/.

I have a few random thoughts that may be useful for this conversation...

I think a common deficiency with many Python standalone application tools is that they require having a Python distribution on the system. For a "just works" experience, I think it is important for app distributions to provide the runtime, in this case a Python distribution. This is a "hard problem" because building portable binaries is hard, even more so for CPython because of limitations in its build system and 3rd party library dependencies. (I have the python-build-standalone project (https://github.com/indygreg/python-build-standalone) to make this easier.) Tools like Briefcase punt on this problem and delegate to a shebang or a Python launcher.

One of my discoveries with PyOxidizer is the significant performance speedup from importing modules from memory (https://gregoryszorc.com/blog/2018/12/28/faster-in-memory-python-module-importing/ and https://gregoryszorc.com/blog/2019/01/06/pyoxidizer-support-for-windows/). Unfortunately, importing modules from something that isn't a real filesystem can confuse all kinds of Python packages which assume <module>.__file__ is... a filesystem path. There may need to be some PEP work to formalize that __file__ and __path__ are abstract concepts not bound to traditional filesystems. Maybe allow them to be URIs or something, I'm not sure. I think the importing abstractions with the decoupling of finders and loaders are mostly good and flexible enough. But they do seem to be rather traditional filesystem focused. And code in the wild reflects this. Auditing tools should probably be taught to look for __file__ abuse and to steer people towards using the resources API (importlib.abc.ResourceReader) if they aren't doing so already.
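
To make the __file__ problem concrete, here is a small sketch that imports a package from a zip archive, which is already enough to break path arithmetic on __file__. It uses importlib.resources.files (Python 3.9+), a higher-level entry point to the same resources machinery, rather than ResourceReader directly. The package name, the data file and the helper names are all hypothetical, and this has nothing to do with how PyOxidizer itself loads modules.

```python
import os
import zipfile
from importlib import resources


def make_zipped_package(zip_path, package="demo_zipped_pkg"):
    """Write a hypothetical package with one data file into a zip archive."""
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr(f"{package}/__init__.py", "")
        zf.writestr(f"{package}/greeting.txt", "hello")
    return package


def read_via_file_attr(module):
    """The fragile pattern: treat __file__ as a real directory on disk."""
    path = os.path.join(os.path.dirname(module.__file__), "greeting.txt")
    with open(path, encoding="utf-8") as fh:
        return fh.read()


def read_via_resources(package):
    """The portable pattern: let the package's loader locate the data."""
    data = resources.files(package).joinpath("greeting.txt")
    return data.read_text(encoding="utf-8")
```

With the archive placed on sys.path and the package imported, read_via_resources returns the file contents, while read_via_file_attr raises an OSError because the computed path points inside the zip file rather than at a real directory.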

Something else that I would find extremely useful for PyOxidizer (and would be useful for any standalone packaging tool) is to make it easier for these tools to find all required resources (namely package dependencies) for packaging. Today, tools tend to do things like invoke setup.py to populate a virtualenv then consume that. Or they inject themselves into the setup.py process to gather the information they need. It would be super useful to get a build system metadata dump from the packaging tool so app packaging tools could consume this and do useful things. I think some of the recent work around abstracting the build process can help here. But I'm not too familiar with that work. Keep in mind that tools like PyOxidizer need to know extensively about C extensions and binary dependencies so they can be compiled in a portable manner. Complex setup.py scripts that invent their own build steps which can do complicated things undermine standalone application packaging.
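
As a rough illustration of what a tool can already recover from installed metadata, the sketch below walks Requires-Dist entries with importlib.metadata. dependency_closure is a hypothetical helper; it ignores extras and environment markers, and it says nothing about C extensions or binary dependencies, which is exactly the information that is hard to get today.

```python
import re
from importlib import metadata

# Leading project name of a requirement string such as "bar>=3.0".
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def _normalize(name):
    """Normalize a project name so 'Lib_A' and 'lib-a' compare equal."""
    return re.sub(r"[-_.]+", "-", name).lower()


def dependency_closure(root, paths=None):
    """Return {normalized name: version} for root and everything it requires.

    Markers and extras are ignored, so this over-approximates; requirements
    that are not installed are silently skipped.
    """
    kwargs = {} if paths is None else {"path": [str(p) for p in paths]}
    installed = {}
    for dist in metadata.distributions(**kwargs):
        name = dist.metadata.get("Name")
        if name:
            installed[_normalize(name)] = dist

    found = {}
    pending = [_normalize(root)]
    while pending:
        name = pending.pop()
        if name in found or name not in installed:
            continue
        dist = installed[name]
        found[name] = dist.version
        for req in dist.requires or []:
            match = _NAME.match(req)
            if match:
                pending.append(_normalize(match.group(1)))
    return found
```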

Another feature that would be useful is a standard mechanism for declaring app packaging details. Today, various tools have to invent their own setup.py extensions or config files for packaging. It would be useful to have a standard grammar such that most applications/libraries could define the settings once and be packaged using N tools. This may not get the long tail of specialized applications (like say Mercurial). But I'd like to think it would handle 95% of applications.

I'm trying to make https://github.com/indygreg/python-build-standalone and PyOxidizer loosely coupled. The output of python-build-standalone contains a machine-readable file describing the Python distribution and its settings (see the README.rst in that repository). This allows PyOxidizer to theoretically consume any Python distribution (CPython, PyPy, etc). It might be worth standardizing such a format for packaging Python distributions themselves so Python app packaging tools can achieve greater flexibility in the Python distribution they use. e.g. if all you need to do is point the packaging tool at a URL for a Python distribution in a standard format, with a trivial change you could replace CPython with PyPy.
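
To give a feel for what such a machine-readable description might contain, here is a sketch that dumps a few facts about the running interpreter as JSON. The key names are invented for this example and are not the schema python-build-standalone actually uses; see its README.rst for the real format.

```python
import json
import platform
import sys
import sysconfig


def describe_interpreter():
    """Collect a few facts about the running interpreter.

    The key names are made up for this sketch; they are not the schema
    used by python-build-standalone.
    """
    return {
        "implementation": platform.python_implementation(),
        "version": platform.python_version(),
        "platform": sysconfig.get_platform(),
        "executable": sys.executable,
        "stdlib": sysconfig.get_path("stdlib"),
        "extension_suffix": sysconfig.get_config_var("EXT_SUFFIX"),
    }


if __name__ == "__main__":
    print(json.dumps(describe_interpreter(), indent=2))
```

A packaging tool that only read a file like this, instead of probing the interpreter itself, would not need to care whether the distribution behind it is CPython or PyPy.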

I know Iā€™m forgetting a few things. But hopefully this mini brain dump is useful to the discussion.

A community member familiar with "Constructor" asked me to chime in here. Constructor is a tool for creating installers that are self-contained. I don't think it really fits the use case declared here, but maybe for some related use cases it works. Unlike pipenv or something like that, you'd ship an artifact that would contain all the files necessary to create an environment on the other end, rather than a list of dependencies to install. Constructor requires conda to run, but does not assume that conda ends up in the created installer.

We've experimented with nuitka and pyinstaller for getting a standalone conda executable, but ultimately shared libraries were not possible to bundle into a single standalone executable. Nuitka produced a nice folder that had our bundled executable along with a bunch of the core python stack. It was pretty close to being a standalone executable, but would have taken a lot more work to figure out static linking and such to produce a single exe. Having better tools for this would be extremely helpful.
