Why the major release
PEP-405 defines Python virtual environments, and people generally tend to believe that they are a solved problem. I am a maintainer of the tox project, in which step one is usually creating virtual environments. Drawing from two and a half years of maintaining that project, I identified three main pain points:
- Creating a virtual environment is slow: it takes around 3 seconds, even in offline mode. Three seconds does not seem that long, but if you need to create tens of virtual environments, it quickly adds up.
- The API used within PEP-405 is excellent if you want to create virtual environments, but that is all it does. It does not allow us to describe the target environment flexibly, or to do so without actually creating the environment.
- The duality of virtualenv versus venv. Right, Python 3.4 has the venv module as defined by PEP-405. In theory, we could switch to that and forget virtualenv. However, it is not that simple. virtualenv offers a few benefits that venv does not, and I’ve talked about this in my EuroPython presentation:
  - Ability to discover alternate versions (`-p 2` creates a Python 2 virtual environment, `-p 3.8` a Python 3.8, `-p pypy3` a PyPy 3, and so on).
  - virtualenv packages the wheel package out of the box as part of the seed packages. This significantly improves package installation speed, as pip can now use its wheel cache when installing packages.
  - You are guaranteed to work even when distributions decide not to ship venv (Debian derivatives notably make venv an extra package, and not part of the core binary).
  - Can be upgraded out of band from the host Python (often via just pip/curl, so it can pull in bug fixes and improvements without needing to wait until the platform upgrades venv).
  - Easier to extend, e.g., we added Xonsh activation script generation and support for PowerShell activation on POSIX platforms without much pushback.
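To make the `-p` shorthand concrete, here is a minimal Python sketch of how such a value could be split into an implementation name and version numbers. The `PythonSpec` type, the `parse_spec` function, and the accepted grammar are assumptions made for illustration only; this is not virtualenv's actual discovery code.

```python
import re
from typing import NamedTuple, Optional


class PythonSpec(NamedTuple):
    """Hypothetical parsed form of a `-p` value (illustration only)."""

    implementation: Optional[str]  # e.g. "pypy"; None when only a version is given
    major: Optional[int]
    minor: Optional[int]


# Optional name, then an optional "major" or "major.minor" version.
_SPEC = re.compile(r"^(?P<impl>[a-zA-Z]+)?(?:(?P<major>\d)(?:\.(?P<minor>\d+))?)?$")


def parse_spec(value: str) -> PythonSpec:
    """Split a spec such as "2", "3.8" or "pypy3" into its parts."""
    match = _SPEC.match(value)
    if not value or match is None:
        raise ValueError("unsupported spec: {!r}".format(value))
    impl = match.group("impl")
    major = match.group("major")
    minor = match.group("minor")
    return PythonSpec(
        impl.lower() if impl else None,
        int(major) if major else None,
        int(minor) if minor else None,
    )


for example in ("2", "3.8", "pypy3"):
    print(example, "->", parse_spec(example))
```

A real discovery step would then have to search the system for an interpreter matching the parsed request, which this sketch deliberately leaves out.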
In October 2018, I also became a maintainer of the virtualenv project. The project had been on life support for years, and for good reason. Much of the code had no tests, and the project was a single-file runtime (with 2.6k lines of code, plus a whole lot of code embedded as base64). It is one long script in which, at various points, multiple if/else statements try to tailor the logic to the current interpreter type and platform. This made adding Jython or IronPython support, or even improving PyPy support, very hard. There have been a few rewrites that attempted to fix this; notably, Donald Stufft got reasonably far. Nevertheless, the authors of these eventually moved on to other projects, so the rewrites never got promoted.
What changed
I present to you my attempt at the rewrite. The initial design goals were published under this RFC issue. The goal of the rewrite was not just to make the project easier to maintain, but also to address most of the above pain points:
- Moved away from the single file format: this allows separating the virtual environment creation logic per target interpreter (CPython 2, CPython 3, PyPy2, PyPy3 supported).
- Python 3 virtual environments created by virtualenv are now `pyvenv.cfg` based (in essence, they are equivalent to venv).
- Python 2 virtual environments created by virtualenv are now `pyvenv.cfg` based. Instead of injecting our own `site.py`, we now only add a slight shim `site.py` that fixes `sys.prefix` and `sys.exec_prefix` by reading `pyvenv.cfg`, and then delegates site-packages setup by triggering the import of the host `site.py`. Python 2 virtual environments now look a lot like Python 3 venv as a side effect. CPython 2 might now be EOL, but this is very handy for PyPy2, which is still supported, and for anyone still stuck on CPython 2 for any reason.
- Added a venv-based creator. In essence, this delegates virtual environment creation to the target Python’s venv module (note we still control activation script generation and pip/setuptools/wheel seeding). The default is still the builtin method (mostly because calling processes can be expensive, especially on Windows), but one can select this mode via `--creator venv`.
- CentOS/Fedora Pythons are supported (all other platforms should be too: we no longer assume via if/else what the platform’s folders are, but instead use sysconfig/distutils to query the Python interpreter about where things should go).
- Be upfront about what interpreters we support and what we don’t. When we discover an interpreter, we check whether our expectations about the interpreter are met. Microsoft Store Python is now supported: we automatically discover that it does not support our builtin virtualenv generation method (as the Python executable is read-only), and provide only the venv route.
- Provide a Describe interface that provides information about a virtual environment without creating it.
- Significantly improved activation scripts that now support Unicode (emoji) characters. If the file system can encode a character, you can pass it.
- Historically, adding the seed packages (pip/setuptools/wheel) has been done by invoking `pip` and pointing it to the embedded wheels via `--find-links`. This is now available under the `--seed pip` flag.
- The default seed mechanism is now `--seed app-data`. This new model tries to address the performance issues mentioned at the start of this post. 98 percent of those three seconds (on Linux at least; Windows is even slower) is spent on installing the seed packages. Instead of always installing packages from scratch, we use a cache. The first time we install a seed package, we install it into the user application data folder and make it read-only. Then, instead of installing it into the virtual environment’s pure library path (often `site-packages`), we link it from the app-data. We also improved the wheel extraction mechanism, getting it down to 1.8 seconds. The first virtual environment creation will still be slow (2 seconds). However, subsequent ones will run in just 50 milliseconds.
- zipapp support. The advantage of the single file mode was that it made bootstrapping virtualenv itself easy: you just downloaded `virtualenv.py` and you were good to go. To mitigate the fact that we now have multiple files and multiple dependencies, we now ship a zipapp (the 20.0.0b1 version is available here) that one can use in the same fashion. Download `virtualenv.pyz`, point a Python interpreter at it, and you should be good to go.
- All CLI defaults can be changed via `virtualenv.ini` inside the user config folder (or use an environment variable to specify the location of this file).
- Now extensible via package entry points (install packages alongside virtualenv to enable):
  - interpreter discovery mechanism (you have some custom logic specifying where you can find compatible Pythons, use this),
  - virtual environment creation logic (want to load Python from a database, sure thing!),
  - seed package creation (you have a better idea than the app-data design described above, try it by writing your own),
  - activation scripts (you have a new shell, create your own activator script via this).
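The `pyvenv.cfg` based layout mentioned above relies on a small `key = value` text file at the root of the environment, as specified by PEP-405. The following Python sketch shows roughly what reading that file involves. Note that `read_pyvenv_cfg` is a hypothetical helper and the sample file content is made up; this is not the shim that virtualenv actually ships.

```python
import tempfile
from pathlib import Path
from typing import Dict


def read_pyvenv_cfg(path: Path) -> Dict[str, str]:
    """Parse the `key = value` lines of a pyvenv.cfg file into a dict."""
    config = {}
    for line in path.read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that carry no equals sign
            config[key.strip().lower()] = value.strip()
    return config


# Made-up example content; real files are written by venv or virtualenv.
with tempfile.TemporaryDirectory() as folder:
    cfg_path = Path(folder) / "pyvenv.cfg"
    cfg_path.write_text(
        "home = /usr/bin\ninclude-system-site-packages = false\n",
        encoding="utf-8",
    )
    config = read_pyvenv_cfg(cfg_path)

print(config["home"])
```

A shim `site.py` could use the `home` value to work out where the host interpreter lives before handing over to the host `site.py`; the exact logic virtualenv uses for that is not reproduced here.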
These are just some of the changes. The idea is that, at the CLI level, this package should be fully compatible with virtualenv 16.x. Internally, however, it has many improvements.
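As one small illustration of the internal changes, querying the interpreter for its folders instead of hard-coding them per platform can be done with the standard library. This is only a sketch of the general idea using `sysconfig.get_paths()` on the running interpreter; how virtualenv queries a different target interpreter is not shown here.

```python
import sysconfig

# Ask the running interpreter where its install locations are,
# instead of guessing them from the platform name.
paths = sysconfig.get_paths()

for name in ("purelib", "platlib", "scripts"):
    print(name, "->", paths[name])
```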
Call for feedback
I released beta 1 today in the hope that people will try it out and report the bugs they find. Once we fix all the issues people run into, we’ll release it as version 20.0.0. The rewrite branch within the virtualenv repository will become master, with the current master moving to legacy. A final note: the documentation has not been updated yet, but I’ll try to work on this in the following days.
- PyPI 20.0.0b1 - https://pypi.org/project/virtualenv/20.0.0b1/#files
- Zipapp 20.0.0b1 - https://drive.google.com/open?id=1RPoLprfsexuO-AEFcpdSB2DupMdsWcgC (hosted on my personal Google Drive)