(Not sure if this belongs in “help” or “packaging”.)
I am trying to understand workflows around virtual environments, prompted a little bit by the discussion around single-file scripts.
There seems to be a general understanding, at least among the cognoscenti here, that virtual environments are the way to go for pretty much all Python use: we shouldn’t “pollute” the main Python install with third-party packages.
I use Python for a significant fraction of my programming. I am a physicist, and most of that programming is either in the form of Jupyter notebooks or medium-sized programs, with quite a few very short single-file scripts that do a single job as part of the mix.
I find my workflow is a very bad fit for virtual environments. Creating a new venv for each new program or script seems quite heavyweight. Jupyter, in particular, does not play all that well with venvs. It works, but it’s several extra steps for each new environment, and as far as I can tell each one semi-permanently pollutes the list of available Jupyter kernels. Perhaps a fix for this part of the problem needs to come from the Jupyter development side?
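For concreteness, the extra steps I mean look roughly like the sketch below. It only builds and prints the commands rather than running them; the helper names and the `numba-env` directory/kernel name are made up for illustration, and the `bin`/`Scripts` layout is the usual venv default rather than something I have checked on every platform.

```python
import os
import sys
from pathlib import Path


def kernel_setup_commands(env_dir, kernel_name):
    """Return the commands needed to create a venv and register it as a Jupyter kernel.

    Hypothetical helper: nothing is executed here, the commands are only assembled.
    """
    env_dir = Path(env_dir)
    # venv puts the interpreter in Scripts\ on Windows and bin/ elsewhere.
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    exe = "python.exe" if os.name == "nt" else "python"
    env_python = str(env_dir / bin_dir / exe)
    return [
        # 1. create the environment
        [sys.executable, "-m", "venv", str(env_dir)],
        # 2. install the kernel package inside it
        [env_python, "-m", "pip", "install", "ipykernel"],
        # 3. register the environment as a kernel for the current user
        [env_python, "-m", "ipykernel", "install", "--user", "--name", kernel_name],
    ]


def kernel_cleanup_command(kernel_name):
    """Return the command that removes the kernel entry again."""
    return ["jupyter", "kernelspec", "uninstall", kernel_name]


for cmd in kernel_setup_commands("numba-env", "numba-env"):
    print(" ".join(cmd))
print(" ".join(kernel_cleanup_command("numba-env")))
```

The last command is the part that is easy to forget: deleting the venv directory does not remove the kernel entry, which is why the kernel list accumulates stale names.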
Moreover, almost all of my work uses the same mix of packages. One recent counter-example is Numba, which is usually pinned to the not-quite-latest version of NumPy. So I’ve actually created a new venv, and Jupyter kernel, for just this, but this means I need to know ahead of time if I’m going to need it, since sometimes there are advantages to using the latest version of NumPy rather than whatever Numba is pinned to.
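One way to see what a given environment actually has, and what an installed package declares about a dependency, is via `importlib.metadata` from the standard library. This is only a sketch: the `describe` helper is hypothetical, and the prefix match on the requirement string is deliberately crude.

```python
from importlib import metadata


def describe(dist_name, dependency):
    """Return (installed version of dist_name, its requirement strings naming dependency).

    Returns (None, []) if dist_name is not installed in the current environment.
    """
    try:
        version = metadata.version(dist_name)
        requires = metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return None, []
    # Crude filter: keep requirement strings that start with the dependency name.
    wanted = [r for r in requires if r.lower().startswith(dependency.lower())]
    return version, wanted


# What does the installed Numba (if any) say about NumPy?
print(describe("numba", "numpy"))
```

Run from inside each kernel, this at least tells me which NumPy constraint I am living with before I start a notebook.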
I’m not completely sure where I’m going with this, but I suppose I wanted to clarify the underlying philosophy here. The Python Tutorial discusses virtual environments solely (?) as a solution to the problem of clashing dependencies. Is that why we should use them? Or do they have other advantages?
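As a side note on what a venv changes from the interpreter's point of view, the usual check is to compare `sys.prefix` with `sys.base_prefix`. A minimal sketch, assuming environments created by the standard `venv` module (very old `virtualenv` releases signalled this differently); the function name is my own:

```python
import sys


def in_virtual_environment():
    """Return True if this interpreter is running inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("prefix:     ", sys.prefix)
print("base prefix:", sys.base_prefix)
print("in a venv:  ", in_virtual_environment())
```

Running this in a notebook cell is a quick way to confirm which environment a given Jupyter kernel is really using.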
Slightly, or more than slightly, controversially, I feel that they are a bad solution to the dependency problem, if that really is their only raison d’être, although of course I am not sure that I have a better one. The most obvious other kind of proposal is a modification to import, which puts the burden on the calling code rather than the package itself (although within a distributed package it seems simple enough).
Thanks for your thoughts on the matter.