Installing packages close to the project root

I was going to file a bug report (installation options are poorly documented), but it doesn’t seem like documentation reports are expected, so I thought I’d bring it up here. My goal is to run a site locally in a Docker container, but make the packages easily accessible from the host:

services:
    site:
        ...
        volumes:
            - .:/site

My idea is to have them at ./site-packages. For that I’m going to set PYTHONPATH=site-packages and create a setup.cfg:

[install]
install-lib=./site-packages
install-scripts=./site-packages/bin
install-data=./site-packages

It took me a while to find this solution. Using --target is the closest supported option, but I have to specify it every time, and to be honest it looks like a kludge: it doesn’t check whether the package is already installed (and just warns in that case), it doesn’t notify you if a script is not on PATH, and it could potentially let you install a package for an unsupported version of Python.
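For reference, the --target approach I mention looks roughly like this (the requirements file and module names here are just examples):

# Has to be repeated for every install:
pip install --target=site-packages -r requirements.txt
# And the interpreter still needs to be pointed at the directory:
PYTHONPATH=site-packages python -m mysite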

More on it here.

It feels to me like you’re re-inventing virtual environments. Could you describe in more detail why virtual environments do not suit your use case?

Also, all of the path-related install options are being deprecated in pip; we really, really don’t want you to use them.

A while ago I migrated a project to docker. To sum up the relevant parts, I was installing packages with -t site-packages and setting PYTHONPATH=site-packages. I believe it took me some time back then to find this way as well. But recently I forgot to pass the -t part. And so I thought there must be a better way. I couldn’t find much documentation, so I started inspecting the code. (Looking ahead, it’s not clear even now what exact cases --target, --prefix, --root are trying to cover.) The result can be observed in that gist on GitHub.

Why do I need Docker? It makes the development environment closer to the production one (e.g. in terms of OS packages). It simplifies migrations: for example, I recently had to migrate from Python 3.5 to Python 3.6, and with Docker that would be a matter of changing a couple of lines in the Dockerfile and deploying the site. It’s a well-known tool, meaning possible new participants wouldn’t have much trouble with it. Last but not least, it’s a not-so-new shiny toy to play with :slight_smile: That’s a joke, to some extent, although there might be cases where I use it when it isn’t strictly necessary.

That said, I could make use of venv and have the packages at ./env/lib64/pythonX.Y/site-packages. But then it would be a virtual environment inside a virtual environment, and I’d probably have to activate it in the Dockerfile (not sure; it may be more complex than that). Another option would be to let pip install packages where it does by default (/usr/local/lib/pythonX.Y/site-packages), but that location is not visible from the host. Okay, if I need to look at the code I can just run another vim in the container, which is not something to be afraid of. I do that… often? occasionally? But with the above-mentioned setup I can have the packages at ./site-packages, visible from the host.

And yes, I understand that the custom installation scheme is not supported, which is certainly a downside of my solution. But I’m thinking of giving it a try. After all, why not? :slight_smile: Well, I should have put a disclaimer there saying:

The stunts are performed by trained professionals, do not try this at home.

But for now I don’t know what can go wrong.

Using a virtual environment (venv) within docker is 100% OK, and I recommend doing that. It helps insulate your application from the system Python environment in the container, preventing accidental changes to the system packages.

What you’re trying to do is recreate something similar to a virtual environment. I’d suggest using one directly rather than trying to create a hacky alternative using install options.

Virtual environments are less magical than they seem. Essentially, a virtual environment simply provides an alternative way to run Python that automatically sets sys.prefix for you, so you don’t need to remember to pass --prefix to pip or set PYTHONPATH yourself. You don’t even need to activate it. Something like this would work fine, and almost transparently:

WORKDIR /my_project
RUN python -m venv ./env

# Automatic `pip install --prefix=./env`!
RUN ./env/bin/pip install ...

# Automatic `PYTHONPATH=$PWD/env/lib/python3.9/site-packages`!
# (Not technically true; it is actually *smarter* than PYTHONPATH.)
RUN ./env/bin/python ...

It is fundamentally designed to solve your exact use case. But sometimes the fancy advertising around the feature makes it (somewhat ironically) difficult for professionals to realise what it actually is :slightly_smiling_face:

I guess specifying precise versions in the Dockerfile (like FROM python:3.6-alpine3.12) eliminates the need to insulate yourself from the system Python environment, as long as you do it both in development and production.

Indeed, that sounds like something I’d recommend over my own solution for a development environment, although env/lib/pythonX.Y/site-packages is quite a lot of typing. But you’re avoiding some questions.

What use cases do --target, --prefix, and --root cover?

What are the downsides of using setup.cfg? One I can name myself: it requires more knowledge of how things work on the part of the developer. Are there others?

And there’s probably one more thing, the reason we’re discussing this at all: I saw no mention of a recommended setup for Docker in the documentation. Or to put it another way, what could keep developers from searching for hacky solutions like I did? Like, some part of the documentation that says, “For Docker, do this…” (an authoritative source).

They are for deploying to a location that you don’t want to be, or even can’t be, a virtual environment. One common usage of --target, for example, is to install packages into an embeddable Python package, which pip can’t run inside.

A use case for --prefix and --root is short-lived, probably in-process temporary environments, where you have no problem passing arguments and setting environment variables repeatedly (presumably because it’s done by a script or function); these options are a bit more performant than python -m venv.
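Roughly, such a throwaway setup with --prefix might look like this (the path, Python version, and package name are only illustrative):

# Install into a scratch prefix instead of a venv:
pip install --prefix=/tmp/scratch somepackage
# Point the interpreter at it explicitly each time:
PYTHONPATH=/tmp/scratch/lib/python3.9/site-packages python -c "import somepackage"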

In general, behaviours that depend on an external file are considered more difficult to maintain due to the additional context switches. They are subtle compared to options visible directly in the command (either passed options like --prefix or specific commands like those used with virtual environments), because a future maintainer (including yourself some time later) must inspect the whole environment instead of simply reading a few lines of commands (or even just one) to get a full picture of what’s going on. In a workflow scenario (such as the Dockerfile use case), explicit commands are less error-prone than external configuration files. (This is not to say that configuration files are useless; they are still very useful in day-to-day workflows, and pip provides support for that.)
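For example, pip can read install options from its own configuration file, something like this (a sketch only; the file’s exact location depends on the platform):

# pip.conf / pip.ini
[install]
prefix = /some/prefix

But a plain pip install --prefix=/some/prefix ... in the Dockerfile keeps the whole picture in one place.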

Because all the recommendations are the same, with or without Docker. We recommend using virtual environments outside of Docker, and we recommend virtual environments inside a Docker container. You (among many others) seem to assume things must be different in a container because… I don’t know. They are not.


Because of absolute paths, it’s easiest if the venv path within the container is the exact same as on the host.
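A sketch of what that can look like in docker-compose (the path is just an example); scripts inside the venv record absolute paths, so they only resolve on both sides if the paths match:

services:
    site:
        volumes:
            # Mount the project at the same absolute path it has on the host
            - /home/me/project:/home/me/project
        working_dir: /home/me/project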

For development, a venv in a container is great; but it’s no substitute for an actual build, release, and deploy workflow (wherein there is a build artifact that is a zipapp or a package, possibly built in a previous stage of a multi-stage Dockerfile, as sketched below).
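A minimal sketch of that multi-stage idea, assuming the application can be built into a wheel (image tags and the mysite name are illustrative):

FROM python:3.9-alpine3.13 AS build
WORKDIR /src
COPY . .
# Build wheels for the app and its dependencies in the build stage
RUN pip wheel --wheel-dir=/wheels .

FROM python:3.9-alpine3.13
COPY --from=build /wheels /wheels
# Install only from the pre-built wheels in the final image
RUN pip install --no-index --find-links=/wheels mysite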

File permissions from pip installs default to the umask’d permissions; whereas OS packages should set permissions so that the app can’t overwrite itself.

FPM can easily build packages from a virtualenv directory.
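For example, something along these lines with fpm’s generic dir source should work (package name, version, and paths are only examples):

# Package an existing virtualenv directory as-is into a .deb
fpm -s dir -t deb -n mysite -v 0.1.1 /opt/mysite/env=/opt/mysite/env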

When you install a versioned build output like pkgname-v0.1.1-a646bcda.ext, you can tell what commit hash a bug appears in.

If you build portable manylinux wheels with e.g. the auditwheel manylinux container, they may not work in Alpine containers, because Alpine uses musl instead of the manylinux-standard glibc.
https://github.com/pypa/manylinux#docker-images


https://github.com/pypa/pip/issues/3969#issuecomment-247381915 :

manylinux1 support isn’t globally enabled on all Linux instances. We do some checks to try and figure out if the platform is manylinux1 friendly or not (mostly related to checking glibc version) but those checks are not very strict. I think it’s really just the glibc version check.

You can override our detection logic (assuming you’re still on a Linux at all that is) by writing a _manylinux.py file somewhere on sys.path and adding a single variable to it:

{sys.path}/_manylinux.py

manylinux1_compatible = True # or False if you

Sorry for the delay… I’d like to summarize things the way I see them now. Though first I’d like to present another gist: it describes the approaches to running sites under Docker locally that I’m aware of, because what follows depends on it.

So, on one hand, you can indeed create a virtual environment in a container, which sounds somewhat superfluous to me. Meaning, as long as you’re pretty specific about the image (e.g. python:3.9-alpine3.13), there should be no reason to isolate yourself from the system Python. But by using a virtual environment you can make the packages visible from the host.

On the other hand, you can install the packages to their default location (/usr/local/lib/pythonX.Y/site-packages), without creating a virtual environment. That way they won’t be visible from the host, but you can always launch vim or something in the container and inspect them all you want. And under macOS you are probably out of other options: you can create a volume with a virtual environment, but there seems to be not the slightest benefit to it in this case. Creating a virtual environment is extra effort, and the packages won’t be visible from the host with or without it.

Then, as @uranusjr clarified, --target is for embeddable packages. --root and --prefix are probably for e.g. building OS packages (see link c) or something like that. --root lets you install files into a temporary root filesystem (a directory that looks like a root filesystem), archive that directory, and obtain an archive that, when unpacked, puts the files in the proper places. --prefix is needed if you want to put the files at some custom location, like /opt, under that temporary root filesystem. Although at least in Arch Linux they seem to prefer… setuptools, or distutils? I’m not sure. Which kind of makes sense: python setup.py install --root=tmproot is like building a package from source, while pip install --root tmproot ... is more like installing packages to be used by something else. For example, by a Python script that wasn’t published on pypi.org for whatever reason but uses some Python packages; to build an OS package for that script it makes sense to use pip install --root=... to install the dependencies.
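If I understand it right, the --root scenario looks something like this (paths and the package name are only examples):

# Install into a throwaway root filesystem, with /usr as the prefix inside it
pip install --root=/tmp/pkgroot --prefix=/usr somepackage
# Files end up under /tmp/pkgroot/usr/..., ready to be archived
tar -C /tmp/pkgroot -czf somepackage.tar.gz .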

Let me first make it clear the way I see the solution from the original post now:

FROM python:X.Y-alpineZ.A
ENV PYTHONPATH site-packages
# COPY does not expand ~, so target the root user's home directory explicitly
COPY .pydistutils.cfg /root/
...

With that, you can do docker-compose exec site pip install -r requirements.txt to install the packages. ~/.pydistutils.cfg makes the packages get installed into site-packages, and PYTHONPATH makes python find them.
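For reference, ~/.pydistutils.cfg here contains essentially the same [install] section as the setup.cfg from the original post, something like:

[install]
install-lib=./site-packages
install-scripts=./site-packages/bin
install-data=./site-packages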

Why do I not do pip install in a local (development) Dockerfile? Because if I later bind-mount . (host) into /site (container), then a virtual environment at /site/env exists in the image but is visible neither from the container nor from the host. I.e. I need to do pip install after launching a container anyway, so why do it in the Dockerfile? The effect is nullified by bind-mounting the project root into the container.

But indeed, with the original solution one must know what PYTHONPATH is and what the content of .pydistutils.cfg means. And indeed, without a virtual environment one would generally look for packages either at /usr/local/lib/pythonX.Y/site-packages or at ~/.local/lib/pythonX.Y/site-packages. It might not be easy to find out why they are not there and what affected that. This can be somewhat mitigated by adding comments, but it’s still a downside.

First, because it sounds like a virtual environment inside a virtual environment (Docker). Also, probably because Python is somewhat different from other languages in this respect. With Ruby you have bundler, which is to Ruby what pip is to Python. And you have chruby/rbenv/rvm, which let you have several versions of Ruby installed alongside one another. So venv naturally comes off as something akin to the latter (chruby/…), since a virtual environment contains its own Python (or so it seems) and all. Then, I saw no chruby/… stuff in Docker containers, and neither is that the case with Node.js, PHP, and probably others. Which made me think virtual environments are not needed there either. And by the way, there is no venv in the Docker tutorial (see link d).

I believe such a command once gave me a message along the lines of something being missing in pip._internal, but that was probably a rather old pip.

Why? I can see only one reason: if you see a path in the output, you can use it as-is both on the host and in the container. Which is kind of mild, but okay, an upside.

P.S. Maybe something is different about the way we use Docker for development, and as a result we’re having a hard time understanding each other about the pip/venv stuff.

Since I can’t put more than 2 links in a post, here are the rest of the links:

c: svntogit-community/PKGBUILD at 7a9b126a326cbbc1f1710e978f2e886331e40c08 · archlinux/svntogit-community · GitHub
d: Quickstart: Compose and Django | Docker Documentation

As I said, what you’re trying to do is basically how virtual environments work. You seem to have already decided you don’t want to use the standard virtual environment tools to achieve your goal. I don’t think I have anything more to suggest, so I’ll say farewell and good luck.