How to delete wheel, setuptools and pip packages from the base Python installation in a Docker image

I’m interested in eliminating the wheel, setuptools, and pip packages from the base Python installation in /usr/local/lib/python3.9/site-packages after installing some packages in a virtual environment. What is the correct way to delete them in a Dockerfile?

I was thinking about:
RUN python -m pip --python /usr/local/bin/python uninstall wheel setuptools pip --yes

Is that the right way? Should I worry about other things?

You probably won’t gain much by deleting pip/wheel/setuptools. Deleting files that are present in a Docker layer just creates a new layer that records the deletion; it doesn’t remove them from the earlier layers of the image.
pip and setuptools are present in all of the official base Python images, so even if you delete them in one of your own layers they’re still going to be part of the image that has to be downloaded. The filesystem of a container started from the image will no longer contain them, but the image itself won’t be any smaller.
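
To make the layering point concrete, here is a minimal sketch, assuming the python:3.9-slim official image as the base (the tag is only an example):

```dockerfile
# The base layers already contain pip, setuptools and wheel.
FROM python:3.9-slim

# This adds one more layer that marks those files as deleted.
# The base layers above are unchanged and still have to be downloaded.
RUN python -m pip uninstall --yes wheel setuptools pip
```

Running docker history on the resulting image should show the original base layers at their full size, followed by one small extra layer for the uninstall step.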

If you want to build a Docker image where you install something and then remove it later, there are two ways that will actually achieve that:

One way is to write a very long RUN command containing multiple commands. You can download and install all of the support packages, then do whatever you needed them to do, and remove the unwanted stuff all in a single RUN command. That creates only one layer which contains only the files added/removed or changed from the previous layer.
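
A rough sketch of that single-RUN approach, assuming a Debian-based base image and a hypothetical package called somepackage that needs a compiler to build:

```dockerfile
FROM python:3.11-slim

# Install the build tools, use them, and remove them again in ONE layer,
# so the compiler never ends up in any layer of the image.
# "somepackage" is a placeholder, not a real requirement.
RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential \
    && pip install --no-cache-dir somepackage \
    && apt-get purge -y --auto-remove build-essential \
    && rm -rf /var/lib/apt/lists/*
```

Note that this only helps for things you add yourself; it cannot shrink anything that already exists in the base image's layers.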

The other way, which generally works out better (IMHO), is to build multiple images in a single Dockerfile:
Start with a very minimal base, call it ‘base’. Then, in a stage declared ‘FROM base AS build’, install compilers, libraries, and whatever else you need, and then install the packages you actually want. Now start a new image ‘FROM base AS final’ and use ‘COPY --from=build’ to pull exactly the packages you need from the intermediate image.

I usually do multi-image builds, creating a virtualenv in the build image and installing all of the required packages there. The final image then just needs to do COPY --from=builder --chown=app /venv /venv and you have a working Python virtualenv in the final image without the intermediate crud.

Here’s the sort of thing I do. Note that it installs poetry to handle the package installation, but there won’t be any poetry in the final image. It may also upgrade pip, but the final image just has whatever pip was in the base image:

# syntax=docker/dockerfile:1
FROM python:3.11 as base

WORKDIR /app
RUN useradd -m app && chown -R app:app /app

FROM base as builder

ENV PIP_DEFAULT_TIMEOUT=100 \
    PIP_DISABLE_PIP_VERSION_CHECK=1 \
    PIP_NO_CACHE_DIR=1

RUN pip install --upgrade pip poetry
RUN python -m venv /venv
ENV PATH="/venv/bin:${PATH}"
ENV VIRTUAL_ENV=/venv

COPY poetry.lock pyproject.toml /app/
# --no-dev is deprecated since Poetry 1.2; --only main is its replacement.
RUN poetry config virtualenvs.create false \
    && poetry install -v --only main

COPY --chown=app . /app/
RUN poetry build && pip install dist/*.whl

FROM base as final

COPY --from=builder --chown=app /venv /venv
USER app
WORKDIR /app
ENV PATH="/venv/bin:${PATH}"
ENV VIRTUAL_ENV=/venv

CMD ["/venv/bin/python", "-m", "myapp"]

Thank you for your insights:

  1. I’m already exploiting a multi-stage Docker build as suggested by your own example.

  2. I understand that the official Python Docker image used as base in my own Dockerfile already contains those base packages. So the best thing I can do with little effort is to remove those packages so that the container deployed in production won’t have them.

Out of curiosity, have you ever considered using “distroless” Python images?
Here is a reference: GoogleContainerTools/distroless (github.com), language-focused Docker images minus the operating system.


I hadn’t come across those distroless images, thanks. They look interesting (though I see it says the Python one is experimental).

I guess the other easy thing you could do is just copy and edit the Dockerfile for one of the official Python images. They all seem to install pip as the last thing, so by removing that step you could have most of a slim or alpine Python, but without pip, as your base. Then just do the pip installation as the first step in the build image.

One thing you could also do is build the virtual environment using --without-pip (venv) or --no-pip --no-setuptools --no-wheel (virtualenv), and use pip --python /path/to/venv/bin/python to install in the virtualenv (needs pip 22.3 or later).
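
As a minimal sketch of that idea in a build stage, assuming the python:3.11 image and using requests purely as a placeholder package:

```dockerfile
FROM python:3.11 AS builder

# pip --python needs pip 22.3 or later, so upgrade the base pip first.
RUN python -m pip install --no-cache-dir --upgrade pip

# Create the virtual environment without its own copy of pip.
RUN python -m venv --without-pip /venv

# Use the base image's pip to install into /venv.
# "requests" is only a placeholder for your real dependencies.
RUN python -m pip --python /venv/bin/python install --no-cache-dir requests
```

The resulting /venv contains the installed packages but no pip of its own, so it can be copied into a final stage as-is.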

Thank you for your insights. I already install the pip dependencies in a virtual environment created with the --without-pip flag, and then install them by specifying the external Python interpreter that has pip, e.g. --python .... The thing is that, when copying the virtual environment folder onto the runtime image, which is based on the official Python image, that image has the wheel, setuptools and pip packages pre-installed system-wide, and I wish to purge them.