How do I locally host a PyPI repository on an air-gapped server?

How do I locally host a PyPI repository on an air-gapped server and populate it with some popular Python packages?

Is there a way to:

  1. Export the current state of a Python virtual environment to a file (e.g. interpreter version, all installed modules, etc.)
  2. Transfer the export file to a laptop where pip can access the internet
  3. Run pip against the export file (as a baseline reference) to download required packages
  4. Import the downloaded packages into the locally hosted PyPI repository

You’re responsible for maintaining an appropriate version of the Python interpreter (usually via a package manager for your platform).

To export the list of currently installed packages to a file named requirements.txt, run:

pip freeze > requirements.txt
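
Note that pip freeze records package pins only, not the interpreter version asked about in the question. As a rough sketch (the function name export_environment and the "# python_version" comment line are my own convention, not something pip defines), the standard library can produce a similar listing with the interpreter version on the first line. Because that line is a comment, pip should still accept the file as a requirements file:

```python
import sys
from importlib import metadata


def export_environment() -> str:
    """Return the interpreter version plus name==version pins as text."""
    version = ".".join(str(part) for part in sys.version_info[:3])
    # Hypothetical convention: store the interpreter version in a comment line.
    lines = [f"# python_version: {version}"]

    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that have no usable metadata
            pins.add(f"{name}=={dist.version}")

    lines.extend(sorted(pins, key=str.lower))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Redirect to a file, e.g.: python export_env.py > requirements.txt
    sys.stdout.write(export_environment())
```

Unlike pip freeze, this does not handle editable installs or packages installed from a URL, so treat it as a starting point only.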

After that, you can refer to the Installing from local packages section of pip’s documentation for instructions on how you can download packages and install them locally.
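
Roughly, that section boils down to two pip invocations: pip download on the online machine and pip install --no-index --find-links on the offline one. Here is a minimal sketch that only builds the command lines; the helper names and the "wheelhouse" directory are placeholders of mine. If the laptop differs from the server in Python version or platform, pip needs --only-binary=:all: together with --python-version / --platform:

```python
import shlex
import sys


def download_command(requirements, dest, python_version=None, platform=None):
    """Build a `pip download` command for the internet-connected machine."""
    cmd = [sys.executable, "-m", "pip", "download", "-r", requirements, "-d", dest]
    if python_version or platform:
        # pip only allows targeting another interpreter/platform for wheels.
        cmd.append("--only-binary=:all:")
    if python_version:
        cmd += ["--python-version", python_version]
    if platform:
        cmd += ["--platform", platform]
    return cmd


def offline_install_command(requirements, wheel_dir):
    """Build a `pip install` command that never contacts an index."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-index", "--find-links", wheel_dir,
        "-r", requirements,
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    print(shlex.join(download_command("requirements.txt", "wheelhouse")))
    print(shlex.join(offline_install_command("requirements.txt", "wheelhouse")))
```

The resulting lists can be passed to subprocess.run(cmd, check=True) once you are happy with them.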

You can also build your own PyPI repository with the pip2pi project.

Maybe some ideas here:

Bandersnatch is a tool that can be used for this.

The workflow goes something like:

  1. On an online machine, create a bandersnatch configuration file defining the packages you want to mirror.
  2. Run an initial mirror command (bandersnatch mirror); this creates a directory structure in the mirror folder, including the web root
  3. Copy the generated web root to your offline environment
  4. Set up a web server like nginx or apache to serve the web folder as static files
  5. In your offline environment, configure pip to use the web server's host and path as its default index (see pip's index-url option and the "configuring pip" documentation; you can use either an environment variable or a config file)
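
Steps 1 and 5 both come down to small INI files, so here is a sketch that generates them with configparser. The option names (directory, master, workers, timeout, the allowlist_project plugin, index-url, trusted-host) are the ones I believe bandersnatch and pip use, but please check them against the docs for your installed versions. The host mirror.internal, the path /srv/pypi, and the package names are made up:

```python
import configparser
import io
from urllib.parse import urlsplit


def _render(cfg):
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


def bandersnatch_config(mirror_dir, packages):
    """Render a bandersnatch.conf that mirrors only the listed projects."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg["mirror"] = {
        "directory": mirror_dir,
        "master": "https://pypi.org",
        "workers": "3",
        "timeout": "10",
    }
    # Multi-line values: configparser writes each item on an indented line.
    cfg["plugins"] = {"enabled": "\nallowlist_project"}
    cfg["allowlist"] = {"packages": "\n" + "\n".join(packages)}
    return _render(cfg)


def pip_config(index_url):
    """Render a pip.conf pointing pip at the offline mirror."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg["global"] = {"index-url": index_url}
    parts = urlsplit(index_url)
    if parts.scheme == "http" and parts.hostname:
        # pip does not trust plain-HTTP hosts unless told to.
        cfg["global"]["trusted-host"] = parts.hostname
    return _render(cfg)


if __name__ == "__main__":
    print(bandersnatch_config("/srv/pypi", ["requests", "numpy"]))
    print(pip_config("http://mirror.internal/simple/"))
```

Instead of a pip config file you can also export PIP_INDEX_URL (and PIP_TRUSTED_HOST) in the offline environment.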

The web root created by Bandersnatch follows the Simple Repository API structure. pip can understand and use this directory structure just by setting the index URL, whether it is accessed via the file system/NFS or over HTTP. Serving it as static files is not the most efficient setup without additional configuration, but for a mirror of, say, the top 500 packages plus their transitive dependencies it is more than adequate.
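
To give a feel for what that structure looks like, here is a simplified sketch that builds a flat Simple Repository API tree from a directory of already-downloaded wheels and sdists. This is not how Bandersnatch itself lays out files (it keeps them under a separate packages tree), and it omits the hash fragments a real index would include. The function names are mine:

```python
import html
import re
from pathlib import Path


def normalize(name):
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def project_name(filename):
    """Guess the project name from a wheel or sdist filename."""
    if filename.endswith(".whl"):
        return filename.split("-")[0]  # wheel names escape '-' as '_'
    for suffix in (".tar.gz", ".zip"):
        if filename.endswith(suffix):
            return filename[: -len(suffix)].rsplit("-", 1)[0]
    return None


def page(body):
    return f"<!DOCTYPE html>\n<html><body>\n{body}\n</body></html>\n"


def build_simple_index(package_dir):
    """Write simple/<project>/index.html pages for the files in package_dir."""
    root = Path(package_dir)
    projects = {}
    for path in sorted(root.iterdir()):
        name = project_name(path.name) if path.is_file() else None
        if name:
            projects.setdefault(normalize(name), []).append(path.name)

    simple = root / "simple"
    simple.mkdir(parents=True, exist_ok=True)
    for name, files in projects.items():
        (simple / name).mkdir(exist_ok=True)
        # Links are relative: simple/<project>/ -> ../../<file>
        links = "\n".join(
            f'<a href="../../{html.escape(f)}">{html.escape(f)}</a><br>'
            for f in files
        )
        (simple / name / "index.html").write_text(page(links))

    listing = "\n".join(f'<a href="{n}/">{n}</a><br>' for n in sorted(projects))
    (simple / "index.html").write_text(page(listing))
    return projects
```

After running build_simple_index("wheelhouse"), serving the wheelhouse directory and pointing pip's index URL at its simple/ subdirectory should be enough for small setups.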

On subsequent runs of bandersnatch mirror on the online machine, you can use the diff-file configuration option to have it output a list of the new files it downloads. You can use this list to transfer only the new or changed files over to your offline network, where they can be extracted over the top of the existing ones.
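
One way to do that transfer, as a sketch: read the diff file and pack the listed files into a tarball with paths relative to the mirror directory. I am assuming here that the diff file contains one file path per line, so check what your bandersnatch version actually writes. The function name and arguments are placeholders:

```python
import tarfile
from pathlib import Path


def pack_new_files(diff_file, mirror_root, archive_path):
    """Pack the files listed in diff_file into a .tar.gz, relative to mirror_root."""
    root = Path(mirror_root).resolve()
    packed = []
    with tarfile.open(archive_path, "w:gz") as tar:
        for line in Path(diff_file).read_text().splitlines():
            line = line.strip()
            if not line:
                continue
            path = Path(line)
            if not path.is_absolute():
                path = root / path
            path = path.resolve()
            if not path.is_file():
                continue  # listed but no longer present; skip it
            # Raises ValueError if the path lies outside the mirror directory.
            rel = path.relative_to(root).as_posix()
            tar.add(path, arcname=rel)
            packed.append(rel)
    return packed
```

On the offline side, extracting the archive into the mirror directory (for example with tar -xzf new-files.tar.gz -C /path/to/mirror) places the new files over the existing tree.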
