Is it safe to hardlink files between .venvs?

Given two separate virtual environments, each containing a file with exactly the same contents somewhere under .venv/lib/pythonX.Y/, is it generally safe to replace one of those files with a hard link to the other?

I’d generally expect this to be safe, but just asking if anyone knows a reason it’s not.

I work in data science, and generally have a lot of “small” projects, each with its own virtual environment (often managed by Poetry). Several of the installed dependencies are quite large, such as PyTorch or the CUDA libraries.

I did a quick search for .venvs under my home directory. The 5 largest ones have a combined size of 27 gigabytes. By hard linking all the identical .so files, this can be reduced to about 12 gigabytes. This is the file type I feel safest hard linking; I have at times accidentally edited a .py file in a .venv, so maybe I would not hard link those…
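For anyone who wants to try the same thing, here is a minimal sketch of how the deduplication could be done in Python. It is not the script used for the numbers above; the function names (`file_digest`, `hardlink_duplicates`) and the temporary-file suffix are made up for illustration. It assumes all the environments live on one filesystem (hard links cannot cross filesystems) and that nothing is writing to the files while it runs. Note that hard-linked files share one inode, so they also share permissions and timestamps, and an in-place edit through one path shows up in every environment.

```python
import hashlib
import os
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()


def hardlink_duplicates(roots, pattern="*.so", dry_run=True):
    """Replace byte-identical files under `roots` with hard links.

    Returns an estimate of the bytes that were (or would be) saved.
    """
    # First pass: group by device and size, which is cheap and avoids
    # hashing files that cannot possibly be identical.
    by_size = defaultdict(list)
    for root in roots:
        for path in Path(root).rglob(pattern):
            if path.is_file() and not path.is_symlink():
                st = path.stat()
                by_size[(st.st_dev, st.st_size)].append(path)

    saved = 0
    for (_dev, size), candidates in by_size.items():
        if len(candidates) < 2:
            continue
        # Second pass: group the remaining candidates by content hash.
        by_hash = defaultdict(list)
        for path in candidates:
            by_hash[file_digest(path)].append(path)

        for paths in by_hash.values():
            keep, *others = sorted(paths)
            keep_ino = keep.stat().st_ino
            for dup in others:
                if dup.stat().st_ino == keep_ino:
                    continue  # already a hard link to the kept file
                saved += size
                if not dry_run:
                    # Hypothetical temp name; link first, then swap
                    # atomically so `dup` never goes missing.
                    tmp = dup.with_name(dup.name + ".hardlink-tmp")
                    os.link(keep, tmp)
                    os.replace(tmp, dup)
    return saved
```

Calling `hardlink_duplicates([...], dry_run=True)` with a list of .venv paths only reports the potential savings; pass `dry_run=False` to actually replace the files.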

I think it is safe. You might be interested in uv, which, if I understood correctly, does this kind of deduplication by default by linking files from its cache into each environment. I also think this is the default behavior in the “conda world”.

I would install a new Python that’s separate from the system one, install all the shared dependencies into it globally, and then create the venvs from it using --system-site-packages.
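As a rough sketch of that setup, the standard library's venv module can create such an environment programmatically. The helper name `create_shared_venv` is made up for illustration, and the script has to be run with the separate Python that holds the shared packages. On the command line the same thing is `python -m venv --system-site-packages .venv`.

```python
import os
import venv
from pathlib import Path


def create_shared_venv(env_dir):
    """Create a venv that can also import the base interpreter's site-packages."""
    env_dir = Path(env_dir)
    builder = venv.EnvBuilder(
        system_site_packages=True,   # same effect as --system-site-packages
        symlinks=(os.name != "nt"),  # symlink the interpreter where that is the norm
        with_pip=False,              # skip bootstrapping pip to keep the sketch fast
    )
    builder.create(env_dir)
    return env_dir
```

Packages installed into the venv itself still take precedence over the shared ones, so a project can pin a different version of a dependency when it needs to.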

What you’re asking is exactly what conda does with the environments it makes.

Because the files are hard linked, the environments take up less disk space than a simple ls -lh makes it appear.
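To see the difference between apparent and actual size, one can count each inode only once. This is a small sketch with a made-up helper name (`tree_sizes`); it sums file sizes rather than allocated blocks, so the numbers are approximate.

```python
import os


def tree_sizes(roots):
    """Return (apparent, unique) byte totals for the files under `roots`.

    `apparent` counts every path; `unique` counts each inode once, so
    hard-linked copies are not counted again.
    """
    apparent = 0
    unique = 0
    seen = set()
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                # lstat: a symlink contributes its own (tiny) size, not its target's
                st = os.lstat(os.path.join(dirpath, name))
                apparent += st.st_size
                key = (st.st_dev, st.st_ino)
                if key not in seen:
                    seen.add(key)
                    unique += st.st_size
    return apparent, unique
```

Passing several environments in one call matters: files shared between them are only recognized as shared if they are counted together.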

This is an interesting read