Slow startup on Windows for virtual environments

paugier · December 30, 2024, 9:05pm

I found out that Python startup is very slow on Windows for virtual environments.

When no environment is activated, Measure-Command { python -c pass } gives, after a few iterations, something like 65 ms. Interestingly, when an environment is activated, the same command takes about ten times longer (typically 0.6 to 0.7 s).

A practical consequence is that Python applications installed with pipx or uv start very slowly.

I tried with virtual envs created with venv, virtualenv, pipx and uv, and I get the same results. In contrast, conda environments do not suffer from this issue (but they are not really Python envs).

I also tried using the full path to the interpreter in the virtual env, and I still see the problem.

I didn’t find anything on this subject on the web, so my first guess is that I’m doing something wrong. But what could be wrong? Is it a known issue? Can anyone with a Windows machine try this?

# repeat a few times
Measure-Command { python -c pass }
python -m venv tmp-startup
.\tmp-startup\Scripts\Activate.ps1
# repeat a few times
Measure-Command { python -c pass }

I tried python -X importtime -c pass but it does not seem to explain the difference, so I don’t understand where it comes from. Maybe I should have a look at how virtual envs are implemented on Windows.
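One way to check whether imports account for the gap is to compare the time reported by -X importtime with the wall-clock time of the whole process. The following is only a rough sketch: the function name import_time_vs_wall is made up for illustration, and it assumes the usual "import time: self | cumulative | module" line format on stderr.

```python
import re
import subprocess
import sys
import time

# Data lines written by -X importtime look like:
#   import time:       123 |        456 | module.name
# The header line has no digits in the first column, so it is skipped.
LINE = re.compile(r"^import time:\s+(\d+)\s+\|\s+(\d+)\s+\|")


def import_time_vs_wall(python_exe):
    """Return (sum of per-module self times in ms, wall-clock time in ms)."""
    start = time.perf_counter()
    proc = subprocess.run(
        [python_exe, "-X", "importtime", "-c", "pass"],
        capture_output=True,
        text=True,
    )
    wall_ms = (time.perf_counter() - start) * 1000
    # Column 1 is the self time in microseconds for each imported module.
    self_us = sum(
        int(m.group(1))
        for m in map(LINE.match, proc.stderr.splitlines())
        if m
    )
    return self_us / 1000, wall_ms


if __name__ == "__main__":
    exe = sys.argv[1] if len(sys.argv) > 1 else sys.executable
    imports_ms, wall_ms = import_time_vs_wall(exe)
    print(f"{exe}: imports {imports_ms:.1f} ms, wall clock {wall_ms:.1f} ms")
```

If the import total is similar for the base and the venv interpreter while the wall-clock time differs a lot, the extra time is being spent outside the import system.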

jeff5 · December 31, 2024, 8:21am

On my creaky, old machine (AMD Athlon II X4, Win10, CPython 3.12.8) I’m getting 435ms and 570ms, so only about 30% slower.

Perhaps there is a cache somewhere that the venv makes ineffective? Or what if the system Python is already in memory?

… Yes, a little experimentation seems to support that last idea. If I start a second shell window, the results in the first window are affected enormously by whether I run Python (just the REPL waiting at the prompt) in the second window, and whether it is the system or the venv version. (There’s a copy of the .exe in the venv directory, but it shares DLLs etc with the system Python.)

My guess is that your very short, non-venv startup time is because your system Python is already mostly in memory, due to use elsewhere, in an IDE maybe. (I have an IDE open, but it is running a 3.11 exe.)

JamesParrott · December 31, 2024, 10:42am

After a few goes, I got two similar results in 3.14.0a3

> Measure-Command {%LOCALAPPDATA%\Programs\Python\Python314\python.exe -c "pass" }
...
TotalMilliseconds : 101.8554

(the full path was used, not %LOCALAPPDATA%. Powershell wouldn’t like that.)

> Measure-Command { c:\path\to\venvs\test3.14\Scripts\python.exe -c "pass" }
...
TotalMilliseconds : 107.0215

(venv made afresh, using \venvs> python -m venv test3.14)

paugier · December 31, 2024, 11:07am

Thanks for these hypotheses.

I don’t know what could use the Python executables used to create the virtual envs. As I tried to explain, I also get this behavior with Python installed with UV, and I don’t see how they could be used for anything.

In C:\Users\me\AppData\Roaming\uv:

.\python\cpython-3.13.1-windows-x86_64-none\python.exe -c pass takes ~ 0.08 s
.\tools\black\Scripts\python.exe -c pass takes 0.5 s.

I also tried to use .\tools\black\Scripts\python.exe in another terminal and it changes nothing.

Note that Python startup to do nothing (-c pass) should be very short, on the order of a few ms, so 80 ms is not “very short”. On other similar machines on Linux, it takes something like 15 ms.

0.5 s is too long. It seems to me that something special is done in the venv case.

barry-scott · December 31, 2024, 11:11am

I can reproduce differences in startup time, but not the slowness you are seeing. My CPU is a 12th Gen Intel(R) Core™ i7-12700K.

Using this script in a Windows 11 VM running under KVM on Fedora 41 I see these results:

#!/usr/bin/env python
import sys
import subprocess
import time

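def main( argv ):
    num_trials = 10
    trial_time = 0.0
    for trial in range(num_trials):
        # perf_counter is a monotonic, high-resolution clock suited to timing
        s = time.perf_counter()
        # pass '-c' and 'pass' as two separate arguments
        subprocess.run([argv[1], '-c', 'pass'])
        e = (time.perf_counter() - s)*1000
        trial_time += e

    avg = trial_time / num_trials

    print(f'ran {argv[1]} in average {avg:.3f}ms')
    return 0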

if __name__ == '__main__':
    sys.exit( main( sys.argv ) )

C:\Users\barry>py -m venv qqq

C:\Users\barry>py t.py qqq\Scripts\python.exe
ran qqq\Scripts\python.exe in average 24.559ms

C:\Users\barry>py t.py py
ran py in average 33.191ms

C:\Users\barry>py t.py c:\Python313.win64\python.exe
ran c:\Python313.win64\python.exe in average 18.096ms

Under Fedora 41 with the same script I see:

$ python3 t.py qqq/bin/python3.13
ran qqq/bin/python3.13 in average 12.635ms

$ python3 t.py /usr/bin/python3.13
ran /usr/bin/python3.13 in average 11.218ms

jeff5 · December 31, 2024, 11:32am

I don’t totally follow this. The venv Scripts directory contains a copy of python.exe (and pythonw.exe), used when the venv is active. My hypothesis is that the system Python is in use somewhere, and so a second process is able to start quickly.

Edit:
In @barry-scott’s test, the executable invoked is in use running the test …
looking properly, I realise that’s not so.

It would be very short on my machine. Relatively short, if you prefer.

JamesParrott · December 31, 2024, 11:34am

but not the slowness you are seeing

100ms isn’t an unacceptable UX for any Windows user. Human reaction times are ~250ms.

For a $200 laptop, running a recent Windows 11 home edition, with several Chrome and Firefox tabs open, and about 100 other background processes (I roughly counted), I’m very happy it’s as fast as it is!

I can easily make things a lot slower than that, believe me ;-).

4GB RAM. Intel(R) Celeron(R) 6305 @ 1.80GHz. SSD (perpetually almost full).

barry-scott · December 31, 2024, 12:25pm

I reran the tests using python3.10 to run the t.py script and see the same results.

You have assumed that if the executable is running elsewhere, that will speed up the test. And it will, but simply the fact that it ran recently is enough for the Windows and Linux kernels to cache the executable in memory.

It is the caching that is the big win. Even if the executable is running elsewhere, python’s pages can be dropped from memory if there is memory pressure.
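A rough way to see the caching effect is to keep the first run separate from the later ones instead of averaging everything. This is only a sketch: startup_times_ms and summarize are names invented here, and the first run is only really "cold" if the executable has not been run recently, since the script cannot flush the OS file cache.

```python
import statistics
import subprocess
import sys
import time


def startup_times_ms(python_exe, runs=10):
    """Time `python_exe -c pass` `runs` times; return the timings in ms."""
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([python_exe, "-c", "pass"], check=True)
        times.append((time.perf_counter() - start) * 1000)
    return times


def summarize(times):
    """Separate the first (possibly cold) run from the later (warm) runs.

    Needs at least two timings.
    """
    first, rest = times[0], times[1:]
    return {"first_ms": first, "warm_median_ms": statistics.median(rest)}


if __name__ == "__main__":
    exe = sys.argv[1] if len(sys.argv) > 1 else sys.executable
    print(exe, summarize(startup_times_ms(exe)))
```

A large gap between first_ms and warm_median_ms points to caching. A warm median that stays high, as in the 0.5 s case reported above, points to something else.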

paugier · December 31, 2024, 12:41pm

100ms isn’t an unacceptable UX for any Windows user. Human reaction times are ~250ms.

100 ms for just starting Python without any import is quite long, because it adds to the import times of libraries and because a CLI is not only called by humans. For example, editors like VSCode call git status and git diff. CLIs have to be responsive. However, I agree that 0.1 s is still fine.

In contrast, 0.5 s is unacceptable for most command line applications. Imagine having to wait half a second or more for each git call. If such numbers are normal for Python, one should just stop using Python for a lot of CLIs, and Mercurial should just be fully ported to Rust.

For my case, there is clearly a big issue (that I don’t yet understand) because python can start on this computer in less than 0.1 s and in virtual envs, it starts in typically 0.6 s.

paugier · December 31, 2024, 1:12pm

My hypothesis is that the system Python is in use somewhere

UV and miniforge do not use the system Python. For UV, the base Python executables are in C:\Users\me\AppData\Roaming\uv\python and they are used by nothing.

As I said, I reproduce this behavior with conda-forge Python executables and UV Python executables. I don’t think they share anything in terms of DLLs.

Starting the base Python executables is relatively performant. Starting all python.exe in virtual envs is much slower.

The hypothesis “base Python already used” seems a bit strange to me because I don’t see why all these different interpreters would be used (and I get similar results for these different interpreters).

Moreover, the startup time for python.exe in virtual envs is stable (after a few calls), even if this Python executable is also used in another terminal.

This issue still needs to be understood, and a fix needs to be found.

It might be something silly in my setup, but one needs to understand what happens and why Python in venvs cannot be nearly as efficient as base Python executables.

jeff5 · December 31, 2024, 1:31pm

That’s what I thought might be true, but repeated runs did not make much difference, and what I observed on my machine was that starting a REPL made a large difference (e.g. -50%) to the time to run -c pass, as long as it was the same executable. That was interesting I think.

One version on one machine, of course.

JamesParrott · December 31, 2024, 1:41pm

Ah OK. I don’t know what venv does, but within the pip source code, when installing a library with a CLI entry point (in project.scripts) on Windows, whereas it would be simple to just write a batch file (on Posix pip just makes a .sh file), pip actually builds a skeleton Windows .exe. Part of this involves generating XML of the required structure.

This won’t explain why your uv venvs are also slower on your machine, but have you compared virtualenv and venv venvs?

paugier · January 1, 2025, 8:17am

Yes I did. Same results

barry-scott · January 1, 2025, 12:49pm

Try running your test with python -v -c pass to see all the modules that are loaded. Are any of them your customisations?

paugier · January 2, 2025, 1:07pm

Interestingly, the first line printed (import _frozen_importlib # frozen) appears only after a short delay, which seems to explain the difference in startup time. First, nothing happens for a short time (typically 0.5 s), and then the imports seem to be done at the same speed (just an impression, no measurement of course).

So the difference seems to be about something done before the first import…
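That impression can be turned into a number by timing how long it takes for the first line of -v output to show up on stderr. This is a minimal sketch, with time_to_first_line_ms being a name made up for the example. It assumes the -v trace goes to stderr and is not held back in a buffer.

```python
import subprocess
import sys
import time


def time_to_first_line_ms(python_exe):
    """Return (ms until the first -v line appears, ms until the process exits)."""
    start = time.perf_counter()
    proc = subprocess.Popen(
        [python_exe, "-v", "-c", "pass"],
        stderr=subprocess.PIPE,
        stdout=subprocess.DEVNULL,
    )
    proc.stderr.readline()      # blocks until the first line of -v output
    first_ms = (time.perf_counter() - start) * 1000
    proc.stderr.read()          # drain the rest so the child can exit
    proc.wait()
    total_ms = (time.perf_counter() - start) * 1000
    return first_ms, total_ms


if __name__ == "__main__":
    exe = sys.argv[1] if len(sys.argv) > 1 else sys.executable
    first_ms, total_ms = time_to_first_line_ms(exe)
    print(f"{exe}: first line after {first_ms:.1f} ms, exit after {total_ms:.1f} ms")
```

Running it once on the base python.exe and once on the venv python.exe would show whether the extra half second sits entirely before the first import.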

paugier · January 2, 2025, 8:49pm

I notice two interesting new facts:

python.exe in the root Python is not the same as python.exe in the virtual environment. The two files do not have the same size.
I can also see a very similar issue without a virtual environment. If I pip install black in the root Python:

$ Measure-Command { C:\Users\me\miniforge3\envs\env-empty\Scripts\black.exe --version }
TotalMilliseconds  : 820
$ Measure-Command { C:\Users\me\miniforge3\envs\env-empty\python.exe --version }
TotalMilliseconds  : 86
$ Measure-Command { C:\Users\me\miniforge3\envs\env-empty\python.exe -c "import black as m; print(m.__version__)" }
TotalMilliseconds  : 250

The first result is really bad. 0.8 s to print a version is much too long, in particular when the same version can be printed with Python in 0.250 s!

It seems that the .exe file created by pip during the installation of black (black.exe) is also very inefficient (losing approximately 0.5 s at each call). Why?

It seems that black.exe and python.exe in a virtual environment do something before starting, which takes approximately half a second here. Or maybe Windows does something before actually launching them.

paugier · January 2, 2025, 8:56pm

Even better:

$ Measure-Command { C:\Users\me\miniforge3\envs\env-empty\Scripts\black.exe --version }
TotalMilliseconds  : 820
$ Measure-Command { C:\Users\me\miniforge3\envs\env-empty\python.exe -m black --version }
TotalMilliseconds  : 250

The two commands do exactly the same thing!

JamesParrott · January 3, 2025, 9:33am

What results do you get with an official Python install, not a miniforge3 one? Third party repackagers could’ve done anything. And if you’re using a Python installed by conda, isn’t it best to create the environment with conda too?

methane · January 3, 2025, 9:56am

github.com

python/cpython/blob/main/PC/venvlauncher.c

/*
 * venv redirector for Windows
 *
 * This launcher looks for a nearby pyvenv.cfg to find the correct home
 * directory, and then launches the original Python executable from it.
 * The name of this executable is passed as argv[0].
 */

#define __STDC_WANT_LIB_EXT1__ 1

#include <windows.h>
#include <pathcch.h>
#include <fcntl.h>
#include <io.h>
#include <shlobj.h>
#include <stdio.h>
#include <stdbool.h>
#include <tchar.h>
#include <assert.h>

This is python.exe in venv.
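Going only by the header comment quoted above, the redirector's job can be sketched in a few lines of Python. This is an illustration, not a port of venvlauncher.c. It assumes the pyvenv.cfg sits either next to the executable or one directory up, that it contains a "home = ..." line, and that the base executable has the same file name as the launcher. read_pyvenv_cfg and find_base_python are hypothetical names.

```python
from pathlib import Path


def read_pyvenv_cfg(cfg_path):
    """Parse the simple `key = value` lines of a pyvenv.cfg file into a dict."""
    settings = {}
    for line in Path(cfg_path).read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip().lower()] = value.strip()
    return settings


def find_base_python(venv_exe):
    """Return the path the redirector would be expected to launch, or None."""
    exe = Path(venv_exe)
    # Assumption: look next to the executable, then in the venv root above it.
    for directory in (exe.parent, exe.parent.parent):
        cfg = directory / "pyvenv.cfg"
        if cfg.is_file():
            home = read_pyvenv_cfg(cfg).get("home")
            # Assumption: the real interpreter in `home` has the same file name.
            return Path(home) / exe.name if home else None
    return None


if __name__ == "__main__":
    import sys
    print(find_base_python(sys.argv[1]))
```

So every start of the venv python.exe means two process launches (the redirector, then the real interpreter), which may be relevant if something on the machine makes each process creation expensive.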

barry-scott · January 3, 2025, 10:39am

Are you running antivirus software? Is it scanning DLLs as you run the tests?