With PEP 703 (free-threaded CPython, available as an optional build since 3.13), threads can run truly in parallel without the GIL. This changes the calculus for web frameworks:
- WSGI + ThreadPool: Pure sync, minimal overhead, direct parallelism
- ASGI + async: The event loop adds overhead each time it dispatches a sync handler to a thread via run_in_executor()
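In my benchmarks, a pure sync ThreadPoolExecutor approach is 3-4x faster for CPU-bound handlers than ASGI + threadpool. I attribute the gap to async boundary overhead: every call has to be scheduled from the event loop onto a worker thread, and its result handed back to the loop.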
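For anyone who wants to try the comparison themselves, here is a minimal sketch of the two dispatch paths. It is not my actual benchmark harness: `cpu_handler`, `run_sync`, `run_async`, and `compare` are hypothetical names made up for illustration, the workload is an arbitrary sum of squares, and no real WSGI or ASGI server is involved, so it only isolates the direct-pool path versus the `run_in_executor()` path.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def cpu_handler(n: int) -> int:
    """Hypothetical CPU-bound handler: sum of squares below n."""
    return sum(i * i for i in range(n))


def run_sync(pool: ThreadPoolExecutor, work: list[int]) -> list[int]:
    """WSGI-style path: hand each request straight to a worker thread."""
    return list(pool.map(cpu_handler, work))


async def run_async(pool: ThreadPoolExecutor, work: list[int]) -> list[int]:
    """ASGI-style path: the event loop forwards each request to the same pool."""
    loop = asyncio.get_running_loop()
    futures = [loop.run_in_executor(pool, cpu_handler, n) for n in work]
    return list(await asyncio.gather(*futures))


def compare(requests: int = 200, size: int = 20_000, workers: int = 4) -> dict:
    """Time both paths on identical work and check they agree on the results."""
    work = [size] * requests
    with ThreadPoolExecutor(max_workers=workers) as pool:
        t0 = time.perf_counter()
        sync_results = run_sync(pool, work)
        t1 = time.perf_counter()
        async_results = asyncio.run(run_async(pool, work))
        t2 = time.perf_counter()
    return {
        "sync_s": t1 - t0,
        "async_s": t2 - t1,
        "same_results": sync_results == async_results,
    }


if __name__ == "__main__":
    print(compare())
```

The timings will depend heavily on the build (free-threaded or not), the worker count, and how much work each handler does, so treat any ratio this prints as a rough indication rather than a reproduction of the 3-4x figure.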
Questions:
- Should new frameworks for free-threaded Python be sync-first (WSGI)?
- Is there work on optimizing ASGI for thread-based handlers?
- Do we need a new standard that’s thread-native from the start?