Add Virtual Threads to Python

The problem is: I call a third-party library function foo.bar(). Is that function a switch point? What about in the next version of bar? What if it adds logging output? Is that now a switch point? What if I had logging turned off, but now I turn it on? In what libraries did I just add switch points? At least with function coloring it’s trivial to answer these questions.

9 Likes

I understand your concern, but my point is that it usually doesn’t matter.

def query(url: str) -> None:
    res = requests.get(url)
    foo.bar(res)
    log.info('request complete')

Every one of those calls could be async, but so what? In the broader context of my program, query() will run in a coroutine and it will switch a couple of times. The only things I really need to worry about are shared resources and blocking. Worrying about the former is required whether you’re working with threads, asyncio, gevent, etc. Worrying about the latter is required for any non-preemptive multitasking.

Further, no, I don’t know at a glance that foo.bar() is a switch point, but if it really matters, I can always find out!

edit: To extend this just slightly, since we’re talking hypotheticals… foo.bar() might also add .1s of CPU processing to every call. That would have a much more significant effect on my async program than a new switch point.
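To make the blocking point concrete, here is a minimal sketch using asyncio. It assumes that time.sleep() is an acceptable stand-in for CPU work inside a call like foo.bar(); the worker names, the task count, and the delay are all made up for illustration.

```python
import asyncio
import time


async def cooperative_worker(delay: float) -> None:
    # Yields to the event loop while waiting, so other tasks can run.
    await asyncio.sleep(delay)


async def blocking_worker(delay: float) -> None:
    # Assumption: time.sleep() stands in for CPU-bound work in a library call.
    # It never yields, so every other task on the loop waits behind it.
    time.sleep(delay)


async def run_workers(worker, count: int = 4, delay: float = 0.05) -> float:
    # Run `count` workers concurrently and return the elapsed wall time.
    start = time.perf_counter()
    await asyncio.gather(*(worker(delay) for _ in range(count)))
    return time.perf_counter() - start


def compare() -> tuple[float, float]:
    cooperative = asyncio.run(run_workers(cooperative_worker))
    blocking = asyncio.run(run_workers(blocking_worker))
    return cooperative, blocking


if __name__ == "__main__":
    cooperative, blocking = compare()
    print(f"cooperative: {cooperative:.3f}s, blocking: {blocking:.3f}s")
```

The cooperative workers overlap, so the total is close to one delay. The blocking workers run back to back, so the total is roughly the delay multiplied by the number of tasks, regardless of how many switch points the code has.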

1 Like

Agreed, but now the problem is “just add locks like you would for multi-threaded code”, which the industry does not have an awesome track record with!

In practice, not any more than you would have to with any async framework. The vast majority of the time it will be very clear which library calls are likely to do IO and which are not, as well as which are likely to keep state and which are not. A dict keeps state, but is unlikely to switch. log.info might switch, but it is unlikely to keep state. Libraries that keep state internally will likely protect it with locks, because threading is still a reasonable thing to do.
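As a rough illustration of what protecting shared state around a switch point looks like, here is a sketch. It uses asyncio only because that makes the switch deterministic and easy to reproduce; the same read-modify-write hazard applies to gevent or threads. The Counter class and function names are hypothetical.

```python
import asyncio


class Counter:
    """Hypothetical shared state."""

    def __init__(self) -> None:
        self.value = 0


async def unsafe_increment(counter: Counter) -> None:
    current = counter.value
    await asyncio.sleep(0)  # switch point between the read and the write
    counter.value = current + 1


async def safe_increment(counter: Counter, lock: asyncio.Lock) -> None:
    # The lock keeps the read-modify-write atomic across the switch point.
    async with lock:
        current = counter.value
        await asyncio.sleep(0)
        counter.value = current + 1


async def run_unsafe(n: int = 10) -> int:
    counter = Counter()
    await asyncio.gather(*(unsafe_increment(counter) for _ in range(n)))
    return counter.value


async def run_safe(n: int = 10) -> int:
    counter = Counter()
    lock = asyncio.Lock()
    await asyncio.gather(*(safe_increment(counter, lock) for _ in range(n)))
    return counter.value


if __name__ == "__main__":
    print("without lock:", asyncio.run(run_unsafe()))
    print("with lock:", asyncio.run(run_safe()))
```

Without the lock, every task reads the counter before any task writes it back, so updates are lost. With the lock, each increment completes before the next one starts.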

Regardless, the somewhat greater safety of explicit switching is just not worth the cost.

1 Like

Everything becomes problematic when abused. Switching to an asynchronous environment is quite straightforward and doesn’t require building a complex async call tree. I’m pretty sure that if threading were used everywhere, it would quickly become annoying. I’m not entirely sure what virtual threads are, as the original post isn’t clear on that, but introducing a different approach to concurrency with the possibility of parallelism would be a significant advantage.

It would be nice if we spent more time developing abstractions like concurrent.futures and pointed beginners to them instead of asyncio.

with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]

seems like a really nice abstraction which, in my opinion, is quite superior for simple uses and easier to reason about than involving the whole asyncio infrastructure.
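For reference, a self-contained version of that pattern might look like the sketch below. The load_url here is a hypothetical stand-in that sleeps instead of doing real network IO, and the URLS list and fetch_all helper are made up for illustration.

```python
import concurrent.futures
import time

# Hypothetical inputs for illustration only.
URLS = [
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/c",
]


def load_url(url: str, timeout: float) -> str:
    # Stand-in for a blocking network call; a real version would do IO here.
    time.sleep(0.01)
    return f"contents of {url}"


def fetch_all(urls: list[str]) -> dict[str, object]:
    results: dict[str, object] = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        # Map each future back to the URL it was submitted for.
        future_to_url = {executor.submit(load_url, url, 60): url for url in urls}
        for future in concurrent.futures.as_completed(future_to_url):
            url = future_to_url[future]
            try:
                results[url] = future.result()
            except Exception as exc:  # keep the failure next to its URL
                results[url] = exc
    return results


if __name__ == "__main__":
    for url, data in fetch_all(URLS).items():
        print(url, "->", data)
```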

2 Likes

This kind of example, which looks like it would be entirely IO-bound, is exactly the kind of thing that asyncio does better than threading.

Less than 150 LoC gives you a well-typed wrapper around asyncio similar to thread executors. This particular wrapper runs an event loop in a single thread, and can give back awaitables or concurrent futures.

Assuming a coroutine equivalent of load_url, here’s the equivalent of your example (note: your example seems a little incomplete, you assign url in a loop and never use it…)

with threaded_loop() as bg_loop:  # see link above
    future_to_url = {bg_loop.schedule(load_url(url, 60)): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]

All of the task running happens in that background thread, which means your sync code plays nicely with it (no await necessary from the caller) and the overhead of an event loop in a thread is generally smaller than that of an executor.
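To give a feel for how little is needed, here is a much smaller sketch of the idea. It is not the linked wrapper: threaded_loop and schedule are reimplemented here as hypothetical minimal versions on top of asyncio.run_coroutine_threadsafe, without the typing, cancellation, or shutdown handling a real implementation would want, and load_url is a stand-in coroutine.

```python
import asyncio
import concurrent.futures
import threading
from contextlib import contextmanager


class BackgroundLoop:
    """Hypothetical handle for an event loop running in another thread."""

    def __init__(self, loop: asyncio.AbstractEventLoop) -> None:
        self._loop = loop

    def schedule(self, coro) -> concurrent.futures.Future:
        # Thread-safe submission; returns a concurrent.futures.Future.
        return asyncio.run_coroutine_threadsafe(coro, self._loop)


@contextmanager
def threaded_loop():
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, daemon=True)
    thread.start()
    try:
        yield BackgroundLoop(loop)
    finally:
        # Sketch only: assumes callers have already waited on their futures.
        loop.call_soon_threadsafe(loop.stop)
        thread.join()
        loop.close()


async def load_url(url: str, timeout: float) -> str:
    # Stand-in coroutine; a real one would await network IO.
    await asyncio.sleep(0.01)
    return f"contents of {url}"


def fetch_all_in_background(urls: list[str]) -> dict[str, str]:
    results: dict[str, str] = {}
    with threaded_loop() as bg_loop:
        future_to_url = {bg_loop.schedule(load_url(url, 60)): url for url in urls}
        for future in concurrent.futures.as_completed(future_to_url):
            results[future_to_url[future]] = future.result()
    return results


if __name__ == "__main__":
    print(fetch_all_in_background(["https://example.com/a", "https://example.com/b"]))
```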

Maybe it’s less that we shouldn’t tell people to use asyncio, and more that utilities like those that exist for threading need to be standard in asyncio as well.

4 Likes

By “function colouring between sync/async” I meant ecosystem fracture as well. So we’re complaining about the same thing.

As for what can mend this fracture, this project or a similar one could: gfmio/asyncio-gevent (asyncio & gevent in harmony) on GitHub.

Just to note, I don’t agree that asyncio is fracturing or harmful to the Python ecosystem in any way. Sometimes silence is mistaken for agreement, but that’s not the case here. It’s simply an effort to stay on topic. That said, if there are concerns about asyncio, let’s discuss them in a separate thread.

This also means I don’t share the rationale in the original post. A new approach to concurrency doesn’t need to solve a problem, and presenting it doesn’t require diminishing others’ work.

2 Likes

My original post may have come across as harsh, so allow me to say that I appreciate the motives of the asyncio authors, and I do think there has been a benefit in terms of learning, exploring concepts, and introducing many others to concurrency. While I do think the consequences of introducing function coloring were evident, it wasn’t obvious how significant an effect it would have. I hold that asyncio has fractured the ecosystem, and has been a net negative (so far), but I’ll allow that one possible way to mend the fracture could be a standard event loop.

That said, and to return to the topic at hand, I think this thread is suggesting that we learn from and move past asyncio and alternatives like gevent to create an even better, more integrated option, especially with the prospect of free-threading.

My understanding of virtual threads (as they’re implemented in Java) is that they’re based on continuations (fibers, greenlets, etc…) but provide a traditional threading API and support for structured concurrency. They have no special syntax for context switches, and therefore do not introduce function coloring. Just like gevent, switching happens implicitly on IO or yields (aka sleep). Some call this well-defined, others do not, but it is not explicit.

Sidenote: I found this quote discussing alternatives to virtual threads in JEP 444:

Add syntactic stackless coroutines (i.e., async/await) to the Java language.

It would split the world between APIs designed for threads and APIs designed for coroutines, and would require the new thread-like construct to be introduced into all layers of the platform and its tooling. This would take longer for the ecosystem to adopt, and would not be as elegant and harmonious with the platform as user-mode threads.

So no, virtual threads would not solve the problem of explicit switch points like asyncio does. What they do offer, however, is a lightweight and safer alternative to threads. There is still a trade-off between grokking where the switches happen (I do understand why that makes some people uncomfortable) and locking as you would with threads. Personally, I think the uncertainty/learning curve is a price worth paying for cleaner, more readable codebases. That, and not having to write two versions of everything.
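For anyone unfamiliar with what “two versions of everything” means in practice, here is a small sketch. All function names are hypothetical, and the sleeps stand in for real IO. The point is only that the sync and async paths duplicate the same logic in two colors.

```python
import asyncio
import time


def load_sync(key: str) -> str:
    time.sleep(0.01)  # stand-in for blocking IO
    return key.upper()


def summarize_sync(keys: list[str]) -> str:
    return ", ".join(load_sync(k) for k in keys)


async def load_async(key: str) -> str:
    await asyncio.sleep(0.01)  # stand-in for non-blocking IO
    return key.upper()


async def summarize_async(keys: list[str]) -> str:
    # Same logic as summarize_sync, but it must be async because it awaits,
    # and every caller up the stack must be async as well.
    parts = [await load_async(k) for k in keys]
    return ", ".join(parts)


if __name__ == "__main__":
    print(summarize_sync(["a", "b"]))
    print(asyncio.run(summarize_async(["a", "b"])))
```

With implicit switching (gevent today, or virtual threads as described above), only the first pair of functions would need to exist.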

As @encukou called out above:

We accept quite a lot of uncertainty in our programs all the time. Perhaps the reason some single out switching from the rest is because it’s hard to understand concurrency when you first encounter it. But it’s understanding concurrency that’s hard, not spotting switch points.

I think it’s worth reviewing all of the following for a deeper understanding of where we are with all of this (not allowed to link them):

  • Unyielding (Glyph, 2014)
    • Why we have asyncio
  • What Color is Your Function? (Bob Nystrom, 2015)
    • Why people don’t like asyncio
  • Pull Push: Please stop polluting our imperative languages with pure concepts (Ron Pressler, Curry On, 2015)
    • Fundamental concepts. Highly recommended
  • Notes on structured concurrency, or: Go statement considered harmful (Nathaniel J. Smith, 2018)
    • Structured concurrency makes asyncio (and other concurrency models) a bit better
  • Playground Wisdom: Threads Beat Async/Await (Armin Ronacher, 2024)
  • From Async/Await to Virtual Threads (Armin Ronacher, 2025)
    • Extensions of the discussion in this thread

I’m happy to see that revived! And I hope it continues to get support. The last time I tried to use it, it wasn’t working and looked abandoned.