Is there something like Goroutines in Python 3.13, or maybe 3.14?

I’m referring to A Tour of Go. I’m not saying that we must use the go keyword, but it would be nice to have something that manages the multithreading automatically. What should the approach be?

We have asyncio, but that is cooperative async, while goroutines can actually run in parallel. We also have concurrent.futures to help us do similar things.
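For context, concurrent.futures already gives a “submit now, collect later” shape, just built on OS threads (a minimal sketch; square is my own example):

from concurrent.futures import ThreadPoolExecutor

def square(n):
    return n * n

# Each submit returns a Future immediately; the work runs on pool threads.
with ThreadPoolExecutor() as pool:
    futures = [pool.submit(square, n) for n in range(4)]
    print([f.result() for f in futures])  # [0, 1, 4, 9]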

It’s more likely to be performant once nogil is the norm. Without that, threads are limited by the GIL, and processes have their own overhead.

In the current GIL world, how would it work?

I agree that in a post-nogil Python there should be simple native statements and primitives that make coroutines true first-class citizens of the language, so that their lifecycles and communication can be managed with language constructs instead of clunkier function calls. Good time to get the brainstorming started.

I’m -1 on this idea, for the reasons described in Notes on Structured Concurrency.


What I wanted to say is that it’s easier to write parallel code in Go, and I find that unnatural: Python is a scripting language, while Go is compiled, typed, and closer to the low level. With the death of the GIL, our event loop should run on more than one thread, which would indirectly give us parallel code. We’d write async code, and the event loop would use parallelism to run many futures at the same time, with zero configuration. I think that’s what we want.

This is related to Make Python 3.13 coroutines work with multithreading automatically like Goroutines does and some like `go` keyword for later. · Issue #119061 · python/cpython · GitHub

Python’s asyncio is in a bit of a weird place. There’s a certain amount of language-level support for async functions and the like, but actually using them is distinctly quirky. In JavaScript and Pike, I can do something like this (syntax tweaked to Python’s style):

async def func1():
    await long_process()
    print("Done process 1")

async def func2():
    func1() # no await
    await other_long_process()
    print("Done process 2")

and when you call func2(), it’ll spawn func1() asynchronously. That task automatically runs to completion, but the caller doesn’t wait for it. It’s extremely easy, and very convenient.

But in Python, there are two problems. Firstly, you have to explicitly asyncio.create_task(func1()) instead of just calling it. That’s not a deal-breaker, but it’s annoying. Secondly, and much more importantly, there is this warning in Coroutines and Tasks — Python 3.12.3 documentation: you have to save a reference to the task object. Why do I need to track that? Isn’t it the event loop’s job to do that?
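For reference, the bookkeeping that warning asks for looks roughly like this (a minimal sketch; long_process stands in for any coroutine):

import asyncio

background_tasks = set()

async def long_process():
    await asyncio.sleep(1)

async def func2():
    # Keep a strong reference so the task isn't garbage-collected
    # mid-flight, then discard it once the task finishes.
    task = asyncio.create_task(long_process())
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)
    await asyncio.sleep(2)  # give the background task time to finish

asyncio.run(func2())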

So, yes, it’s easier to write parallel code in a lot of other languages; Python is a bit behind on this. I could fairly easily write a wrapper that calls create_task and adds the tasks to a set exactly as in the docs example, and in fact have done exactly that here, but I can’t make simply calling an async function behave correctly.

Would removing the GIL help with Python’s asynchronicity? Probably (although, at what cost to single-task performance?). Would an easier way to actually write parallel code help? Almost certainly.


I think your observations are accurate. If an async function is called without being awaited, it should return the coroutine object, which could then be handed to another function that attaches conditions for cancellation: if the condition is met, the task is cancelled; otherwise it runs to completion. Technically we would collect all those coroutines in a list and wait for them at the end of the function, though sometimes we’d rather the function return its result before those coroutines finish. I know I can cancel a task today, but I think the API could be better with the death of the GIL.
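Roughly the shape I have in mind, written by hand with today’s API (just a sketch, not a proposed syntax; worker and should_cancel are stand-ins):

import asyncio

async def worker(i):
    await asyncio.sleep(i)
    return i

async def main():
    # Collect the spawned tasks instead of awaiting each call site.
    tasks = [asyncio.create_task(worker(i)) for i in range(3)]
    should_cancel = False  # stand-in for whatever the cancellation condition is
    if should_cancel:
        for t in tasks:
            t.cancel()
    # Wait for them all at the end; cancelled ones surface as exceptions.
    print(await asyncio.gather(*tasks, return_exceptions=True))

asyncio.run(main())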

Asynchrony in Python is not related to the GIL. Even without the GIL, asynchronous code cannot run in parallel. The main point is that the flow of asynchronous code can only be switched at particular points (await, async for, async with). Code that does not contain such points is considered atomic. Changing this would break all asynchronous code written before now. It would also eliminate the main reason for using asynchronous code instead of threads.
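To illustrate that atomicity guarantee (a minimal sketch): the block below is safe without a lock only because nothing can switch between the read and the write, and preemptive coroutines would break it:

import asyncio

counter = 0

async def bump():
    global counter
    # No await between the read and the write, so no other coroutine
    # can run in between: this block is atomic under cooperative scheduling.
    value = counter
    counter = value + 1

async def main():
    await asyncio.gather(*(bump() for _ in range(1000)))
    print(counter)  # always 1000 on a single-threaded event loop

asyncio.run(main())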

go foo(a, b) is equivalent to Thread(target=foo, args=(a, b)).start(), except that Go can use green threads. Supporting green threads is difficult while Python is implemented in C, and this is not related to the GIL.
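Spelled out (a minimal sketch):

from threading import Thread

def foo(a, b):
    print(a + b)

# Rough Python spelling of Go's `go foo(1, 2)`: start foo on a new
# OS thread and don't wait for it.
Thread(target=foo, args=(1, 2)).start()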


I don’t know. I understand that when we call an async function, the execution and the result are unitary, but the event loop should use multiple cores, and the only solution I can think of is for the event loop itself to use parallelism: it receives the input and returns a future that may or may not be consumed. I don’t know which concurrency model would be best for it, but if the event loop could just use a thread (or whatever you decide to use) per task, I wouldn’t have to structure my workers around an event loop to use my CPU completely. It feels strange: if I have 1000 futures, I’d like my CPU to be fully used, every CPU thread busy. If 1000 clients are waiting to be served and there are 20 workers but only one is working, we’d feel like something is wrong there.

That’s what threads are for. That’s not what async functions are for.
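Concretely, getting blocking work off the event loop today means handing it to threads explicitly, e.g. with asyncio.to_thread (a minimal sketch; blocking_work is a stand-in):

import asyncio
import time

def blocking_work(n):
    time.sleep(1)  # stand-in for real blocking work
    return n

async def main():
    # Each call runs on a worker thread from the default executor,
    # so the event loop stays free while they run.
    coros = (asyncio.to_thread(blocking_work, n) for n in range(4))
    print(await asyncio.gather(*coros))

asyncio.run(main())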

I’m pretty confused. In Node, when I call n async functions, there are n + 1 things happening at the same time until the main function waits for them. That’s called concurrency: more than one line of execution is active in the code. You could even put prints in each of the n functions and they would appear in an unpredictable order.

I could implement an async flow using threads in Node; the main reason to use async functions instead of threads there is that it’s considerably easier. Technically I should await them.

Concurrency is not the same thing as parallelism. Concurrency just means the execution of multiple coroutines/fibers is interleaved: when one of them waits on an I/O operation, it yields execution back to the event loop so another one can run while it waits, but they aren’t executed at the same time.
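To make the distinction concrete (a minimal sketch; the names are mine):

import asyncio
import time

async def io_task(name):
    await asyncio.sleep(1)  # yields to the event loop while "waiting"
    print(name, "done")

async def cpu_task():
    time.sleep(1)  # never yields, so it blocks every other coroutine
    print("cpu done")

async def main():
    # The two io_tasks interleave (concurrency), but cpu_task holds
    # the loop for a full second: nothing runs in parallel with it.
    await asyncio.gather(io_task("a"), io_task("b"), cpu_task())

asyncio.run(main())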

You can combine coroutines with threads to combine concurrency with parallelism, although with the GIL there is currently no benefit to that, since threads would yield at the same time as a coroutine would[1], so there’s very little parallelism outside of C-extensions or multiprocessing.


  1. Since async routines are non-blocking, a thread would actually never yield with fully async code; it would only be periodically suspended by the scheduler. So all you’re doing is adding scheduling overhead for no benefit, as long as the GIL is in effect.

I don’t believe this is accurate for JavaScript. The func1() call doesn’t create a task at all and will run synchronously up to and including the first await statement. What comes after the await is executed in a task, for each await, recursively. In essence, what comes after the await is a function supplied to Promise.then, executed as a microtask. This is a fundamentally different model from Python’s, with tasks only being created on demand and scheduled for execution on the one true event loop, which is always running. I do quite like how JS seemingly suffers less from the “coloured function” problem, but I also find this model more difficult to grok and less amenable to structured concurrency: task cancellation is basically unsolved in JS.


Try it out 🙂 I’ve used this feature a lot in JS. The call to func1() will return a promise, but func2 won’t await that promise, and both will happen concurrently. This is the same as what happens in Python if you call create_task and then stash the task into a list or something; otherwise, it’ll get dropped on the floor unceremoniously.

Suggestion: New function in asyncio or a method on the event loop:

import asyncio

_tasks = []

def spawn(awaitable):
    _tasks.append(asyncio.create_task(awaitable))

plus some error handling and disposal. Then you can asyncio.spawn(func1()) and it’ll correctly run it in “fire and forget” mode.
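The “error handling and disposal” part might look something like this (one possible sketch; spawn is not an existing asyncio API):

import asyncio

_tasks = set()

def spawn(awaitable):
    """Fire-and-forget: schedule *awaitable* without the caller awaiting it."""
    task = asyncio.create_task(awaitable)
    _tasks.add(task)

    def _done(t):
        _tasks.discard(t)  # disposal: drop our reference once it finishes
        if not t.cancelled() and t.exception() is not None:
            print(f"spawned task failed: {t.exception()!r}")  # error handling
    task.add_done_callback(_done)
    return task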


This is true.

But this is not quite accurate.

Try running these in JS and Python, respectively:

async function foo() {
  console.log("1");
}

async function bar() {
  foo();
  console.log("2");
}

bar();

and in Python:

import asyncio


async def foo():
    print('1')


async def bar():
    asyncio.create_task(foo())
    print('2')


asyncio.run(bar())

Notice that “1” is printed before “2” in JS. That’s because foo() will in fact run to completion before “yielding” to bar; it is not scheduled at all. In JS, scheduling is left entirely up to the Promise executor. In Python, the task will be scheduled immediately, and foo() will run only when bar() yields.

Yes, a Promise cannot be directly cancelled.

Note that async/await is just syntactic sugar for Promise. If you call an async function without awaiting it, it returns the Promise itself immediately, without waiting for it to settle.

Example:

async function fetchData() {
  const response = await fetch('https://www.google.com');
  const data = await response.text();
  console.log(data);
}

async function callingFetch() {
  fetchData(); // Call fetchData without await
  console.log("Fetching data...");
}

callingFetch();

Fetching data... is printed before HTML text.

Yes, that’s because fetch schedules a task. The order of execution in your example above is:

  1. callingFetch()
  2. fetchData()
  3. fetch('https://www.google.com')
  4. console.log("Fetching data...")

In Python, using asyncio.create_task, 3 and 4 would have been reversed.

As Rosuav mentioned, you can achieve the same order of execution if you ‘forget’ about the task:

import asyncio


async def task1():
    print("Task 1 started")
    await asyncio.sleep(1)
    print("Task 1 completed")


async def task2():
    print("Task 2 started")
    await asyncio.sleep(2)
    print("Task 2 completed")


async def main():
    print("Main started")
    # Create tasks concurrently using create_task
    task1_instance = asyncio.create_task(task1())
    task2_instance = asyncio.create_task(task2())
    
    # Wait for all tasks to complete
    await task1_instance
    # await task2_instance
    print("Main completed")
    
    await asyncio.sleep(3)


asyncio.run(main())

…prints:

Main started
Task 1 started
Task 2 started
Task 1 completed
Main completed
Task 2 completed

I’m really not sure what this is meant to show. You are yielding control back to the loop with asyncio.sleep(1), which allows task 2 to start, and then again with asyncio.sleep(3), which allows task 2 to finish. How is this artificial example relevant to what I’m saying?

In Python, it is very easy to follow async code. It is the same as if it were synchronous. Using async/await makes it non-blocking.

The analogy: “Main completed” corresponds to “Fetching data…”, and “Task 2 completed” corresponds to the HTML text.