Integrate threads into async code

Hello,

I am looking for some guidance on how to properly interact with code that runs in threads from my asyncio application. The thread produces data for processing (thread → asyncio), and occasionally I also want to send data to the thread.
I searched the web and the documentation but found nothing satisfying.

So I tried out different things and benched them.
First pure asyncio performance - in my case this is 1800 msgs/sec. It can’t get any faster than this.
Then interaction with a(nother) thread pool, raw performance is 31k msgs/sec

  • janus
    library which seems to exist exactly for this use case.
    280 / 320 msgs/sec
  • coro
    I created a coroutine that is called from the thread with run_coroutine_threadsafe
    520 / 700 msgs/sec

I really thought a queue would be the appropriate solution here because the thread(s) can feed data into it and it acts as a buffer. But it results in an 85% performance penalty, which is very disappointing.

Directly calling the coro from thread is much better, but still comes with a hefty performance penalty.
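
For readers following along, here is a minimal sketch of the "coro" variant described above, using only the standard library. The names handle, producer and received are hypothetical, and the blocking future.result() call is an assumption about how backpressure might be applied; it is not the code that was benchmarked.

```python
import asyncio
import threading


async def handle(msg, received):
    # Runs on the event loop thread; 'received' is a hypothetical sink.
    received.append(msg)


def producer(loop, received, count):
    # Runs in a worker thread and submits one coroutine per message.
    for i in range(count):
        future = asyncio.run_coroutine_threadsafe(handle(i, received), loop)
        # Wait until the loop has processed the message. Dropping this
        # call removes the backpressure and the per-message round trip.
        future.result()


async def main(count=100):
    received = []
    loop = asyncio.get_running_loop()
    thread = threading.Thread(target=producer, args=(loop, received, count))
    thread.start()
    # Join the producer in the default executor so the loop stays free
    # to run the submitted coroutines.
    await loop.run_in_executor(None, thread.join)
    return received


if __name__ == "__main__":
    print(len(asyncio.run(main())))
```

Each message costs one cross-thread wakeup of the event loop, so batching several messages per call is one way the overhead might be reduced.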

So I am wondering: what is the recommended way to pass data between threads and asyncio code?
Thank you for your help!

Additionally I’ve observed that logging is a real bottleneck (disk I/O) so I want to move it into a separate thread.
What would be a good way to pass data from asyncio → thread?
Do I just use a queue.Queue with put_nowait or is there some better solution?
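
A minimal sketch of what the queue.Queue approach could look like for logging, assuming the standard library's logging.handlers.QueueHandler and QueueListener are acceptable: QueueHandler enqueues records with put_nowait, and QueueListener drains the queue in its own thread. The logger name "app", the helper names, and the in-memory ListHandler are hypothetical; a real setup would pass a FileHandler or similar instead.

```python
import asyncio
import logging
import logging.handlers
import queue


def setup_queue_logging(target_handler):
    # Unbounded queue, so put_nowait never raises queue.Full.
    log_queue = queue.Queue()
    # The listener thread performs the slow I/O via target_handler.
    listener = logging.handlers.QueueListener(log_queue, target_handler)
    listener.start()
    logger = logging.getLogger("app")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    logger.propagate = False
    return logger, listener


async def work(logger, count):
    for i in range(count):
        # Only enqueues the record; the event loop is not blocked on disk.
        logger.info("message %d", i)
        await asyncio.sleep(0)


def run_demo(count=5):
    records = []

    class ListHandler(logging.Handler):
        # Stand-in for a file handler; collects messages in memory.
        def emit(self, record):
            records.append(record.getMessage())

    logger, listener = setup_queue_logging(ListHandler())
    try:
        asyncio.run(work(logger, count))
    finally:
        listener.stop()  # flushes remaining records and joins the thread
        logger.handlers.clear()
    return records


if __name__ == "__main__":
    print(run_demo())
```

The asyncio side never awaits anything for logging here, which fits the case where data flows to the thread only and no result is needed back.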

Maybe you are also hitting the GIL with the thread pool that generates the data, on top of the janus overhead.
janus can be used in both directions: thread → asyncio, asyncio → thread.
I also faced the same logging issue and ended up writing a parent-children process (not thread!) architecture where child processes send log records via a ZMQ socket and the parent process runs the actual stderr/file write handlers.
