Expanding asyncio support for socket APIs

The asyncio event loop provides coroutine-based versions of socket.recv and other socket methods, but some socket.socket APIs, such as sendmsg and recvmsg, are missing. This enhancement would add:

  • loop.sock_sendmsg - async analogue of socket.sendmsg
  • loop.sock_recvmsg - async analogue of socket.recvmsg
  • loop.sock_send_fds - async analogue of socket.send_fds
  • loop.sock_recv_fds - async analogue of socket.recv_fds
  • more, depending on desire for more complete compatibility with the socket.socket API set.

Some socket APIs, such as sendmsg and send_fds, provide features that socket.send does not, for example sending ancillary data or sharing file descriptors between processes.
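For readers unfamiliar with descriptor passing, here is a minimal blocking sketch of what socket.send_fds and socket.recv_fds (Python 3.9+, Unix only) do today. It assumes a Unix platform; the function name pass_fd_demo is made up for illustration, and both ends live in one process only to keep the example short. The same calls work across processes.

```python
import os
import socket


def pass_fd_demo() -> bytes:
    """Send the read end of a pipe over a Unix socket pair, then read via the received copy."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()
    try:
        # send_fds wraps sendmsg() and attaches the descriptors as SCM_RIGHTS ancillary data.
        socket.send_fds(sender, [b"fd attached"], [read_end])

        # recv_fds wraps recvmsg() and unpacks up to maxfds descriptors.
        msg, fds, msg_flags, address = socket.recv_fds(receiver, 1024, 1)
        received_fd = fds[0]  # a new descriptor referring to the same pipe
        try:
            os.write(write_end, b"hello")
            return os.read(received_fd, 5)
        finally:
            os.close(received_fd)
    finally:
        os.close(read_end)
        os.close(write_end)
        sender.close()
        receiver.close()
```

Neither call has an awaitable counterpart on the event loop, which is the gap this proposal is about.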

From my understanding, we could implement the proposed loop.sock_[sendmsg/recvmsg/etc] methods with an approach similar to how asyncio's sock_sendall works. Once we're happy with the set of APIs, I'd be happy to start implementing this work.

Do you have real-world code that would benefit from these new APIs? How much? We've learned that new APIs are a maintenance burden even if nobody uses them.

My use case would be a client of something like sockpool. Sockpool is a server that pools sockets, with clients using sendmsg/recvmsg (in C terms) or the Python convenience wrappers send_fds/recv_fds to acquire and donate socket connections. This is the only specific real-world code that I have (and, thus far, I've written my own wrappers around the non-blocking socket.socket APIs that integrate with the event loop).

Let's see if others have the same use case. If it's just you, I'd rather not be responsible for such a new feature.

I went looking for these today, and web search turned up this discussion.
I'm writing a network server that load-balances connected clients onto per-core sub-agents, farming them out using control messages over Unix sockets after accepting the connection in the main process.

What I really wanted was loop.sock_recv_fds_into(), so I can manage all the buffers myself (to minimize buffer allocations) and get the ease of using recv_fds rather than wrangling the ancillary data out of recvmsg myself.

So, to be clear, I'd like to see:

  • loop.sock_recvmsg() (ok, will work with some wrangling)
  • loop.sock_recvmsg_into() (better performance, ancillary data still needs wrangling)
  • loop.sock_recv_fds_into() (best performance; least effort: great!)

Without at least loop.sock_recvmsg, it's not possible to use ancillary data (which requires the recvmsg(2) syscall on Unix) at all via asyncio.
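To make the "wrangling" concrete, this is roughly what extracting file descriptors from a plain blocking socket.recvmsg call involves. It is a sketch modelled on the pattern shown in the socket module documentation; the helper name recv_fds_manually is made up for illustration, and it assumes a Unix platform.

```python
import array
import socket


def recv_fds_manually(sock, msglen, maxfds):
    """Receive a message and any descriptors passed as SCM_RIGHTS ancillary data."""
    fds = array.array("i")  # descriptors arrive as an array of native C ints
    msg, ancdata, msg_flags, address = sock.recvmsg(
        msglen, socket.CMSG_LEN(maxfds * fds.itemsize)
    )
    for cmsg_level, cmsg_type, cmsg_data in ancdata:
        if cmsg_level == socket.SOL_SOCKET and cmsg_type == socket.SCM_RIGHTS:
            # Drop any trailing partial int in case the control buffer was truncated.
            usable = len(cmsg_data) - (len(cmsg_data) % fds.itemsize)
            fds.frombytes(cmsg_data[:usable])
    return msg, list(fds)
```

A loop.sock_recvmsg would return the same (data, ancdata, msg_flags, address) tuple, so this parsing step would still be the caller's job; only a recv_fds-style variant would hide it.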

Put me down as one more person surprised to find out that there is no recvmsg() variant for asyncio. I don't have strong preferences for the exact flavour of the API, but my 2¢:

  • unsurprisingly, I intend to use this for receiving file descriptors
  • I'd actually lean somewhat against the "into" variant: it's more work, and I don't care about performance that much. I'd honestly be pretty surprised to find out that someone needs to transfer file descriptors over Unix sockets alongside such large amounts of data, and at such a rate, that an into variant would make a difference here.

So for me, an equivalent of either socket.recv_fds() or socket.recvmsg() (or both) would do.

Have you looked into doing this using add_reader? It would be limited to UNIX-ish systems (add_reader isn't supported by the Proactor event loop), but since you're talking about receiving (and sending?) file descriptors, I have a feeling that would work just fine for your use case.

Maybe there could be a small 3rd party package that provides this support based on add_reader?

It's sort of awkward to do this with add_reader() when you want to do it from an async function and block the flow of execution there. You just gave me another idea for an interesting pair of functions that might allow for implementing things like this while side-stepping the general need for 100 different variants:

  • loop.wait_readable(file)
  • loop.wait_writable(file)

Then you could wait for a particular file-like object to reach a particular state, then perform your operation using the normal socket code.

Sorry for the slow reply. You're right that working with add_reader() from an async function is slightly awkward. I think it would look roughly like this (untested):

loop = asyncio.get_event_loop()
fut = loop.create_future()
loop.add_reader(fd, fut.set_result, None)
await fut

You're right that having this wrapped in a helper would be useful, so you could write loop.wait_readable(fd). Of course, the above snippet is easily made into a helper that you could call as await wait_readable(fd). Like this:

def wait_readable(fd):
    loop = asyncio.get_event_loop()
    fut = loop.create_future()
    loop.add_reader(fd, fut.set_result, None)
    return fut

Does that warrant being made into an event loop method? I don't know. It's still pretty uncommon to need this low-level code, but if I had to choose between this and your initial proposal, this one is definitely more general and elegant.

If you really wanted to put your weight on the scale, you could send a PR with code like this plus some unit tests; it's always easier to accept a proposal if the code already exists. This doesn't guarantee acceptance, as the authors of elaborate reference implementations of PEPs that were ultimately rejected can tell you. Not that I'd make you write a PEP for such a simple API addition. But I think you should have to work a bit harder before it's even considered.

My examples didn't call remove_reader(fd), thereby proving that it's more complicated than I imagined. That makes me like the proposed feature more: it encapsulates non-trivial logic!
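For illustration, a sketch of what such a helper might look like once the remove_reader(fd) bookkeeping is included. This is not an existing asyncio API: wait_readable is the hypothetical name floated in this thread, and the sketch assumes a selector-based event loop (add_reader is unavailable on the Proactor loop) and that nothing else registers a reader for the same descriptor.

```python
import asyncio


def wait_readable(fd):
    """Return a future that completes once fd is readable (hypothetical helper)."""
    loop = asyncio.get_running_loop()
    fut = loop.create_future()

    def on_readable():
        # Unregister first so the callback does not fire again while the fd stays readable.
        loop.remove_reader(fd)
        if not fut.done():
            fut.set_result(None)

    def on_done(f):
        # If the waiter was cancelled, the reader is still registered; clean it up.
        if f.cancelled():
            loop.remove_reader(fd)

    loop.add_reader(fd, on_readable)
    fut.add_done_callback(on_done)
    return fut
```

From a coroutine this would be used as await wait_readable(sock.fileno()), followed by the ordinary call such as sock.recvmsg(...) or socket.recv_fds(sock, ...). Because readiness is only a hint, a robust caller would keep the socket non-blocking and retry on BlockingIOError.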

We would also benefit from an async implementation of loop.sock_recvmsg. In our application we need some additional data from the socket, like the sender's IP address, the receiver's IP address, and the traffic class.

We implemented an async wrapper for the socket.recvmsg function. There we get the payload and the 'from address' directly from the synchronous socket API. The other information is retrieved from the ancillary data.
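The wrapper itself is not shown in this thread, so the following is only a guess at the general shape such a wrapper could take, not the poster's code. It tries the non-blocking call first and falls back to add_reader when the socket is not ready. The name sock_recvmsg is a stand-in for the proposed loop method; a selector-based event loop and a non-blocking socket are assumed.

```python
import asyncio
import socket


async def sock_recvmsg(sock, bufsize, ancbufsize=0, flags=0):
    """Awaitable stand-in for socket.recvmsg; sock must be non-blocking."""
    loop = asyncio.get_running_loop()
    fd = sock.fileno()
    while True:
        try:
            # Returns (data, ancdata, msg_flags, address), as socket.recvmsg does.
            return sock.recvmsg(bufsize, ancbufsize, flags)
        except (BlockingIOError, InterruptedError):
            pass  # nothing to read yet; wait for readiness below

        fut = loop.create_future()

        def on_readable(fut=fut):
            if not fut.done():
                fut.set_result(None)

        loop.add_reader(fd, on_readable)
        try:
            await fut
        finally:
            # Always unregister, including when the awaiting task is cancelled.
            loop.remove_reader(fd)
```

Details such as the receiver's address or the traffic class would then be read from the returned ancdata list, provided the corresponding socket options were enabled beforehand and ancbufsize is large enough to hold them.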

It would be nice to have this functionality directly in the Python standard library.

Okay, I'll bite. Do you want to help out by sending a PR?

I created a draft PR here:

Only for loop.sock_recvmsg and loop.sock_sendmsg though (as that's what we need in our application).

Currently, it has only basic tests and has only been tried on Linux. The documentation is also not very good yet.

But I hope it is at least a good starting point for future work.

If anyone else would like to see this, please help by reviewing or otherwise contributing to the PR that @LukasWoodtli linked above. (Note: approving a PR without giving feedback is not helping.)
