Expanding asyncio support for socket APIs

The asyncio event loop provides coroutine-based versions of socket methods such as socket.recv, but some socket.socket APIs, like sendmsg and recvmsg, are missing. This enhancement would add:

  • loop.sock_sendmsg - async analogue of socket.sendmsg
  • loop.sock_recvmsg - async analogue of socket.recvmsg
  • loop.sock_send_fds - async analogue of socket.send_fds
  • loop.sock_recv_fds - async analogue of socket.recv_fds
  • possibly more, depending on how complete we want compatibility with the socket.socket API set to be.

Some socket APIs, like sendmsg and send_fds, provide features that socket.send does not, for example sending ancillary data or sharing file descriptors between processes.
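To make that concrete, here is a minimal blocking sketch (Unix only, Python 3.9+) of what socket.send_fds / socket.recv_fds do today; the proposed loop.sock_send_fds / loop.sock_recv_fds would be the awaitable analogues. The socketpair and pipe here are just illustrative plumbing:

```python
import os
import socket

# Blocking demonstration of descriptor passing with send_fds/recv_fds.
# The proposed loop.sock_send_fds / loop.sock_recv_fds would await the
# same operations instead of blocking.
a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
r, w = os.pipe()  # an arbitrary descriptor to donate

socket.send_fds(a, [b"fd incoming"], [r])
msg, fds, flags, addr = socket.recv_fds(b, 1024, 1)

# The received descriptor is a live duplicate of the pipe's read end.
os.write(w, b"hello")
data = os.read(fds[0], 5)

for fd in (r, w, *fds):
    os.close(fd)
a.close()
b.close()
```

Nothing here can be awaited, which is exactly the gap: a coroutine can't yield to the event loop while waiting for the control message to arrive.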

From my understanding, we could implement these proposed loop.sock_[sendmsg/recvmsg/etc] methods with an approach similar to how asyncio’s sock_sendall works. Once we’re happy with the set of APIs, I’d be glad to start the implementation.
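For what it’s worth, a rough sketch of that approach (my guess at the pattern, not actual asyncio internals): attempt the non-blocking sendmsg, and when it would block, park on add_writer until the socket is writable. The sock_sendmsg name and signature here are hypothetical.

```python
import asyncio
import socket

async def sock_sendmsg(sock, buffers, ancdata=(), flags=0):
    """Hypothetical awaitable sendmsg for a non-blocking socket.

    Mirrors the retry pattern of the selector event loop's sock_sendall:
    try the call, and on BlockingIOError wait for writability via a
    future registered with add_writer.
    """
    loop = asyncio.get_running_loop()
    while True:
        try:
            return sock.sendmsg(buffers, ancdata, flags)
        except (BlockingIOError, InterruptedError):
            pass
        fut = loop.create_future()
        fd = sock.fileno()
        loop.add_writer(fd, fut.set_result, None)
        try:
            await fut
        finally:
            # Always unregister, even if the wait was cancelled.
            loop.remove_writer(fd)
```

A real implementation would live on the event loop and handle the proactor case separately; this is only the selector-loop shape of the idea.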


Do you have real-world code that would benefit from these new APIs? How much? New APIs are a maintenance burden even if nobody uses them, we’ve learned.

My use case would be a client of something like sockpool. Sockpool is a server that pools sockets, with clients using sendmsg/recvmsg (in C terms) or the Python convenience wrappers send_fds/recv_fds to acquire and donate socket connections. This is the only specific real-world code that I have (and, thus far, I’ve wrapped my own versions of non-blocking socket.socket APIs that integrate with the event loop).

Let’s see if others have the same use case. If it’s just you, I’d rather not be responsible for such a new feature.

I went looking for these today, and web search turned up this discussion.
I’m writing a network server that load-balances connected clients onto a per-core sub-agent, farming them out using control messages over Unix sockets after accepting the connection in the main process.

What I really wanted was loop.sock_recv_fds_into(), so I can manage all the buffers myself (to minimize buffer allocations) and get the ease of recv_fds rather than wrangling the ancillary data out of recvmsg myself.

So, to be clear, I’d like to see:

  • loop.sock_recvmsg() (ok, will work with some wrangling)
  • loop.sock_recvmsg_into() (better performance, ancillary data still needs wrangling)
  • loop.sock_recv_fds_into() (best performance; least effort: great!)

Without at least loop.sock_recvmsg, it’s not possible to use ancillary data (which requires the recvmsg(2) syscall on Unix) at all via asyncio.
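For reference, the “wrangling” in question looks roughly like this with the blocking recvmsg API (Unix only); recv_fds hides exactly this SCM_RIGHTS parsing:

```python
import array
import os
import socket

a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
r, w = os.pipe()  # descriptor to pass across

# Sender side: one SCM_RIGHTS control message carrying the fd.
a.sendmsg([b"x"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS,
                    array.array("i", [r]))])

# Receiver side: size the ancillary buffer for one int-sized fd,
# then dig the descriptors back out of the raw cmsg bytes.
fd_size = array.array("i").itemsize
msg, ancdata, flags, addr = b.recvmsg(1, socket.CMSG_LEN(fd_size))

fds = array.array("i")
for level, ctype, cdata in ancdata:
    if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
        # Truncate any partial trailing int before decoding.
        fds.frombytes(cdata[:len(cdata) - (len(cdata) % fd_size)])
```

An async recvmsg makes this possible from a coroutine at all; recv_fds-style helpers would spare callers the cmsg decoding on top of that.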

Put me down as one more person surprised to find out that there is no recvmsg() variant for asyncio. I don’t have strong preferences for the exact flavour of the API, but my 2¢:

  • unsurprisingly, I intend to use this for receiving file descriptors
  • I’d actually lean somewhat against the “into” variant — it’s more work, and I don’t care about performance that much. I’d honestly be pretty surprised to find out that someone needs to do transfers of file descriptors over unix sockets alongside such large amounts of data, and at such a rate that it would make a difference to have an into variant here.

So for me, either (or both) of the equivalent of socket.recv_fds() or socket.recvmsg().

Have you looked into doing this using add_reader? It would be limited to UNIX-ish systems (add_reader isn’t supported by the proactor event loop) but since you’re talking about receiving (and sending?) file descriptors I have a feeling that would work just fine for your use case.

Maybe there could be a small 3rd party package that provides this support based on add_reader?

It’s sort of awkward to do this with add_reader() when you want to do it from an async function, and block the flow of execution there. You just gave me another idea for an interesting pair of functions that might allow for implementing things like this while side-stepping the general need for 100 different variants:

  • loop.wait_readable(file)
  • loop.wait_writable(file)

Then you could wait for a particular filelike object to reach a particular state, then perform your operation using the normal socket code.


Sorry for the slow reply. You’re right that working with add_reader() from an async function is slightly awkward – I think it would look roughly like this (untested):

import asyncio

loop = asyncio.get_event_loop()
fut = loop.create_future()
loop.add_reader(fd, fut.set_result, None)
await fut

Having this wrapped in a helper would indeed be useful, so you could write loop.wait_readable(fd). Of course, the above snippet is easily made into a helper that you could call as await wait_readable(fd). Like this:

def wait_readable(fd):
    loop = asyncio.get_event_loop()
    fut = loop.create_future()
    loop.add_reader(fd, fut.set_result, None)
    return fut

Does that warrant being made into an event loop method? I don’t know – it’s still pretty uncommon to need this low-level code, but if I had to choose between this and your initial proposal, this one is definitely more general and elegant.

If you really wanted to put your weight on the scale you could send a PR with code like this plus some unit tests – it’s always easier to accept a proposal if the code already exists. This doesn’t guarantee acceptance for sure, as the authors of elaborate reference implementations of PEPs that were ultimately rejected can tell you. Not that I’d make you write a PEP for such a simple API addition. But I think you should have to work a bit harder before it’s even considered.

My examples didn’t call remove_reader(fd), thereby proving that it’s more complicated than I imagined. That makes me like the proposed feature more – it encapsulates non-trivial logic!
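For completeness, here is one way the helper might look with that cleanup added (still a sketch, now written as a coroutine so the unregistration can happen after the wait completes):

```python
import asyncio

async def wait_readable(fd):
    """Sketch of the helper with the missing remove_reader added."""
    loop = asyncio.get_running_loop()
    fut = loop.create_future()
    loop.add_reader(fd, fut.set_result, None)
    try:
        await fut
    finally:
        # Unregister even if the wait is cancelled, so the callback
        # doesn't later fire against a completed or abandoned future.
        loop.remove_reader(fd)
```

The try/finally is the non-trivial part: without it, a cancelled waiter leaves a stale reader callback registered on the loop.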