Expanding asyncio support for socket APIs

ccotter · September 20, 2022, 4:23am

The asyncio event loop provides coroutine-based versions of socket.recv and several other socket methods, but some socket.socket APIs, like sendmsg and recvmsg, are missing. This enhancement would add:

loop.sock_sendmsg - async analogue of socket.sendmsg
loop.sock_recvmsg - async analogue of socket.recvmsg
loop.sock_send_fds - async analogue of socket.send_fds
loop.sock_recv_fds - async analogue of socket.recv_fds
more, depending on desire for more complete compatibility with the socket.socket API set.

Some socket APIs like sendmsg and send_fds provide features that socket.send does not, for example sending ancillary data or sharing file descriptors between processes.
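For readers unfamiliar with descriptor passing, here is a minimal synchronous sketch of what socket.send_fds and socket.recv_fds do (Python 3.9+, Unix only). The function name share_pipe_over_unix_socket is made up for illustration, and socketpair() merely stands in for two processes connected by a Unix socket.

```python
import os
import socket


def share_pipe_over_unix_socket() -> bytes:
    """Pass a pipe's write end across a Unix socket pair and use the copy."""
    # Hypothetical demo: both ends live in one process for simplicity.
    donor, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()
    try:
        # send_fds attaches the descriptors as SCM_RIGHTS ancillary data,
        # alongside a small ordinary payload.
        socket.send_fds(donor, [b"x"], [write_end])

        # recv_fds returns (data, fds, msg_flags, address).
        _payload, fds, _flags, _addr = socket.recv_fds(receiver, 1024, 1)
        received_fd = fds[0]
        try:
            # The received descriptor refers to the same pipe.
            os.write(received_fd, b"hello")
        finally:
            os.close(received_fd)
        return os.read(read_end, 5)
    finally:
        os.close(read_end)
        os.close(write_end)
        donor.close()
        receiver.close()
```

Both calls block, which is why they cannot be used as-is from a coroutine without some integration with the event loop.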

From my understanding, we could implement these proposed loop.sock_[sendmsg/recvmsg/etc] with a similar approach to how asyncio’s sock_send works. Once we’re happy with the set of APIs, I’d be happy to start implementation of this work.

guido · September 20, 2022, 4:41am

Do you have real-world code that would benefit from these new APIs? How much? New APIs are a maintenance burden even if nobody uses them, we’ve learned.

ccotter · September 20, 2022, 5:03am

My use case would be a client of something like sockpool. Sockpool is a server that pools sockets, with clients using sendmsg/recvmsg (in C terms) or the Python convenience wrappers send_fds/recv_fds to acquire and donate socket connections. This is the only specific real-world code that I have (and, thus far, I’ve wrapped my own versions of non-blocking socket.socket APIs that integrate with the event loop).

guido · September 20, 2022, 9:57pm

Let’s see if others have the same use case. If it’s just you, I’d rather not be responsible for such a new feature.

_david · October 10, 2022, 10:33pm

I went looking for these today, and web search turned up this discussion.
I’m writing a network server that load-balances connected clients onto a per-core sub-agent, farming them out using control messages over Unix sockets after accepting the connection in the main process.

What I really wanted was loop.sock_recv_fds_into(), so I can manage all the buffers myself (to minimize buffer allocations) and get the ease of recv_fds rather than wrangling the ancillary data out of recvmsg myself.

So, to be clear, I’d like to see:

loop.sock_recvmsg() (ok, will work with some wrangling)
loop.sock_recvmsg_into() (better performance, ancillary data still needs wrangling)
loop.sock_recv_fds_into() (best performance; least effort: great!)

Without at least loop.sock_recvmsg, it’s not possible to use ancillary data (which requires the recvmsg(2) syscall on Unix) at all via asyncio.
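To make the "wrangling" concrete, here is a rough synchronous sketch of extracting file descriptors from recvmsg ancillary data by hand, following the pattern shown in the socket module documentation. The helper names send_fds_manually, recv_fds_manually and demo are hypothetical, and the code assumes a Unix platform.

```python
import array
import os
import socket


def send_fds_manually(sock, payload, fds):
    """Send payload plus descriptors as one SCM_RIGHTS control message."""
    ancillary = [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", fds))]
    return sock.sendmsg([payload], ancillary)


def recv_fds_manually(sock, bufsize, maxfds):
    """Receive payload and pick the descriptors out of the ancillary data."""
    fds = array.array("i")
    ancbufsize = socket.CMSG_LEN(maxfds * fds.itemsize)
    payload, ancdata, _flags, _addr = sock.recvmsg(bufsize, ancbufsize)
    for level, kind, data in ancdata:
        if level == socket.SOL_SOCKET and kind == socket.SCM_RIGHTS:
            # Ignore a truncated trailing descriptor, if any.
            usable = len(data) - (len(data) % fds.itemsize)
            fds.frombytes(data[:usable])
    return payload, list(fds)


def demo():
    """Hypothetical round trip over a socket pair within one process."""
    left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    read_end, write_end = os.pipe()
    try:
        send_fds_manually(left, b"fd attached", [write_end])
        payload, fds = recv_fds_manually(right, 1024, 1)
        for fd in fds:
            os.close(fd)
        return payload, len(fds)
    finally:
        os.close(read_end)
        os.close(write_end)
        left.close()
        right.close()
```

An async recv_fds-style helper would hide the loop over ancdata, which is the convenience being asked for here.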

lis · December 11, 2022, 9:24pm

Put me down as one more person surprised to find out that there is no recvmsg() variant for asyncio. I don’t have strong preferences for the exact flavour of the API, but my 2¢:

unsurprisingly, I intend to use this for receiving file descriptors
I’d actually lean somewhat against the “into” variant — it’s more work, and I don’t care about performance that much. I’d honestly be pretty surprised to find out that someone needs to do transfers of file descriptors over unix sockets alongside such large amounts of data, and at such a rate that it would make a difference to have an into variant here.

So for me, an equivalent of either socket.recv_fds() or socket.recvmsg() (or both) would do.

guido · December 11, 2022, 9:43pm

Have you looked into doing this using add_reader? It would be limited to UNIX-ish systems (add_reader isn’t supported by the Proactor event loop) but since you’re talking about receiving (and sending?) file descriptors I have a feeling that would work just fine for your use case.

Maybe there could be a small 3rd party package that provides this support based on add_reader?

lis · December 12, 2022, 12:44pm

It’s sort of awkward to do this with add_reader() when you want to do it from an async function, and block the flow of execution there. You just gave me another idea for an interesting pair of functions that might allow for implementing things like this while side-stepping the general need for 100 different variants:

loop.wait_readable(file)
loop.wait_writable(file)

Then you could wait for a particular filelike object to reach a particular state, then perform your operation using the normal socket code.

guido · December 21, 2022, 9:56pm

Sorry for the slow reply. You’re right that working with add_reader() from an async function is slightly awkward – I think it would look roughly like this (untested):

loop = asyncio.get_event_loop()
fut = loop.create_future()
loop.add_reader(fd, fut.set_result, None)
await fut

You’re right that having this wrapped in a helper would be useful, so you could write loop.wait_readable(fd). Of course, the above snippet is easily made into a helper that you could call as await wait_readable(fd). Like this:

def wait_readable(fd):
    loop = asyncio.get_event_loop()
    fut = loop.create_future()
    loop.add_reader(fd, fut.set_result, None)
    return fut

Does that warrant being made into an event loop method? I don’t know – it’s still pretty uncommon to need this low-level code, but if I had to choose between this and your initial proposal, this one is definitely more general and elegant.

If you really wanted to put your weight on the scale you could send a PR with code like this plus some unit tests – it’s always easier to accept a proposal if the code already exists. This doesn’t guarantee acceptance for sure, as the authors of elaborate reference implementations of PEPs that were ultimately rejected can tell you. Not that I’d make you write a PEP for such a simple API addition. But I think you should have to work a bit harder before it’s even considered.

guido · December 21, 2022, 10:03pm

My examples didn’t call remove_reader(fd), thereby proving that it’s more complicated than I imagined. That makes me like the proposed feature more – it encapsulates non-trivial logic!
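One possible shape for a helper that also cleans up after itself is sketched below. It is an illustration only, not code from this thread or from asyncio: it assumes a selector-based event loop (add_reader is unavailable on the Proactor loop), and the demo coroutine is made up for the example.

```python
import asyncio
import socket


def wait_readable(fd) -> asyncio.Future:
    """Return a future that completes once fd is readable (sketch only)."""
    loop = asyncio.get_running_loop()
    fut = loop.create_future()

    def on_readable():
        # Unregister first so a still-readable fd does not fire repeatedly.
        loop.remove_reader(fd)
        if not fut.done():
            fut.set_result(None)

    def on_done(f):
        # If the waiter was cancelled, the reader is still registered.
        if f.cancelled():
            loop.remove_reader(fd)

    loop.add_reader(fd, on_readable)
    fut.add_done_callback(on_done)
    return fut


async def demo() -> bytes:
    """Hypothetical usage: wait, then use the ordinary socket call."""
    a, b = socket.socketpair()
    a.setblocking(False)
    b.setblocking(False)
    try:
        loop = asyncio.get_running_loop()
        loop.call_later(0.01, a.send, b"ping")
        await asyncio.wait_for(wait_readable(b.fileno()), timeout=1)
        return b.recv(4)
    finally:
        a.close()
        b.close()
```

The two callbacks cover the two ways the wait can end: the descriptor becomes readable, or the awaiting task is cancelled or times out.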

LukasWoodtli · January 16, 2024, 9:04am

We would also benefit from an async implementation of loop.sock_recvmsg. In our application we need some additional data from the socket, like the sender’s IP address, the receiver’s IP address and the traffic class.

We implemented an async wrapper for the socket.recvmsg function. There we get the payload and the ‘from address’ directly from the synchronous socket API. The other information is retrieved from the ancillary data.
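A wrapper of this kind might look roughly like the sketch below. This is not the code from our application or from any PR; the name sock_recvmsg and the demo coroutine are assumptions made for illustration, and it relies on add_reader, so it needs a selector-based event loop and a non-blocking socket.

```python
import asyncio
import socket


async def sock_recvmsg(sock, bufsize, ancbufsize=0, flags=0):
    """Hypothetical async wrapper around socket.recvmsg (sketch only)."""
    loop = asyncio.get_running_loop()
    while True:
        try:
            # Returns (data, ancdata, msg_flags, address) when data is ready.
            return sock.recvmsg(bufsize, ancbufsize, flags)
        except (BlockingIOError, InterruptedError):
            pass  # Nothing to read yet: wait for readability and retry.

        readable = loop.create_future()

        def wake():
            if not readable.done():
                readable.set_result(None)

        loop.add_reader(sock.fileno(), wake)
        try:
            await readable
        finally:
            # Always unregister, including on cancellation.
            loop.remove_reader(sock.fileno())


async def demo() -> bytes:
    """Hypothetical usage over a local socket pair."""
    a, b = socket.socketpair()
    a.setblocking(False)
    b.setblocking(False)
    try:
        asyncio.get_running_loop().call_later(0.01, a.send, b"ping")
        data, _ancdata, _flags, _addr = await asyncio.wait_for(
            sock_recvmsg(b, 1024), timeout=1
        )
        return data
    finally:
        a.close()
        b.close()
```

To receive ancillary data, the caller would pass a non-zero ancbufsize and enable the relevant socket options beforehand, exactly as with the synchronous API.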

It would be nice to have this functionality directly in the Python standard library.

guido · January 16, 2024, 10:03pm

Okay, I’ll bite. Do you want to help out by sending a PR?

LukasWoodtli · February 1, 2024, 11:13am

I created a draft PR here:

github.com/python/cpython

Support async `recvmsg` and `sendmsg`

main ← LukasWoodtli:gardena/lw/async_recvmsg

opened 11:09AM - 01 Feb 24 UTC

LukasWoodtli

+157 -0

# async `recvmsg` and `sendmsg`

Following the discussion here: https://discuss.…python.org/t/expanding-asyncio-support-for-socket-apis/19277

Implemented async variants of:

`loop.sock_sendmsg` - async analogue of `socket.sendmsg`
`loop.sock_recvmsg` - async analogue of `socket.recvmsg`

This is a draft PR open for discussion. Some open tasks for this draft:

* Improve documentation
* More testing
* Platform independence (developed only on Linux)

Only for loop.sock_recvmsg and loop.sock_sendmsg though (as that’s what we need in our application).

Currently, it has only basic tests and has only been tried on Linux. The documentation is also not very good yet.

But I hope it is at least a good starting point for future work.

guido · February 19, 2024, 12:50am

If anyone else would like to see this, please help by reviewing or otherwise helping out on the PR @LukasWoodtli links above. (Note: approving a PR without giving feedback is not helping.)