Measure the time delay between a task being resolved and the event loop picking it from the ready queue

I’m optimizing the performance of a Python application that includes many non-blocking I/O operations, some of which involve making HTTP API calls. We use an asynchronous client for all I/O operations, including database and Redis access. During load testing with concurrent users, we’ve noticed that processing times significantly increase with the number of users. How can we determine if this slowdown is due to delays in API calls or if there is a significant delay in the event loop picking up tasks that have been resolved and added to the ready queue?

I faced the same issue with the Twisted async scheduler.
What I ended up doing was adding instrumentation into the Twisted code.
I timed every single event that was dispatched and logged "slow" event handlers. I also recorded the time each event was queued, so we could see when events were sitting in the queue for a long time before being handled.

The length of the outstanding event queue was also valuable information that I tracked and reported on.
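The queued-time measurement translates directly to asyncio. A minimal sketch (the helper names here are illustrative, not a real API): record `loop.time()` when a callback is queued with `call_soon`, and again when it actually runs; the difference is how long it sat in the ready queue.

```python
import asyncio

def timed_call_soon(loop, fn, delays):
    """Queue fn on the loop and record how long it waits before running."""
    queued_at = loop.time()

    def wrapper():
        # Time elapsed between queuing and execution ~ ready-queue wait.
        delays.append(loop.time() - queued_at)
        fn()

    loop.call_soon(wrapper)

async def main():
    loop = asyncio.get_running_loop()
    delays = []
    for _ in range(5):
        timed_call_soon(loop, lambda: None, delays)
        await asyncio.sleep(0)  # yield so the queued callback gets to run
    return delays

latencies = asyncio.run(main())
print(latencies)
```

Under load, a growing trend in these numbers points at the loop itself being the bottleneck rather than slow API responses.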

I have not looked at the Python stdlib scheduler in detail, but I assume you could add the same type of instrumentation to it.
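In fact, asyncio ships with some of this instrumentation built in: in debug mode the loop warns about any callback or task step that runs longer than `loop.slow_callback_duration` (0.1 s by default). A sketch that captures those warnings (the log-capture scaffolding is just for the demo):

```python
import asyncio
import logging
import time

records = []

class _Capture(logging.Handler):
    def emit(self, record):
        records.append(record.getMessage())

# asyncio emits the "Executing ... took N seconds" messages via its logger.
asyncio_logger = logging.getLogger("asyncio")
asyncio_logger.addHandler(_Capture())
asyncio_logger.setLevel(logging.WARNING)

async def blocking_step():
    # Deliberately block the loop past the default 0.1 s threshold.
    time.sleep(0.15)

asyncio.run(blocking_step(), debug=True)
print(records)
```

This flags slow handlers (which starve the ready queue), though it does not measure the queue wait itself.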

You could probably get those metrics by registering a task factory on the event loop.