Asyncio RLock - reentrant locks for async Python

I’d like to re-open the discussion started at https://github.com/python/asyncio/issues/439, which requested that reentrant locks be added to asyncio.

GvR argues that locks are used less in event-loop-based concurrency because context switching happens in a more controlled way, and I agree. But there are still use-cases where a complex state change in a component involves I/O, requiring async locking (and no-one argued against that).

What I don’t understand is how those observations make a point against RLocks in async. It’s true that RLocks are not a necessity. But that holds for async as well as threaded concurrency. They’re just a convenience. You could always refactor your code to not need them (example given in the same issue). In the most extreme case, that refactoring could mean implementing your own RLock, which isn’t hard and only requires (1) a simple mutex/lock and (2) a way to identify the current thread/task.
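To make the "simple mutex plus a way to identify the current task" point concrete, here is a minimal sketch. `SketchRLock` is a hypothetical name, not anything from asyncio or from the asyncio-rlock package; it only assumes `asyncio.Lock` and `asyncio.current_task()` from the standard library, and it ignores fairness and other edge cases a real implementation would need to consider.

```python
import asyncio


class SketchRLock:
    """Illustrative reentrant lock: asyncio.Lock plus the owning task."""

    def __init__(self):
        self._lock = asyncio.Lock()  # (1) a simple mutex
        self._owner = None           # task currently holding the lock
        self._depth = 0              # how often the owner has acquired it

    async def acquire(self):
        me = asyncio.current_task()  # (2) identify the current task
        if self._owner is me:
            self._depth += 1         # re-entry by the owner: no waiting
            return True
        await self._lock.acquire()
        self._owner = me
        self._depth = 1
        return True

    def release(self):
        if self._owner is not asyncio.current_task():
            raise RuntimeError("cannot release a lock owned by another task")
        self._depth -= 1
        if self._depth == 0:
            self._owner = None
            self._lock.release()

    async def __aenter__(self):
        await self.acquire()

    async def __aexit__(self, exc_type, exc, tb):
        self.release()


async def demo():
    lock = SketchRLock()
    async with lock:
        async with lock:  # a plain asyncio.Lock would wait forever here
            return "re-entered"


result = asyncio.run(demo())
```

The owner check happens before any await, so a task that already holds the lock never suspends on re-entry.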

So they’re not necessary, but they do help to reduce the complexity of code and make it more robust, concise, and easier to reason about.

I ended up implementing my own async RLock, and it was a somewhat alien experience having to do that, because I’m very used to Python being the language that comes with batteries included.

Worth noting that I’m not the only one that felt the need for async RLocks. There’s even a package to ship this bit of code: asyncio-rlock (imagine the link to pypi here, I was only permitted to post up to 2 links)

Any functionality in stdlib has its own maintenance cost.
If a third-party library on PyPI solves your needs – that’s fine, please use it.
When we start feeling the high demand for the library, this situation can be a reason for embedding it into stdlib.

For asyncio-rlock I see very few usages on GitHub. Maybe the feature request is not very popular?


For asyncio-rlock I see very few usages on GitHub. Maybe the feature request is not very popular?

Might be. Or maybe people just wrote their own implementation. Or they swallowed the pill of having to refactor their code to work with simple locks.

Looking at a GitHub code search, you can see 4 libs or so that rolled their own solution. Possibly more that I didn’t find, because they could’ve named their class whatever.

Given that it’s neither a lot of code nor complex code, and that there is some (hard to measure) demand, I think it would be a cheap and worthy addition to asyncio.

To elaborate on my use-case: I had a decently sized, sync code-base that I wanted to port to async Python. The code used threading.RLock. So I had the option to either rethink and refactor the whole thing to only use simple locks, or write my own RLock. As I said, it felt very awkward to have async ship without batteries.


Given that the async code is not using threads, you do not need a thread-type lock.
You can use normal Python variables to hold state that you check.
I find that complex I/O needs one or more state machines, not locks.

If you do have threads, then you could use the thread locks in your async code. But you must use mechanisms for communicating with the threads that do not block the event loop.
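One possible reading of "mechanisms that do not block the event loop", as a rough sketch: hand the blocking `acquire()` call to a helper thread with `asyncio.to_thread()` (Python 3.9+). The `worker` and `demo` functions are made up for illustration. Note one caveat of this approach: if the waiting coroutine is cancelled, the helper thread may still end up acquiring the lock.

```python
import asyncio
import threading
import time


def worker(lock, log):
    # An ordinary thread holding an ordinary thread lock for a while.
    with lock:
        log.append("thread: holding")
        time.sleep(0.1)


async def demo():
    lock = threading.Lock()
    log = []
    thread = threading.Thread(target=worker, args=(lock, log))
    thread.start()
    while not log:  # wait until the thread actually holds the lock
        await asyncio.sleep(0.001)

    # Calling lock.acquire() directly here would stall the whole event loop.
    # Running it in a helper thread keeps the loop free for other tasks.
    await asyncio.to_thread(lock.acquire)
    try:
        log.append("coroutine: holding")
    finally:
        lock.release()  # a threading.Lock may be released from any thread
    thread.join()
    return log


log_result = asyncio.run(demo())
```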

To anyone who thinks locking in asyncio is simple, check out the git history of asyncio.Semaphore.


I have created an asyncio.RLock implementation internally to avoid adding another external dependency. It is widely used to avoid all the shadow methods (i.e. remove() and _remove(), with remove() being locked because it is public and _remove() being used internally with one surrounding lock used by the caller) you get when using a regular asyncio.Lock to avoid deadlocking.
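For readers who have not run into the shadow-method pattern, here is a small sketch of what it looks like with a plain `asyncio.Lock`. The `Registry` class and its `clear()` method are hypothetical, chosen only to show why the public and the internal variant both have to exist.

```python
import asyncio


class Registry:
    """Hypothetical component guarded by a non-reentrant asyncio.Lock."""

    def __init__(self):
        self._lock = asyncio.Lock()
        self._items = {"a", "b", "c"}

    async def remove(self, item):
        # Public entry point: takes the lock, then delegates.
        async with self._lock:
            await self._remove(item)

    async def _remove(self, item):
        # Shadow method: assumes the caller already holds the lock.
        await asyncio.sleep(0)  # stands in for I/O inside the critical section
        self._items.discard(item)

    async def clear(self):
        # Must call _remove(), not remove(): remove() would try to take
        # the lock a second time and wait forever.
        async with self._lock:
            for item in list(self._items):
                await self._remove(item)


async def demo():
    reg = Registry()
    await reg.remove("a")
    await reg.clear()
    return reg._items


remaining = asyncio.run(demo())
```

With a reentrant lock, `clear()` could simply call `remove()` and the `_remove()` duplicate would not be needed.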

I completely agree with @robsdedude that similar reasons for needing this apply both in thread-land and task-land. asyncio copied nearly every primitive and method interface. Leaving this out of asyncio seems a bit unintuitive and inconsistent.


I guess I am missing something.
In my mind locks only have meaning between threads.
Why does an event loop want a lock at all?
Surely that will block the event loop defeating the purpose of async?

Why does an event loop want a lock at all?

For the same reasons you might want locks in threaded code: to protect critical sections of code which might lead to a corrupted state otherwise.

Here is a (somewhat long) example where an async lock is useful.

In the output, you can see how publishing of the create message completes after publishing of the delete message when no lock is used. When a lock is used, the messages are published serially.

import asyncio

# let's say we have a collection of resources and a pub/sub system
# to which we want to send all signals related to those resources

# let's also say that we want all messages (created, deleted, updated, etc.)
# to be sequential for each resource so that, for example, a deleted message
# does not arrive at some subscribers before the created message.
# to do that, we use a lock for each resource.


async def publish(msg, delay):
    print(f"starting publish: {msg}")
    await asyncio.sleep(delay)
    print(f"ending publish: {msg}")


async def create_resource(resource_lock):
    print("create - wait lock")
    await resource_lock.acquire()
    print("create - acquire lock")
    try:
        print("create - resource")
        # once we have created the resource, we have to finish publishing
        # or the other parts of the system won't know about it.
        # we have to hold the lock, or publishing of the created message
        # may interleave with publishing of the deleted message
        await publish("create - resource", delay=3)
    finally:
        resource_lock.release()
        print("create - release lock")


async def delete_resource(resource_lock):
    # waiting on the lock can still be cancelled if there is contention
    # with the lock. maybe we come back later.
    print("delete - wait lock")
    await resource_lock.acquire()
    print("delete - acquire lock")
    try:
        print("delete - resource")
        await publish("delete - resource", delay=2)
    finally:
        resource_lock.release()
        print("delete - release lock")


class DummyLock:
    async def acquire(self):
        ...

    def release(self):
        ...


async def main():
    print("\nwith lock...")
    resource_lock = asyncio.Lock()
    creator = asyncio.create_task(create_resource(resource_lock))
    await asyncio.sleep(0)
    deleter = asyncio.create_task(delete_resource(resource_lock))
    await asyncio.wait([creator, deleter])

    print("\nwith no lock...")
    resource_lock = DummyLock()
    creator = asyncio.create_task(create_resource(resource_lock))
    await asyncio.sleep(0)
    deleter = asyncio.create_task(delete_resource(resource_lock))
    await asyncio.wait([creator, deleter])


asyncio.run(main())

Output:


with lock...
create - wait lock
create - acquire lock
create - resource
starting publish: create - resource
delete - wait lock
ending publish: create - resource
create - release lock
delete - acquire lock
delete - resource
starting publish: delete - resource
ending publish: delete - resource
delete - release lock

with no lock...
create - wait lock
create - acquire lock
create - resource
starting publish: create - resource
delete - wait lock
delete - acquire lock
delete - resource
starting publish: delete - resource
ending publish: delete - resource
delete - release lock
ending publish: create - resource
create - release lock

Surely that will block the event loop defeating the purpose of async?

All the asyncio primitives are specially designed using futures to avoid blocking the event loop. When one of them would block, a future is created in the background and the event loop moves on to other tasks and comes back to the task when the future is done (cancelled or result is set).
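A small sketch of that behaviour, with made-up task names: while `waiter` is suspended on the lock, an unrelated `ticker` task keeps running, which would not be possible if the wait blocked the event loop itself.

```python
import asyncio


async def holder(lock, events):
    async with lock:
        events.append("holder: acquired")
        await asyncio.sleep(0.2)  # keep the lock for a while
    events.append("holder: released")


async def waiter(lock, events):
    events.append("waiter: waiting")
    async with lock:  # suspends only this task, not the loop
        events.append("waiter: acquired")


async def ticker(events):
    # Unrelated work that should keep running while waiter is suspended.
    for _ in range(3):
        await asyncio.sleep(0.01)
        events.append("tick")


async def demo():
    lock = asyncio.Lock()
    events = []
    await asyncio.gather(
        holder(lock, events), waiter(lock, events), ticker(events)
    )
    return events


events = asyncio.run(demo())
print("\n".join(events))
```

All three ticks show up in the list before "waiter: acquired", because the event loop keeps scheduling `ticker` while the waiter's future is pending.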

If you were to use threading primitives in asyncio code, that would block the event loop, but that is why asyncio has its own version of almost every primitive from threading.

Notably, locks only matter in asyncio code when the critical section includes an await point, in contrast with threaded code, where a context switch can happen anywhere (as if everything were an await point). So that makes them less important, but still of value.

Recursive locks, then, are only important if (a) the critical section must be entered by at most one task at a time; (b) this critical section can be entered by the same task more than once, which isn’t a problem; and (c) there are await points within this critical section. Rare, but definitely possible.
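Here is a minimal sketch of a case that meets all three conditions and therefore hangs with a plain `asyncio.Lock`. The `outer` and `inner` functions are hypothetical; the timeout exists only so the demonstration terminates instead of waiting forever.

```python
import asyncio


async def inner(lock):
    async with lock:  # same task tries to enter again
        await asyncio.sleep(0)  # (c) an await point inside the section
        return "inner done"


async def outer(lock):
    async with lock:  # (a) one task at a time
        return await inner(lock)  # (b) re-entry by the same task


async def demo():
    lock = asyncio.Lock()  # not reentrant
    try:
        return await asyncio.wait_for(outer(lock), timeout=0.1)
    except asyncio.TimeoutError:
        return "timed out"


outcome = asyncio.run(demo())
```

`asyncio.Lock` does not track which task holds it, so the second `async with` simply waits for a release that can never happen. A reentrant lock would let `inner` proceed.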


If it’s common for users to write their own recursive lock, perhaps it makes sense to include some links to some third-party libraries providing RLocks in the documentation for asyncio.Lock.
