Interlacing ContextManager

As discussed in this Stack Overflow post, I ran into trouble getting interlacing locks to work properly with KeyboardInterrupt. (I am aware of signal masking, but would rather not use it, for broader platform compatibility.)

I realized that the logic I tried to implement might be currently impossible in Python. I’ve enumerated all the approaches to this problem I could find. The closest (Approach 6 from my own answer) still differs from the expected behavior and induces extra, unnecessary lock/unlock operations.

That’s why I am proposing a new language feature: an interlaced context manager. The proposed logic is shown below:

┌── with CTX1 ──┐
╵               ╵
task_A --- task_B --- task_C
           ╷               ╷
           └── with CTX2 ──┘

I am aware of the challenge, given how code blocks are organized in modern programming languages.
I hope this post draws some attention from the community, and that someone will come up with an ingenious solution.

Using locks like this is generally a bad idea. Suppose

  • thread A already owns lock 1 and is now trying to acquire lock 2.
  • thread B already owns lock 2 and is now trying to acquire lock 1.

You have a deadlock and are stuck forever.

One general approach is to always acquire the locks in a strict order (so always go lock 1 -> 2).
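
For example (a minimal sketch): if every thread takes the locks in the same fixed order, the circular wait above cannot form:

from threading import Lock

lock1 = Lock()
lock2 = Lock()

def worker():
    # Always lock1 first, then lock2: no thread can ever hold lock2
    # while waiting for lock1, so the cycle above is impossible.
    with lock1:
        with lock2:
            ...  # critical section using both resources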

Alternatively, there are algorithms for acquiring multiple locks at once without the possibility of deadlock. But ultimately you have to be prepared to release a lock to solve the problem, and your SO question implies that you are unwilling to do this.

So let’s not add language/library features which encourage fundamentally broken code.

3 Likes

The topic starts off from lock handling, but my point is not about lock handling itself; it is about a missing capability of the language.

BTW, I am aware of the case you describe and have carefully verified that it cannot happen in my specific implementation.

I feel offended that you called my code “fundamentally broken” just because it uses interlacing locks. Are you so sure that you would NEVER need them?

I’ve provided more detailed pseudo code demonstrating why it’s needed in my case below.

Even if we ignore the example of locking, context managers are intended to nest, not interleave. They were designed as a solution to the problem of “last in, first out” management of resources.

If you want interleaved management of resources, then you are correct, there’s no language construct to handle this. That’s not a problem, as you can manually manage resources any way you like. But you’re unlikely to get support for a language construct[1] to handle this because:

  1. It’s a rare need, nowhere near as significant as “last in, first out”.
  2. It’s generally considered to be an anti-pattern. The example of locks is an obvious case, but not the only one.
  3. The semantics don’t align well with Python’s indentation-based nesting structure.

It may be that some form of library-based mechanism (not needing language support) could help with this sort of resource management scenario - but as it’s library-based, such a solution could be developed and published on PyPI. There’s no reason it has to be part of the core or stdlib.


1. Which would be a new construct, not context managers, as they serve a different purpose, as I said ↩︎

2 Likes

@da-woods
Here is an example I constructed. I’ve tried my best to abstract away irrelevant details while retaining the idea. Hope it makes sense to you.

If you still think my code is “fundamentally broken”, I would be really curious how you would implement this logic in a way that is NOT fundamentally broken according to your standards.

from threading import Lock
from typing import Callable

lock1 = Lock() # Guards observation/assignment of `task`
lock2 = Lock() # Guards        execution       of `task`
task: Callable[[], None] # Calling this might acquire lock1 and change `task`

def do_task():
  lock1.acquire()
  task_snapshot = task
  lock2.acquire() # Execution lock acquired before observation lock is released
  lock1.release()
  task_snapshot() # It might acquire lock1 and change `task`
  lock2.release()

def bad_example_1():
  with lock1:
    task_snapshot = task
  # Bad: another thread might call bad_example_1() and execute the task.
  # This will cause task_snapshot to be outdated (i.e. not the current task)
  with lock2:
    task_snapshot()

def bad_example_2():
  with lock1:
    with lock2:
      task() # Bad: lock1 is not released, so updating `task` from inside would deadlock
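
For completeness, the interlacing in do_task can be written with plain try/finally today (a sketch of roughly what an interlaced with block would have to expand to). Note that this is still not KeyboardInterrupt-safe: an interrupt can land between an acquire() and the try that guards the matching release().

def do_task_try_finally():
  lock1.acquire()
  try:
    task_snapshot = task
    lock2.acquire() # Interlaced: acquired while lock1 is still held
  finally:
    lock1.release()
  try:
    task_snapshot()
  finally:
    lock2.release()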

Desired pattern (suppose the 2nd execution of task1 updates the task to task2):

task1, task1, task1, task2, task2, ....

Bad pattern (a stale snapshot runs task1 again after task2 is already current):

task1, task1, /task2/, task1, task2, ....

Thanks for providing insights on this! I did realize that this code pattern conflicts with the well-established indentation-based coding paradigm in Python. That’s why I only provided a control-flow chart instead of a code snippet.

However, I don’t think lock handling is the only case where this is needed.
For example:

# suppose a.txt has 1M lines, b.txt has 10M lines, and c.txt has 9M lines
# line counts are unpredictable for each file

with open("a.txt") as a, open("b.txt") as b, open("c.txt") as c:
    yield from zip(a, b) # This will drain a.txt

    active, drained = check_drained(a, b)
    # active = set{b}, drained = set{a}
    
    # holding a.txt open does not make any sense now

    if len(active):
      # c.txt is not needed till now
      yield from zip(*active, c)

Suppose we had a without keyword that (1) calls __exit__() on its arguments and (2) skips the corresponding __exit__() calls upon exiting the code block. That would solve the first half of the problem:

with open("a.txt") as a, open("b.txt") as b:
    yield from zip(a, b)

    active, drained = check_drained(a, b)
    without drained # context-managed object or Iterable of them

    if len(active):
      with open("c.txt") as c:
        yield from zip(*active, c)

It does work with existing context managers, but it holds resources open even when they are no longer (or not yet) needed. I guess this problem has been around for a while but has long been ignored.
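
The manual alternative today looks something like this (a sketch, assuming as above that a.txt drains first; file close() is idempotent, so the extra close in the finally block is harmless):

def merged_lines():
    a = open("a.txt")
    b = open("b.txt")
    try:
        yield from zip(a, b)
        a.close()  # release a.txt as soon as it is drained
        with open("c.txt") as c:  # open c.txt only once it is needed
            yield from zip(b, c)
    finally:
        a.close()
        b.close()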

I do understand that this alone does not justify a new feature. I am wondering how many people need this kind of optimization but were too scared/busy to implement manual context management.

I realized that the logic I tried to implement might be currently
/impossible/ in Python.

That’s why I am proposing a new language feature - an interlaced
context manager.

I don’t think this will solve your problem. Context managers aren’t
magic, they’re just shorthand for a try block. If you can’t write some
combination of try blocks that does what you want, you won’t be able to
do it with context managers either, whether interlaced or not.

I think you’ve already diagnosed the fundamental problem – you need to
be able to mask KeyboardInterrupts. Without that ability, you can’t be
sure that you won’t get a KeyboardInterrupt at some inconvenient time
like in the middle of an except or finally clause, or while a context
manager is executing its enter or exit method.

If a language feature is to be added to address this, I think it should
be a platform-independent way to mask asynchronous exceptions.

Regarding your comment about restoring signal masks, the way this is
usually handled on Unix systems is that the system call that sets the
signal mask also atomically returns the previous mask, so that it can
be restored later. I don’t know how this kind of thing is done on
Windows, or if it’s even possible to mask keyboard interrupts at the
OS level at all.
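
Python already exposes that Unix mechanism as signal.pthread_sigmask
(Unix only), which likewise returns the previous mask so it can be
restored. A minimal sketch, with critical_section standing in for
whatever needs protecting:

import signal

def run_protected(critical_section):
    # Block SIGINT; pthread_sigmask atomically returns the old mask
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGINT})
    try:
        critical_section()  # no KeyboardInterrupt is delivered in here
    finally:
        # Restore whatever mask was in effect before
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)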

However, a Python feature for masking asynchronous exceptions wouldn’t
have to do it at the OS level. It’s already catching SIGINT and setting
a flag that gets checked in the interpreter loop. There would just have
to be another flag saying not to perform that check yet.

Yeah, I did some research just now and also realized this problem:

class Lock:
   ... # omitted code

   def __exit__(self, *_):
      # SIGINT might happen here, lock might never get released
      self.release()

I am not sure whether CPython has additional protections around this potential issue (or even whether this is considered an issue at all). If not, another proposal I would make is a new keyword, atomic, that does the following:

class Lock:
   @atomic # defers all exceptions until exits from current scope
   def __exit__(self, *_):
      self.release()

And:

with atomic:
    lock.release()
    flag = False

I am also wondering whether SIGINT can interrupt lock.acquire() and leave the mutex in a nondeterministic state.

Essentially nothing is safe if you’re concerned about asynchronous exceptions.

1 Like

I would hope not – the locking primitives would be seriously broken if that were possible.

The tool to reach for when the context management flow doesn’t align with the code block structure is contextlib.ExitStack.

For the interleaving case:

release_later = contextlib.ExitStack()

with first_resource:
    do_something()
    release_later.enter_context(
        second_resource
    )
    do_combined_thing()

with release_later:
    do_something_else()

Edit: this sequential structure isn’t right, see Interlacing ContextManager - #13 by ncoghlan

1 Like

Thanks! This solves my original problem.

I do have one additional question about ExitStack: does it propagate exceptions into the enclosed context managers? And if control flow fails to reach the second with statement, will second_resource.__exit__() still be executed?

For demonstration:

release_later = contextlib.ExitStack()

with first_resource:
    do_something()
    release_later.enter_context(
        second_resource
    )
    # Suppose KeyboardInterrupt raised here,
    # will second_resource.__exit__() be called?
    # If so, will second_resource.__exit__() receive KeyboardInterrupt?
    do_combined_thing()

# Suppose KeyboardInterrupt is propagated up, which will skip the 2nd with block
# Will second_resource.__exit__ ever be called?

with release_later:
    do_something_else()

Good point: the context managers should be nested to ensure the second resource is cleaned up promptly, rather than when the exit stack is garbage collected:

with contextlib.ExitStack() as outer_with:
  with first_resource:
    do_something()
    outer_with.enter_context(
        second_resource
    )
    do_combined_thing()
  do_something_else()

Edit: for the other part of your question, context managers added to the stack receive exception details and can suppress exceptions as normal (otherwise they wouldn’t work as expected), but plain callbacks do not.
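
A small runnable sketch of the difference (Resource is a stand-in context manager):

import contextlib

class Resource:
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc, tb):
        print("Resource.__exit__ saw:", exc_type)  # receives the details
        return False  # returning True here would suppress the exception

with contextlib.ExitStack() as stack:
    stack.enter_context(Resource())          # __exit__ gets (exc_type, exc, tb)
    stack.callback(print, "plain callback")  # called as-is; never sees the exception
    raise KeyboardInterrupt                  # both cleanups still run while unwinding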

1 Like

Sorry. That was a very poor choice of words from me.

bad_example_2 could work if lock1 is a threading.RLock (so it can be acquired multiple times by the same thread).
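
That is, something like this (a sketch; task is just a placeholder for the mutable callable in your example):

from threading import Lock, RLock

lock1 = RLock()  # reentrant: the owning thread may acquire it again
lock2 = Lock()

def task():  # placeholder
    pass

def do_task_reentrant():
    with lock1:
        with lock2:
            task()  # if task() re-acquires lock1 on this thread, it won't deadlock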

I’m not sure I understand the problem well enough, though. I can’t see a way to do your desired implementation without deadlocks, which either means I’m missing the point, or that there are deadlocks.

I think you got a more useful answer to what you were really asking, so maybe I should stop worrying about the locking in an obviously simplified example.

Avoiding deadlocks only requires that “resource A” and “resource B” always be acquired in the same order.

It doesn’t specify anything about the order in which they’re released, except that if resource A is released first, it cannot be reacquired until after resource B has also been released (since reacquiring A at that point would be equivalent to acquiring resource B first, and hence vulnerable to causing deadlocks).

So this is OK (since A is always held before acquiring B):

    with resourceA:
        # Do something with only A
        with resourceB:
            # Do something with both A and B
        # Do something else with only A
        with resourceB:
            # Do something else with both A and B
        # Do some final things with only A

And this is OK (since A is still held when B is acquired):

    with contextlib.ExitStack() as cm:
        with resourceA:
            # Do something with only A
            cm.enter_context(resourceB)
            # Do something with both A and B
        # Do some final things with only B

But this is vulnerable to deadlocking:

    with contextlib.ExitStack() as cm:
        with resourceA:
            # Do something with only A
            cm.enter_context(resourceB)
            # Do something with both A and B
        with resourceA:
            # Deadlock risk!
            # What if another thread is already blocking
            # on the `enter_context` call above?
        # Do some final things with only B

First-in-first-out release can potentially be a reasonable pattern for things like state ownership transfers, which require some setup in the sending object and finalisation in the receiving object: each step keeps only the object it actually needs locked, with the lock on the sending object being dropped as soon as the state has been transferred, even if the consequences are still being applied.
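
A minimal sketch of that handoff pattern (Node, transfer_state, and the state attribute are hypothetical):

    from threading import Lock

    class Node:
        def __init__(self, state=None):
            self.lock = Lock()
            self.state = state

    def transfer_state(sender, receiver):
        # Fixed acquisition order: sender first, then receiver
        sender.lock.acquire()
        try:
            state = sender.state        # setup on the sending side
            sender.state = None
            receiver.lock.acquire()     # receiver locked while sender still held
        finally:
            sender.lock.release()       # sender lock dropped first (FIFO release)
        try:
            receiver.state = state      # finalisation only needs the receiver lock
        finally:
            receiver.lock.release()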

It’s not transactional though, since you can’t undo the changes on the sending side if the finalisation on the receiving side fails. I assume being able to clean up errors more reliably is the main reason the first-in-last-out resource management approach is more common (especially in Python), with the FIFO approach being reserved for cases which need to eke out the last scrap of potential parallelisation from an algorithm.

Thanks - that matches what I was trying to say, in more detail.

My reading of the problem statement is that they’re directly asking for something like your final example. But it could also be me misunderstanding the problem statement (either because something’s been left out of the simplified example, or just because I’ve misread it).

I read the request as being for the second example (acquire A, acquire B, release A, release B), which is unusual (since you can’t undo the changes to A if something goes wrong with the application of the corresponding changes to B), but not inherently wrong.

@gcewing @ncoghlan

Regarding the interrupt safety of Python’s own Lock implementation, I found some interesting facts:

1. Lock().__enter__() and Lock().__exit__() seem to be internal C bindings that are immune to interrupts.

2. Wrappers around the lock (including Python’s own Condition variables) are vulnerable to KeyboardInterrupt.

Here is the code I used for the test:

from threading import Lock, Condition
from sys import stdout, stderr

class WrappedLock:
    def __init__(self):
        self._lock = Lock()

    def locked(self):
        return self._lock.locked()

    def __enter__(self):
        return self._lock.__enter__()

    def __exit__(self, *args):
        # SIGINT here will cause unreleased lock (essentially a deadlock)
        return self._lock.__exit__(*args)

lock1 = Lock()
lock2 = WrappedLock()
lock3 = Lock()
cond3 = Condition(lock3)

try:
    while True:
        with lock1, lock2, cond3:
            # Ask dispatcher to send SIGINT
            stdout.write("\n")
            stdout.flush()
except KeyboardInterrupt:
    pass

if lock1.locked():
    stderr.write("lock1\n")
if lock2.locked():
    stderr.write("lock2\n")
if lock3.locked():
    stderr.write("lock3\n")

I used a custom script to run this test in batches. You can find the code in my GitHub Gist.
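
The idea is simple: run the test above as a subprocess, wait for the newline, send SIGINT, and tally which lock names show up on stderr. A simplified sketch (not the exact Gist code; assumes the test is saved as lock_test.py, Unix only):

import signal
import subprocess
import sys
from collections import Counter

def run_once():
    proc = subprocess.Popen(
        [sys.executable, "lock_test.py"],
        stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True,
    )
    proc.stdout.readline()           # wait until the loop is running
    proc.send_signal(signal.SIGINT)  # interrupt it mid-iteration
    _, err = proc.communicate()
    return err.split()               # names of locks left acquired

failures = Counter()
for _ in range(10000):
    failures.update(run_once())
print(failures)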

I ran only 10000 tests because they already made my MacBook’s fan roar at me. Please feel free to run more tests on a more powerful machine. I am also curious whether this result is reproducible on other OSes/architectures.

Here are the results:

# Python 3.12.5 on MacOS
Total 10000 tests
lock1 failed 0 times (0.00%)
lock2 failed 430 times (4.30%)
lock3 failed 514 times (5.14%)

This result shows that the with statement provides robust interrupt handling only with raw, unwrapped Lock primitives. I guess this will cause some very subtle confusion if someone assumes context managers are always immune to interrupts.

# Python 3.12.2 Ubuntu 20 
# Architecture:                       x86_64
# CPU(s):                             32
# Model name:                         AMD EPYC 7F52 16-Core Processor
# CPU MHz:                            3493.355

Total 1000000 tests
lock1 failed 0 times (0.00%)
lock2 failed 69387 times (6.94%)
lock3 failed 61470 times (6.15%)

Thinking a bit more about how to make it possible to write code that’s robust against KeyboardInterrupts, I think what needs to happen is for asynchronous exceptions to be automatically masked during except and finally blocks.

I’m fairly sure this has to be built into the behaviour of the try statement. If you have to explicitly call something to mask the interrupts, then you’re prone to an interrupt happening just after you enter an except or finally block but before you get a chance to do the masking:

try:
    ...
finally:
    # KeyboardInterrupt here and you're screwed
    disable_async_exceptions()
    clean_things_up()
    enable_async_exceptions()