Scoped context manager to avoid leaking with targets

I can’t find a good explanation of why this is a terrible idea, so I thought I’d try to find one here.

Context managers don’t create any scope, so variables bound in the with statement leak out of the block. This is sometimes useful, but it means that linters and other static checkers don’t do a good job of noticing misuse of a variable created in the context.

For example:

with open('somefile.txt') as f:
    pass
data = f.read()  # ValueError

I understand that the context manager is syntactic sugar for something like:

try:
    f = open(...)
    <body of the context manager>
finally:
    f.close()

So of course f exists after the context manager.
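The leak is easy to demonstrate. A minimal runnable version of the example above, using a temporary file so the snippet is self-contained:

```python
import tempfile

# Create a throwaway file so the snippet doesn't depend on 'somefile.txt'.
with tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False) as tmp:
    tmp.write("hello")
    path = tmp.name

with open(path) as f:
    pass

print(f.closed)   # True: f still exists after the block...
try:
    f.read()      # ...but reading it raises ValueError
except ValueError as e:
    print(e)      # I/O operation on closed file.
```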

I’m imagining instead:

with_scoped open(...) as f:
    pass
f.read()  # NameError: name 'f' is not defined

Which would be syntactic sugar for:

try:
    f = open(...)
    <body of the context manager>
finally:
    f.close()
    del f  # now we don't leak f
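The desugared form can be tried directly today; once the name is del-ed, any later use fails with a NameError (io.StringIO stands in for a real file here):

```python
import io

try:
    f = io.StringIO("data")  # stand-in for open(...)
    # <body of the context manager>
    contents = f.read()
finally:
    f.close()
    del f  # drop the name so later use fails loudly

print(contents)  # data
try:
    f.read()
except NameError as e:
    print(e)     # name 'f' is not defined
```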

This might make it easier for linters and other static tools to notice the problem before code ships, rather than noticing the problem because resources are leaking.

4 Likes

Hello, Oren! :slight_smile: Thank you for sharing the idea!

Personally, I don’t think it’s terrible. Quite the contrary: it’s interesting. I believe, however, that the possibilities such a construct would provide are too narrow to be worth introducing new syntax.

Thanks for not trashing the concept!

I suppose there’s already a way for a linter to detect the issue, but I like the idea of making an otherwise subtle bug really obvious.

Maybe there’s a cleaner way to achieve the effect?

It can seem a bit surprising that the context target is part of the broader scope. It’s not clear to me when you would want that to happen; maybe to inspect stats about the target?

Maybe rather than adding a new keyword, hijacking the type annotation syntax could work?

with open(...) as f[del]:

That seems wrong, but maybe there’s something in that direction?

with open(...) as del f:

It’s concise and would be compatible with any currently correct code. It would also allow:

with (
   A() as a,
   B() as del b,
   C() as c
):

There’s a sort of circularity here–if you’re prone to writing this bug, are you going to think of using scoped_with in the first place[1]? If you think of scoped_with while writing your code, you’re probably not going to make this mistake at all.

The existing semantics can be quite useful, so they’re not going anywhere. I’m not sure introducing a new keyword helps solve the problem.


  1. or however it might be spelled ↩︎

4 Likes

I disagree: the issue isn’t that the code has the bug when it’s first written, but that the bug shows up later when the code gets changed. I’d like to make it obviously incorrect to introduce the bug, rather than “huh, we’re leaking a resource, better track it down!”

1 Like

It seems like this might be something they’ll need to address, but it could help if we make them aware.

The way we most recently hit the bug is that a context body had many lines of code and one line got moved outside the block.

Fixing the destructor to do a better cleanup is probably the right solution for us in the short term.

As far as runtime behavior goes:

class Scoped:
    def __init__(self, cm):
        self._cm = cm
        self._cv = None

    def __enter__(self):
        self._cv = self._cm.__enter__()
        return self

    def __exit__(self, *args):
        ret = self._cm.__exit__(*args)  # exit the wrapped manager, not the entered value
        self._cv = None
        return ret

    def __getattr__(self, name):
        if self._cv is not None:
            return getattr(self._cv, name)
        raise RuntimeError("Attempting to access context manager outside with block")

>>> with Scoped(open('/dev/zero', 'rb')) as f:
...    fivebytes = f.read(5)
>>> print(fivebytes)
b'\x00\x00\x00\x00\x00'
>>> f.read(2)
Traceback (most recent call last):
  File "<python-input-12>", line 1, in <module>
    f.read(2)
    ^^^^^^
  File "<python-input-5>", line 17, in __getattr__
    raise RuntimeError("Attempting to access context manager outside with block")
RuntimeError: Attempting to access context manager outside with block
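For reference, here is a self-contained sketch of the same wrapper idea, runnable as a script, with __exit__ delegating to the wrapped manager:

```python
import io


class Scoped:
    """Proxy that forwards attribute access only while the with block is active."""

    def __init__(self, cm):
        self._cm = cm
        self._cv = None

    def __enter__(self):
        self._cv = self._cm.__enter__()
        return self

    def __exit__(self, *args):
        ret = self._cm.__exit__(*args)  # tear down the wrapped manager...
        self._cv = None                 # ...then forget its value so access fails
        return ret

    def __getattr__(self, name):
        if self._cv is not None:
            return getattr(self._cv, name)
        raise RuntimeError("Attempting to access context manager outside with block")


# io.StringIO stands in for an open file so the sketch needs no filesystem.
with Scoped(io.StringIO("hello")) as f:
    data = f.read()
print(data)  # hello
try:
    f.read()
except RuntimeError as e:
    print(e)
```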

As far as linters go, I think they could write rules to catch this case without it being a runtime NameError.

3 Likes

That’s a nice solution.

My position can be summarized as:
This is how the language works. An alternative design, in which the scope and lifetime of a name bound via with...as were limited to the block, would have been perfectly reasonable. But it’s not worth having both in the language.

I think you’re making an implicit assumption here that when a context manager is closed, it is also in some sense “finalized”. That’s typical in my experience, but it isn’t required.

For example, there may be multiple layers to tear down inside of a resource. Perhaps the inner layer has implicit teardown on exit, but the outer layer is kept alive, allowing for

with ConnectionPool(...) as conn:
    conn.send("hello")
with conn:
    print(conn.read())
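ConnectionPool above is hypothetical; a minimal sketch of the pattern, with all names illustrative, might look like:

```python
class Connection:
    """Illustrative: the with block scopes a session, not the object's lifetime."""

    def __init__(self):
        self.sent = []

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        # Tear down the inner "session" layer only; the connection object
        # itself stays alive and can be reused in a later with block.
        return False

    def send(self, msg):
        self.sent.append(msg)

    def read(self):
        return self.sent[-1]


class ConnectionPool:
    """Illustrative pool that hands out a connection outliving its own block."""

    def __enter__(self):
        return Connection()

    def __exit__(self, *exc):
        return False


with ConnectionPool() as conn:
    conn.send("hello")
with conn:                 # the leaked target is reused deliberately
    print(conn.read())     # hello
```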

Or, maybe a more blunt and direct answer: yes, you might want to access a closed resource after the context manager exits.
Not common, but not something I’d be interested in forbidding.

I’d say this is worth your while regardless, even if the rules of with statements changed.
Consider:

handle = get_handle()
with handle:
    ...

So your reusable abstractions need to be well behaved if used after close (raising a clear error counts as well behaved!) regardless.
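“Well behaved after close” can simply mean failing with a clear error. A minimal sketch, with all names illustrative:

```python
class Handle:
    """Illustrative resource that errors clearly once closed."""

    def __init__(self):
        self._closed = False

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self.close()
        return False

    def close(self):
        self._closed = True

    def use(self):
        if self._closed:
            raise ValueError("handle is closed")
        return "ok"


def get_handle():
    return Handle()


handle = get_handle()
with handle:
    print(handle.use())   # ok
try:
    handle.use()          # after the block: a loud, clear failure
except ValueError as e:
    print(e)              # handle is closed
```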


Anyway, I don’t hate the idea but I just don’t see it as a good lint rule because it seems like it would mostly flag the subtle-but-correct cases.

I might implement it in a script just to see how many hits I would get at $WORK and over the stdlib. If I do that and find anything of interest, I’ll come back to share.

2 Likes

I guess another question for this idea is: why not just define a function? Then you have function scope, and your excessively-large context is refactored into a separate location where it’s easier to reason about.

4 Likes

Yes, the “leakage” is useful for checking the state of the context target after all of its primary tasks are completed.

An example from the documentation of unittest.mock.patch.object:

with patch.object(ProductionClass, 'method', return_value=None) as mock_method:
    thing = ProductionClass()
    thing.method(1, 2, 3)

mock_method.assert_called_once_with(1, 2, 3)

And another example from the documentation of contextlib.redirect_stdout:

with redirect_stdout(io.StringIO()) as f:
    help(pow)
s = f.getvalue()
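Made self-contained (and printing a fixed string instead of help(pow), so the captured text is predictable), the redirect_stdout pattern runs as:

```python
import io
from contextlib import redirect_stdout

with redirect_stdout(io.StringIO()) as f:
    print("captured line")
s = f.getvalue()   # the target is read *after* the block, by design
print(repr(s))     # 'captured line\n'
```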

And from the doc of urllib.request.urlopen:

import urllib.request
DATA = b'some data'
req = urllib.request.Request(url='http://localhost:8080', data=DATA, method='PUT')
with urllib.request.urlopen(req) as f:
    pass
print(f.status)
print(f.reason)

Since a context manager can capture an exception, the context target is also useful for inspecting what was captured after an exception occurs in the context body.

Example from unittest.TestCase.assertRaises:

with self.assertRaises(SomeException) as cm:
    do_something()

the_exception = cm.exception
self.assertEqual(the_exception.error_code, 3)
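A runnable version of that pattern; SomeException and its error_code are illustrative stand-ins for the placeholders in the docs:

```python
import unittest


class SomeException(Exception):
    """Illustrative stand-in for the exception in the docs example."""

    def __init__(self):
        super().__init__("boom")
        self.error_code = 3


def do_something():
    raise SomeException()


class ExampleTest(unittest.TestCase):
    def test_error_code(self):
        with self.assertRaises(SomeException) as cm:
            do_something()
        # cm is inspected after the block -- exactly the "leak" in question
        self.assertEqual(cm.exception.error_code, 3)


suite = unittest.TestLoader().loadTestsFromTestCase(ExampleTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())  # True
```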
5 Likes

Thanks for the examples!

Thank you, I genuinely appreciate your perspective. I’m persuaded.

The one case I recently ran into in the code base I’m working on might have been helped by the alternative construct I proposed, but there are enough other ways to cause the resource to leak that fixing up the destructor to properly close the resource if it has been accidentally reopened seems like the right move.

I’m dealing with fairly complex queries in sqlalchemy. It’s common for us to end up with queries that are dozens of lines long, so it doesn’t take much to end up with a context that is a few screens long.

I think having something like with open(…) as scoped f would be a cool idea.

Like others have mentioned, it should be opt-in, since in many cases the context target is useful after the context manager exits. (Also, obviously, backwards compatibility.)

Linters can’t help here without a sanctioned way of opting into a scoped CM. Maybe it could be done without language support, using something like the proposed Scoped, but I’ve never seen something like this happen.

But Python doesn’t have block scope.

What output do you think the following code should produce?

f = None
with open(...) as scoped f:
    ...
print(f)
1 Like

The same output as in

f = None
[f for f in [1, 2, 3]]
print(f)

:wink:

2 Likes

Good point. Comprehensions used to be compiled as separate functions where iteration targets were truly local. But comprehensions are now inlined, with the existing values of iteration targets saved on the interpreter stack and restored after the comprehension finishes, effectively making the iteration targets block-scoped. An implementation of a proposed “scoped” target could do the same.
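The comprehension behavior is easy to verify on any Python 3 interpreter:

```python
f = None
squares = [f * f for f in [1, 2, 3]]
print(squares)  # [1, 4, 9]
print(f)        # None: the iteration target never leaked into this scope
```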

Furthermore, the idea may be more plausible if generalized for all assignment targets so the scoped soft keyword can be used in assignments, for loop targets and possibly capture patterns, to declare the specified names as block-scoped.

Very interesting idea! In my experience type checkers (or maybe just Mypy) are somewhat weak with possibly unbound names from prior scopes etc. I’ve been bitten by this more than once. The ability to more precisely control scope in most (all?) block-like constructs would be cool, although probably an expert-level feature.

Something like

for scoped x in <iterable>:
    ...
2 Likes