TextIOWrapper probably shouldn't close its stream when GCed

I was recently surprised by the fact that the io.TextIOWrapper class closes its underlying stream when garbage collected. I assume this behavior is there simply because it inherits from IOBase which provides a default implementation of __del__ that does this. However, in my opinion, this behavior doesn’t make sense. The name of the TextIOWrapper class makes it explicitly clear that it’s a wrapper and therefore does not own the stream. This makes its __del__ implementation redundant with the implementation provided by the underlying stream. It also makes certain refactorings behave in unexpected ways (cf. my initial confusion in cpython#130965).

Note that this isn’t about closing the wrapper propagating to the underlying stream in all cases. There are what I consider to be legitimate uses for that, e.g.:

with io.TextIOWrapper(open('file', 'rb')) as f:
    # do something with f
# once back out of the with block, the underlying file handle is closed too

This also isn’t about __del__ behavior in all classes, either. That’s another topic entirely; this is just about the behavior of a wrapper class. Using GC to manage non-memory resources is a thorny topic and I’m not trying to wade into it.

In short, I think TextIOWrapper.__del__ specifically should be a no-op.

3 Likes

The destructor not only calls close(), it also emits a ResourceWarning: it’s a bad practice to rely on the GC to close a file. You should close it explicitly. Using a context manager is a good solution for that (with open(...) as fp: ...).

Use python -Wdefault or python -X dev to see ResourceWarning. It’s hidden by default.
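To illustrate, the warning can also be captured programmatically. This is a sketch that assumes CPython, where refcounting finalizes the dropped wrapper immediately; other implementations may delay it:

```python
import gc
import os
import tempfile
import warnings

fd, path = tempfile.mkstemp()
os.close(fd)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    f = open(path, "w")   # returns a TextIOWrapper
    f.write("hello")
    del f                 # dropped without close()
    gc.collect()

# The GC-driven close emits "unclosed file ..." as a ResourceWarning
warned = any(issubclass(item.category, ResourceWarning) for item in caught)
print(warned)
os.remove(path)
```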

3 Likes

I think the current way to resolve the problem with your example would be:

import io
import tempfile

def handle_file(out_file):
    out_writer = io.TextIOWrapper(out_file)
    out_writer.write("hello,world")
    out_writer.flush()
    out_writer.detach()

with tempfile.TemporaryFile() as out_file:
    handle_file(out_file)
    out_file.seek(0)
    print(out_file.read())

So you could use a simple context manager:

from contextlib import contextmanager
import io

@contextmanager
def detaching(filelike):
    wrapper = io.TextIOWrapper(filelike)
    try:
        yield wrapper
    finally:
        wrapper.detach()  # detach even if the body raises

def handle_file(out_file):
    with detaching(out_file) as out_writer:
        out_writer.write("hello,world")
        out_writer.flush()

But it would make sense to me that __exit__() closes and __del__() detaches. I’m not sure that could be changed now without causing more pain than it relieves.

2 Likes

Yes, I agree. Indeed, my argument here is to not have the resource closed by the GC at all.

Yes, calling detach solves the issue I had before. But I still find the way things work now unintuitive.

I can’t see a situation where changing it would break existing code: TextIOWrapper.__del__ calling close will (always?) be redundant with the wrapped stream’s __del__ method doing the same thing.

3 Likes

I’ve been bitten by this behavior as well and had to write a NonClosingTextIOWrapper subclass to avoid it.
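For reference, such a subclass might look roughly like this. This is an illustrative sketch, not the poster's actual code; it reuses the NonClosingTextIOWrapper name from the post and relies on CPython finalizing the wrapper promptly when its last reference is dropped:

```python
import gc
import io

class NonClosingTextIOWrapper(io.TextIOWrapper):
    """Sketch: flush and detach at finalization instead of
    closing the wrapped stream."""
    def __del__(self):
        try:
            self.flush()   # don't lose buffered text
            self.detach()  # release the stream without closing it
        except ValueError:
            pass           # already detached or closed

buf = io.BytesIO()
w = NonClosingTextIOWrapper(buf, encoding="utf-8")
w.write("survives")
del w
gc.collect()
print(buf.closed)      # the wrapped stream stays open
print(buf.getvalue())
```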

2 Likes

The issue with not calling close() is that there may be unwritten metadata (e.g. gzip.GzipFile) and/or unflushed data, i.e. data loss (e.g. BufferedWriter). In the case of gzip, the CRC and size fields are only written on .close(); .flush() alone isn’t sufficient.

In 3.12, a change that moved gzip to buffered writes would have caused data loss when the file wasn’t closed; fortunately it was caught in beta (see: gh-105808). Definitely frustrating to need to change code, but one of these three paths should work and help future-proof:

  1. Use the file-like object as a “With Statement Context Manager”.
  2. Ensure .close() is always called which flushes data before closing.
  3. If the underlying stream needs to be kept open, use .detach().

Do you have any cases where those don’t work?
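The gzip point can be demonstrated with in-memory buffers (a sketch; GzipFile does not close a fileobj passed to it, so the buffers stay readable):

```python
import gzip
import io

# flush() alone: the CRC/size trailer is never written
flushed = io.BytesIO()
g = gzip.GzipFile(fileobj=flushed, mode="wb")
g.write(b"payload")
g.flush()
try:
    gzip.decompress(flushed.getvalue())
    complete = True
except EOFError:
    complete = False
print(complete)  # False: the stream is truncated without the trailer

# close() (here via the context manager) writes the trailer
closed_buf = io.BytesIO()
with gzip.GzipFile(fileobj=closed_buf, mode="wb") as g:
    g.write(b"payload")
print(gzip.decompress(closed_buf.getvalue()))
```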


Extra special case: if an fd passed to or opened by the builtin open() shouldn’t be closed, you can pass closefd=False.
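A quick sketch of closefd=False (the tempfile usage is just for illustration):

```python
import os
import tempfile

fd, path = tempfile.mkstemp()
f = open(fd, "w", closefd=False)
f.write("kept")
f.close()              # flushes and closes the wrapper, not the fd
os.write(fd, b"!")     # the fd is still usable afterwards
os.close(fd)
with open(path) as check:
    content = check.read()
print(content)
os.remove(path)
```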

1 Like

Ok, I think you do raise a potential issue.

A TextIOWrapper has a buffer, which might contain data that still needs to be written. This buffer ought to be flushed explicitly, but to keep things safe and sane, it will be flushed implicitly if needed when closed, including when closed by a context manager (if any) or, as a last resort, the garbage collector.
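For example (a sketch over an in-memory stream; small writes stay in TextIOWrapper's internal buffer until flushed, explicitly or via close):

```python
import io

buf = io.BytesIO()
w = io.TextIOWrapper(buf, encoding="utf-8")
w.write("pending")
before = buf.getvalue()  # nothing has reached the stream yet
w.flush()
after = buf.getvalue()   # now the encoded bytes are there
print(before, after)
```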

I think this might be a valid reason to keep the current behavior. At the least, changing TextIOWrapper.__del__ to a no-op could result in breaking some code that already exists (even if it is improperly written). I think this risk is substantially lower for TextIOWrapper than say GzipFile and other examples, but that’s a gut feeling not supported by any data.

To respond to your specific question, detach is indeed the solution for cases where a stream needs to outlive a TextIOWrapper that’s wrapping it. This is not immediately apparent from the documentation, though. Maybe the right action to take here is to clarify in the class docs that detach is needed to keep the underlying stream alive.

1 Like