As we all know, I/O is particularly difficult to type, especially safely. When the typing module was introduced, the `IO`, `TextIO`, and `BinaryIO` classes were supposed to represent “file-like” objects. Unfortunately, these classes predate protocols, so they are concrete classes that must be subclassed for other classes to be considered compatible with them. They are also fairly broad, and many I/O classes don’t fully implement the required protocol, sometimes leading to unsafe calls.
In typeshed we have defined a few fairly tight protocols to alleviate these problems, and we aim to use them wherever possible. I also encourage library authors to do the same. Still, these protocols are sometimes a bit clunky to use (especially since we don’t have a convenient way to compose protocols in type annotations) and they have a discoverability problem.
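As a rough sketch of what using the typeshed protocols looks like today: `SupportsRead` and `SupportsWrite` live in the stub-only `_typeshed` package, so the import has to sit behind `TYPE_CHECKING`. The helper names `word_count` and `emit` are made up for illustration.

```python
from __future__ import annotations  # keep annotations unevaluated at runtime

import io
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # _typeshed is stub-only: visible to type checkers, absent at runtime.
    from _typeshed import SupportsRead, SupportsWrite


def word_count(src: SupportsRead[str]) -> int:
    """Hypothetical helper: only needs read(), so it asks for nothing more."""
    return len(src.read().split())


def emit(dst: SupportsWrite[str], message: str) -> None:
    """Hypothetical helper: only needs write()."""
    dst.write(message + "\n")


out = io.StringIO()
emit(out, f"words: {word_count(io.StringIO('spam and eggs'))}")
```

This works, but the guarded import and the need to know that `_typeshed` exists at all are part of the clunkiness and discoverability problem mentioned above.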
Therefore I suggest we add two fairly simple, fairly tight protocols to the typing module, a reader and a writer, that will probably be good enough for 90% of use cases (number entirely made up) where `IO`, `BinaryIO`, and `TextIO` are currently used. Something along the lines of this:

    from collections.abc import Iterable
    from typing import Protocol, runtime_checkable

    @runtime_checkable
    class Reader[AnyStr](Iterable[AnyStr], Protocol):
        def read(self, n: int = ..., /) -> AnyStr: ...
        def readline(self) -> AnyStr: ...

    @runtime_checkable
    class Writer[AnyStr](Protocol):
        def write(self, s: AnyStr, /) -> int: ...

This is not a final proposal, and we’d need to put a bit more research into which methods are most used in practice; it is just meant to give an idea. This splits the tasks of reading and writing (since consumers of file-like objects will usually do one or the other, but not both) and leaves out the more esoteric features like file seeking and physical file management, including closing files. These are still available through `IO` and its subclasses or more specific protocols.
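To illustrate how a consumer might annotate against such protocols, here is a minimal sketch. It restates the two protocols with classic `TypeVar` syntax so it also runs on Python versions before 3.12; the variance choices, `copy_upper`, and `ListWriter` are my assumptions for the example, not part of the proposal.

```python
import io
from collections.abc import Iterable
from typing import Protocol, TypeVar, runtime_checkable

T_co = TypeVar("T_co", covariant=True)
T_contra = TypeVar("T_contra", contravariant=True)


@runtime_checkable
class Reader(Iterable[T_co], Protocol[T_co]):
    def read(self, n: int = ..., /) -> T_co: ...
    def readline(self) -> T_co: ...


@runtime_checkable
class Writer(Protocol[T_contra]):
    def write(self, s: T_contra, /) -> int: ...


def copy_upper(src: Reader[str], dst: Writer[str]) -> int:
    """Read everything from src, write it upper-cased to dst."""
    return dst.write(src.read().upper())


class ListWriter:
    """Hypothetical sink: no IO base class, no seek(), no close()."""

    def __init__(self) -> None:
        self.chunks: list[str] = []

    def write(self, s: str, /) -> int:
        self.chunks.append(s)
        return len(s)


sink = ListWriter()
written = copy_upper(io.StringIO("hello"), sink)
```

`ListWriter` satisfies `Writer[str]` structurally, without subclassing anything, and `copy_upper` cannot accidentally call `seek()` or `close()` on either argument because the protocols do not expose them.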
I think this could reduce a lot of the problematic uses of `IO` etc. and would be a big step forward for the safe, easy-to-use typing of I/O in Python.