“File-like objects” in Python are perhaps the most extreme use case of duck typing. A function can take any object as a “file” or “stream” argument as long as the object has the “file-like” methods the function requires.
While such a design is convenient and versatile to use in a small project, it makes it difficult to tell in a large project exactly which methods need to be implemented for a file-like object in order to satisfy a particular function.
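To make the duck-typing point concrete, here is a minimal sketch. The names `shout`, `OnlyRead` and `NoRead` are made up for illustration; the function never states which methods it needs, so a caller only finds out when it fails.

```python
import io


def shout(file):
    """Uppercase everything in `file`; implicitly requires a read() method."""
    return file.read().upper()


class OnlyRead:
    """Hypothetical minimal 'file': implements read() and nothing else."""

    def read(self):
        return "hello"


class NoRead:
    """Hypothetical object that lacks read() entirely."""


from_stringio = shout(io.StringIO("abc"))  # a real text stream works
from_duck = shout(OnlyRead())              # so does anything with read()

try:
    shout(NoRead())
except AttributeError as exc:
    # The missing method is only discovered when the call is made.
    failure = exc
```

Nothing in the signature of `shout` tells the caller that `read` is the only requirement, which is the gap the protocols discussed below are meant to close.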
@Jelle, the developer of the typing module, has addressed the issue in this discussion:
And indeed, a look at _typeshed/__init__.pyi does reveal a good variety of protocols covering most of the file-like methods:
But the problem is that firstly, it’s a .pyi file meant for stubs for type checkers, and is therefore not directly importable.
And secondly, even if these small protocols are made available (by copying and pasting the code from the .pyi or by maintaining those small protocols ourselves), it would still be clumsy to use, having to define a dedicated Protocol just to type hint a file argument of a particular function:
class FooFile(SupportsRead, SupportsNoArgReadline):
    pass

def foo(file: FooFile):
    if (first_line := file.readline()).startswith('#!'):
        return first_line + file.read()
And then those who use a type checker need to find the definition of FooFile in order to understand that foo expects a file-like object that provides read and readline methods.
Wouldn’t it be more convenient and clearer to allow type hinting with an ad-hoc intersection of protocols in this case?
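For comparison, here is a self-contained sketch of what the dedicated-protocol approach looks like today if you maintain the small protocols yourself. The two small protocols below are hand-written approximations named after the typeshed ones, not copies of them, and `T_co` is an assumed type variable name. Note that `Protocol` has to be listed again in the bases for the combined class to remain a protocol.

```python
import io
from typing import Optional, Protocol, TypeVar, runtime_checkable

T_co = TypeVar("T_co", covariant=True)


class SupportsRead(Protocol[T_co]):
    # Hand-written approximation of the typeshed protocol of the same name.
    def read(self, length: int = ..., /) -> T_co: ...


class SupportsNoArgReadline(Protocol[T_co]):
    # Hand-written approximation as well.
    def readline(self) -> T_co: ...


@runtime_checkable
class FooFile(SupportsRead[T_co], SupportsNoArgReadline[T_co], Protocol[T_co]):
    """Manual 'intersection': anything with both read() and readline()."""


def foo(file: FooFile[str]) -> Optional[str]:
    if (first_line := file.readline()).startswith("#!"):
        return first_line + file.read()
    return None


script = foo(io.StringIO("#!/bin/sh\necho hi\n"))
not_a_script = foo(io.StringIO("plain text\n"))
stringio_matches = isinstance(io.StringIO(), FooFile)
```

This works both at runtime and under a type checker, but it still needs a named class per combination, which is exactly the clumsiness an ad-hoc intersection would remove.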
You can import many of these protocols directly from hauntsaninja/useful_types on GitHub (“Useful types for Python”). There are also folks working on a draft proposal to add intersections to the type system, and I agree that easy intersection of protocols is a great use case.
My understanding is that _typeshed is “technically” an implementation detail for type checkers, but that code protected by if TYPE_CHECKING is similarly “lifted” into the type checker (since it’s there for use by the type checker, not your script at runtime).
Thanks. The problem is that doing this only helps provide type checking in type checkers, while producing a NameError at runtime because TYPE_CHECKING being false at runtime leaves SupportsRead undefined.
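A small sketch of the difference, assuming the import is guarded by `if TYPE_CHECKING`: a quoted annotation is never evaluated at runtime, so it is harmless, whereas using the name as a base class is evaluated immediately and fails. The names `head` and `ReadableThing` are made up for this example.

```python
import io
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only type checkers follow this import; it is skipped at runtime.
    from _typeshed import SupportsRead


def head(file: "SupportsRead[str]", n: int = 2) -> str:
    # The quoted annotation is just a string at runtime, so this is fine.
    return file.read(n)


first_two = head(io.StringIO("#!/bin/sh"))

try:
    # A base class is evaluated when the class statement runs, and the
    # name SupportsRead was never bound at runtime.
    class ReadableThing(SupportsRead):
        pass
except NameError as exc:
    runtime_error = exc
else:
    runtime_error = None
```

So the guard is enough for plain annotations, but not for building a combined protocol out of the imported names.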
After some experimentation I found a workaround that works both for type checkers and at runtime:
try:
    from _typeshed import SupportsRead, SupportsWrite
except ModuleNotFoundError:
    from unittest.mock import Mock
    SupportsRead = SupportsWrite = Mock()

class SupportsReadWrite(SupportsRead, SupportsWrite):
    pass
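One caveat worth spelling out, as far as I can tell from how class creation works in CPython: when the bases are `Mock` instances, Python picks the type of the first base as the metaclass and calls it, so the `class` statement ends up producing another `Mock` instance rather than a real class. A minimal sketch, with `placeholder` and `Combined` as made-up names:

```python
from unittest.mock import Mock

# Stand-in for the fallback branch above, where both protocol names
# are bound to one and the same Mock instance.
placeholder = Mock()


class Combined(placeholder, placeholder):
    pass


# The "class" is really a Mock instance created by calling the mock's type.
is_mock = isinstance(Combined, Mock)
is_real_class = isinstance(Combined, type)
```

That is fine when the name is only used in annotations, but anything that expects an actual class at runtime (for example `isinstance(obj, Combined)`) will not work with the placeholder.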
object was actually the first thing I tried too, but it would produce:
TypeError: duplicate base class object
And if I did:
SupportsRead = SupportsWrite = object()
I’d get:
TypeError: object() takes no arguments
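Both failures can be reproduced in isolation. The sketch below uses made-up names (`Base`, `inst`, `Both`, `Both2`) and simply collects the error messages; the exact wording may vary between Python versions.

```python
errors = []

# Case 1: both names bound to the class `object` itself.
Base = object
try:
    class Both(Base, Base):
        pass
except TypeError as exc:
    # Listing the same class twice as a base is rejected.
    errors.append(str(exc))

# Case 2: both names bound to a single instance of `object`.
inst = object()
try:
    class Both2(inst, inst):
        pass
except TypeError as exc:
    # Python calls type(inst), i.e. object, with (name, bases, namespace),
    # and object() accepts no arguments.
    errors.append(str(exc))
```

The second case shows why the placeholder has to be an instance of something callable with arbitrary arguments.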
So I figured Mock() was the most convenient object because it can be called with any arguments, although one can also do:
class M:
    def __new__(*args):
        return object.__new__(M)

try:
    from _typeshed import SupportsRead, SupportsWrite
except ModuleNotFoundError:
    SupportsRead = SupportsWrite = M()

class SupportsReadWrite(SupportsRead, SupportsWrite):
    pass
At any rate this feels like an ugly workaround, having to repeat the name of every protocol in use. Better to use @hauntsaninja’s useful_types, even though it’s an additional dependency (or maybe make it part of the stdlib?).