Type checkers complain about io.text_encoding()

I’m sure I am doing something silly, but neither mypy nor pyright recognise that io.text_encoding() exists. Any pointers?

% python -c "import io; io.text_encoding(None)"
% mypy -c "import io; io.text_encoding(None)"  
<string>:1: error: Module has no attribute "text_encoding"  [attr-defined]
Found 1 error in 1 file (checked 1 source file)

This is because it’s not included in typeshed’s stubs for the io module: https://github.com/python/typeshed/blob/main/stdlib/io.pyi

Typeshed is the “single source of truth” for all major type checkers when it comes to the stdlib. Type checkers don’t look at the runtime code in CPython at all; as far as a type checker is concerned, if a function isn’t included in typeshed’s stubs, it doesn’t exist.

In general, we only add stubs for public module members in typeshed, unless somebody explicitly asks for us to add a private module member for some reason. In this case text_encoding isn’t included in io.__all__ at runtime, so our tests probably assumed that it was a private module member that was deliberately not being included in the stubs. (In general, our tests fail if a public module member at runtime is missing from the stubs.) However, it looks like the function is documented even though it’s not present in __all__ – so we should probably add it to typeshed! PR welcome :slight_smile:

The fact that text_encoding is a documented, public module member that is not present in io.__all__ is possibly a separate bug in CPython: it should probably be added to io.__all__.
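To see the mismatch described above on a given interpreter, a small runtime check can compare "exists on the module" with "listed in __all__". This is only a sketch: describe_member is a hypothetical helper name, not part of the stdlib or typeshed, and the result depends on the Python version you run it with.

```python
import io
import sys


def describe_member(module, name):
    """Return (exists, exported): is `name` an attribute of `module`,
    and is it listed in the module's __all__ (if it has one)?"""
    exists = hasattr(module, name)
    exported = name in getattr(module, "__all__", ())
    return exists, exported


exists, exported = describe_member(io, "text_encoding")
version = f"{sys.version_info.major}.{sys.version_info.minor}"
print(f"Python {version}: io.text_encoding exists={exists}, in io.__all__={exported}")
```

A member that exists but is not exported is exactly the case where a stub test keyed on __all__ would treat the function as private.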


Thanks @AlexWaygood! The type signature is fairly trivial, except that I don’t know if there’s a better way to hint that encoding is returned identically if not None:

@overload
def text_encoding(encoding: None, stacklevel: int = 2, /) -> Literal["locale", "utf-8"]: ...
@overload
def text_encoding(encoding: T, stacklevel: int = 2, /) -> T: ...
def text_encoding(encoding: Any, stacklevel: int = 2, /) -> Any: ...

That looks about right! One or two things that need to be tweaked, but nothing major:

  1. Typeshed only has stub files, so there’s no need to include a “runtime implementation”: just the two overloads will suffice.
  2. Nearly all TypeVars in typeshed are private (named with a leading underscore, e.g. _T), so that it’s clear that these are “implementation details of the stub” that can’t be imported at runtime.
  3. We still support Python 3.7 in typeshed, so you’ll need to use the pre-PEP-570 method of specifying positional-only arguments in stub files, as laid out in PEP 484: prepending parameter names with __ (two underscores).

Putting that all together, the typeshed signature should probably look like this:

from typing import TypeVar, overload
from typing_extensions import Literal

_T = TypeVar("_T")

@overload
def text_encoding(__encoding: None, __stacklevel: int = 2) -> Literal["locale", "utf-8"]: ...
@overload
def text_encoding(__encoding: _T, __stacklevel: int = 2) -> _T: ...
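For anyone who wants to try the two overloads before the stubs are updated, here is a minimal runnable sketch. The name text_encoding_sketch and its body are hypothetical stand-ins: the body is a simplification and not the CPython implementation, and the comments describe what a type checker should infer, not verified output.

```python
from typing import Any, Literal, TypeVar, overload

_T = TypeVar("_T")


@overload
def text_encoding_sketch(encoding: None, stacklevel: int = 2, /) -> Literal["locale", "utf-8"]: ...
@overload
def text_encoding_sketch(encoding: _T, stacklevel: int = 2, /) -> _T: ...
def text_encoding_sketch(encoding: Any, stacklevel: int = 2, /) -> Any:
    # Simplified stand-in (assumption): always fall back to "locale" for None
    # and pass any other value through unchanged.
    if encoding is None:
        return "locale"
    return encoding


default = text_encoding_sketch(None)        # checker should pick the first overload
explicit = text_encoding_sketch("latin-1")  # checker should pick the second overload (str)
print(default, explicit)
```

Unlike a stub file, a runtime module needs the third, undecorated definition, which is why the stub version has only the two overloads.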

Great, thanks a lot for the help! Although I will refrain from opening a PR, since you wrote the entire thing :slight_smile:

Just out of curiosity, I suppose that means there is no need and no way to annotate that the second overload returns the very same object it is given?


Yeah – type checkers don’t care whether a function returns “the same object” it was given. They only care whether it returns “the same type” as the object passed in.
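A short sketch of that distinction: both functions below carry the same TypeVar-based signature, yet only one returns the object it was given. The names identity and shallow_copy are made up for this illustration.

```python
import copy
from typing import TypeVar

_T = TypeVar("_T")


def identity(value: _T) -> _T:
    # Returns the very same object it received.
    return value


def shallow_copy(value: _T) -> _T:
    # Returns a new object of the same type; the annotation is identical.
    return copy.copy(value)


data = [1, 2, 3]
print(identity(data) is data)      # same object
print(shallow_copy(data) is data)  # equal contents, different object
```

From the annotations alone, a type checker has no way to tell these two apart, so object identity has to be documented in prose rather than in the signature.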

Sure – I created a PR: “Add `io.text_encoding` on py310+” (python/typeshed pull request #10929) :smile:
