Type annotations for an open wrapper

I’d like to make the type annotations for this function more specific:

import os
from typing import Any, Optional

def open_with_auto_compression(name: Any,
                               mode: str = "rt",
                               *,
                               ext: Optional[str] = None,
                               **kwargs: Any) -> Any:
    """Open a file, automatically compressing or decompressing it when
       compression is indicated by the file extension.  If 'ext' is
       supplied, it overrides the extension on 'name' (this is useful
       when 'name' is actually a file-like object).  All other arguments
       are passed down to open().
    """
    if ext is None:
        assert isinstance(name, str)
        ext = os.path.splitext(name)[1]
    if ext.startswith("."):
        ext = ext[1:]

    if ext == "gz":
        import gzip
        return gzip.open(name, mode, **kwargs)
    elif ext == "bz2":
        import bz2
        return bz2.open(name, mode, **kwargs)
    elif ext == "xz":
        import lzma
        return lzma.open(name, mode, format=lzma.FORMAT_XZ, **kwargs)
    elif ext == "lzma":
        import lzma
        return lzma.open(name, mode, format=lzma.FORMAT_ALONE, **kwargs)
    else:
        return open(name, mode, **kwargs)

The problem is twofold. First, what the typeshed says for the built-in open is a gigantic mess that I don’t want to copy into my code. Second, even small subsets of it don’t work: for instance, if I change the return type to IO[Any], then I get

test.py:25: error: Incompatible return value type (got "Union[GzipFile, TextIO]", expected "IO[Any]")

What’s the best way to write accurate type annotations for this sort of function?

Hi Zack,

“What’s the best way to write accurate type annotations for this sort of
function?”

Some might say the best way is not to. Or that Any is the best
that you can do.

“what the typeshed says for the built-in open is a gigantic mess that
I don’t want to copy into my code”

I take it you are referring to this?

I’m sure that it’s not a gigantic mess for fun, or because of
incompetence. It might be that the best way to write an accurate type
annotation for the return result is that gigantic mess.

So effectively you are asking for a way to tell the type checker, this
function can return anything that open can return, plus these things.
Is that right? Essentially you want to extend the existing return type
from open by unioning that with GzipFile etc.
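
A rough sketch of that "extend by unioning" idea follows. The alias name AnyOpenFile and the cut-down helper open_maybe_gzip are made up for illustration, and whether a particular mypy and typeshed version accepts every return statement is an assumption to check against your own setup.

```python
import bz2
import gzip
import lzma
import os
import tempfile
from typing import IO, Any, Union

# Hypothetical alias: whatever open() hands back (some IO object), widened
# with the classes that the compression modules can return in binary mode.
AnyOpenFile = Union[IO[Any], gzip.GzipFile, bz2.BZ2File, lzma.LZMAFile]


def open_maybe_gzip(path: str, mode: str = "rt") -> AnyOpenFile:
    """Cut-down stand-in for the wrapper: only handles the .gz case."""
    if path.endswith(".gz"):
        return gzip.open(path, mode)
    return open(path, mode)


def roundtrip(text: str) -> str:
    """Write text through the wrapper, then read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.txt.gz")
        with open_maybe_gzip(path, "wt") as f:
            f.write(text)
        with open_maybe_gzip(path, "rt") as f:
            return f.read()
```

The cost of a union like this is that callers may have to narrow the result (or cast it) before a type checker lets them call read() or write() with a specific str or bytes type.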

I’m not an expert on Python’s type hinting mini-language. You will
probably get better help from a mypy or typeshed forum. Or look for a
case where an overridden method does something like

if condition:
    return super().method(*args)
else:
    return "something else"

where the something else is of a different type to the overridden
version.

If you do get an answer elsewhere, please write back here to let us
know. I’m curious now :-)

By the way, you have this in your code:

if ext is None:
    assert isinstance(name, str)

but that’s an unsafe abuse of assert:

  • assertions can be turned off by the caller (for example, by running
    Python with the -O option), which will disable the check altogether;

  • even if the assertion is tried, if it fails, it gives the wrong
    exception (an AssertionError instead of a TypeError).
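
Both points can be seen with a small experiment. This is only a sketch: it starts a child interpreter twice, once normally and once with -O, and the SNIPPET string and the run helper are made-up names for the demonstration.

```python
import os
import subprocess
import sys

# A deliberately failing type check, followed by a line that should only
# be reached when the assert has been stripped out.
SNIPPET = "assert isinstance(42, str); print('check skipped')"

# Drop PYTHONOPTIMIZE so the "normal" run really has assertions enabled.
ENV = {k: v for k, v in os.environ.items() if k != "PYTHONOPTIMIZE"}


def run(*flags: str) -> "subprocess.CompletedProcess[str]":
    """Run SNIPPET in a child interpreter with the given flags."""
    return subprocess.run(
        [sys.executable, *flags, "-c", SNIPPET],
        capture_output=True,
        text=True,
        env=ENV,
    )


normal = run()         # assert is active: fails with AssertionError
optimized = run("-O")  # assert is removed: execution carries on
```

In the normal run the child exits with an error and an AssertionError traceback; with -O the bad value sails through and the print is reached.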

You are attempting to check the type of a public parameter set by the
caller. You should never use assert for that.
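
A minimal sketch of an explicit check instead, assuming the extension logic is pulled out into a helper (resolve_ext is a made-up name, not something from the original code):

```python
import os
from typing import Any, Optional


def resolve_ext(name: Any, ext: Optional[str] = None) -> str:
    """Return the extension (without the leading dot) used to pick a codec."""
    if ext is None:
        # An explicit check survives -O and raises the conventional
        # exception for a wrongly typed argument.
        if not isinstance(name, str):
            raise TypeError(
                "name must be a str when ext is not given, got "
                + type(name).__name__
            )
        ext = os.path.splitext(name)[1]
    if ext.startswith("."):
        ext = ext[1:]
    return ext
```

The isinstance test followed by raise should still let a type checker narrow name to str for the splitext call, which is usually why the assert was there in the first place.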

You might find this useful:

https://import-that.dreamwidth.org/676.html