I started looking at how dict.get() is defined in typeshed, and I fell down the rabbit hole of how it behaves with a value type of Any. I wanted to bring this discussion here to get broader feedback.
Given d: dict[str, Any], d["key"] is Any, and d.get("key") is Any | None.
But d.get("key", None) is Any, so it’s a bit confusing that being more explicit gets you a less explicit return type.
However, this is consistent with the behavior of d.get("key", "value"), which is also Any.
I can see the case for both d.get("key") and d.get("key", None) to return Any, and I think it would be reasonable for both to return Any | None, but the current state seems like a weird inconsistency.
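To make the inconsistency concrete, here is a small sketch that runs as-is and can also be passed to a type checker. The revealed types in the comments are the ones described above for the current typeshed stubs (the exact wording a checker prints may differ); the names implicit and explicit are only illustrative.

```python
from typing import TYPE_CHECKING, Any

d: dict[str, Any] = {}

# At runtime the two spellings behave identically: both return None
# when the key is missing.
implicit = d.get("key")
explicit = d.get("key", None)
assert implicit is None and explicit is None

if TYPE_CHECKING:
    # This branch is only seen by the type checker, never executed.
    reveal_type(implicit)  # Any | None
    reveal_type(explicit)  # Any
```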
In typeshed, the current definition is:
class dict(MutableMapping[_KT, _VT]):
    @overload
    def get(self, key: _KT, /) -> _VT | None: ...
    @overload
    def get(self, key: _KT, default: _VT, /) -> _VT: ...
    @overload
    def get(self, key: _KT, default: _T, /) -> _VT | _T: ...
I tested out a new definition in typeshed of:
class dict(MutableMapping[_KT, _VT]):
    @overload
    def get(self, key: _KT, default: None = None, /) -> _VT | None: ...
    @overload
    def get(self, key: _KT, default: _VT, /) -> _VT: ...
    @overload
    def get(self, key: _KT, default: _T, /) -> _VT | _T: ...
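For anyone who wants to experiment with these overloads without patching typeshed, the sketch below puts them on a stand-in class. ProposedDict is a hypothetical name invented for this example, and the wrapper around a plain dict exists only so that the snippet runs; it is not how typeshed or CPython define dict.

```python
from __future__ import annotations

from typing import Any, Generic, TypeVar, overload

_KT = TypeVar("_KT")
_VT = TypeVar("_VT")
_T = TypeVar("_T")


class ProposedDict(Generic[_KT, _VT]):
    """Hypothetical stand-in for dict carrying the proposed get() overloads."""

    def __init__(self) -> None:
        self._data: dict[_KT, _VT] = {}

    def __setitem__(self, key: _KT, value: _VT) -> None:
        self._data[key] = value

    @overload
    def get(self, key: _KT, default: None = None, /) -> _VT | None: ...
    @overload
    def get(self, key: _KT, default: _VT, /) -> _VT: ...
    @overload
    def get(self, key: _KT, default: _T, /) -> _VT | _T: ...
    def get(self, key: _KT, default: Any = None, /) -> Any:
        # Runtime behavior is simply delegated to the real dict.get().
        return self._data.get(key, default)


d_any: ProposedDict[str, Any] = ProposedDict()
d_any["present"] = 1
# With these overloads, both calls below should be typed as Any | None.
no_default = d_any.get("missing")
none_default = d_any.get("missing", None)
```

Running a type checker over the last two lines with reveal_type is a quick way to compare how different checkers resolve the overloads.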
This unifies the behavior of d.get("key") and d.get("key", None). It showed a moderate amount of noise in mypy-primer, almost all of which is the result of getting Any | None instead of Any.
- 111 lines total from mypy-primer
  - 25 of these are “note”
  - Of the 86 error lines:
    - 77 are from Any | None instead of Any.
    - 4 are converting an arg-type error to an assignment or return-value error.
    - 2 are errors from custom subclasses of dict where the new overload didn’t match (or didn’t match before and now did).
    - 1 is a new error from returning Any in a situation that returned a concrete type before.
    - 1 is a corner case specific to mypy’s proper types plugin. (I won’t discuss this one further here, but I did in this GitHub comment.)
    - 1 is the result of a mypy bug.
Here are all the scenarios whose results change if this is applied, given this setup:
from typing import Any
d_any: dict[str, Any] = {}
d_str: dict[str, str] = {}
any_value: Any = None
str_value = "value"
int_value = 1
d_any.get("key", None)
This is the big change. It currently returns Any, but would return Any | None with this change.
result: str = d_any.get("key", None)
This is not currently an error. With this change, we get: error: Incompatible types in assignment (expression has type "Any | None", variable has type "str")
result: str = d_str.get("key", None)
This is currently an arg-type error: error: Argument 2 to "get" of "dict" has incompatible type "None"; expected "str"
With this change it becomes an assignment error instead: error: Incompatible types in assignment (expression has type "str | None", variable has type "str")
def test() -> str:
    return d_any.get("key", None)
This is currently a no-any-return error: error: Returning Any from function declared to return "str"
With this change it becomes a return-value error instead: error: Incompatible return value type (got "Any | None", expected "str")
def test() -> str:
    return d_str.get("key", None)
This is currently an arg-type error: error: Argument 2 to "get" of "dict" has incompatible type "None"; expected "str"
With this change it becomes a return-value error instead: error: Incompatible return value type (got "str | None", expected "str")
def test() -> str:
    return d_str.get("key", any_value)
This is not currently an error. With this change, it becomes a no-any-return error: error: Returning Any from function declared to return "str"
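Call sites like the ones above that start failing would need to handle the None case explicitly. One possible way to do that, sketched here with a hypothetical helper named get_str (not part of any library), is to narrow the result with isinstance before returning it. This should type-check the same way under both the current and the proposed definition, since the narrowing works whether get() is typed as returning Any or Any | None.

```python
from typing import Any


def get_str(d: dict[str, Any], key: str) -> str:
    """Hypothetical helper: fetch a value and insist that it is a str."""
    value = d.get(key, None)
    # isinstance() rules out None as well as any non-str value, so the
    # checker sees a plain str on the return line.
    if not isinstance(value, str):
        raise TypeError(f"expected str for {key!r}, got {type(value).__name__}")
    return value


d_any: dict[str, Any] = {"key": "value"}
result: str = get_str(d_any, "key")
```

Whether raising is the right reaction to a missing key depends on the call site; passing a str default such as d_any.get("key", "") is another option when a fallback value makes sense.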
Pyright mostly agrees with this. The cases that change from one error to another in mypy don’t do that in pyright; its errors are more consistent. The major point of divergence is this:
d_str.get("key", any_value)
Mypy says this is Any and pyright says it is str. In an unfortunate twist, this change makes pyright say it is str | None. Mypy avoids this by evaluating all branches of the overload when Any is present, but it seems like pyright handles it differently and takes the first match, giving a result that’s just wrong.
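A minimal reproduction of the divergent call, for anyone who wants to compare checkers directly, could look like the sketch below. The comments only restate the results reported in this post; I have not listed exact checker output, which may vary by version.

```python
from typing import TYPE_CHECKING, Any

d_str: dict[str, str] = {}
any_value: Any = None

# An Any argument is compatible with every overload of get(), so the
# checker has to decide how to combine (or pick among) the matches.
value = d_str.get("key", any_value)

if TYPE_CHECKING:
    # As reported above:
    #   mypy:    Any
    #   pyright: str, or str | None with the proposed overloads
    reveal_type(value)
```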
Putting aside that issue for now, what do people think? For d_any: dict[str, Any], do we like d_any.get("key") -> Any | None but d_any.get("key", None) -> Any? Would d_any.get("key", None) -> Any | None be better or worse? Should it maybe be that d_any.get("key") -> Any instead? I’m not well versed in type theory, so I can’t say what the theory-based answer would be.