See also: Why the ** syntax uses keys() instead of __iter__
In Python, the methods that operators use internally are all dunders described in the data model documentation, with one exception.
When using the unary * operator, for *args, Python calls __iter__. However, for the unary ** operator, for **kwargs, it calls the method keys, expected to return an iterable of keys to the mapping (which themselves are expected to be strings, in the case of a function call).
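The difference can be observed with a small tracing class. This is only an illustrative sketch: Tracer and collect are hypothetical names made up for the example, and the recorded calls reflect CPython behavior as I understand it.

```python
class Tracer:
    """Records which of its methods get called during unpacking."""

    def __init__(self):
        self.calls = []

    def __iter__(self):
        self.calls.append("__iter__")
        return iter(["a", "b"])

    def keys(self):
        self.calls.append("keys")
        return ["a", "b"]

    def __getitem__(self, key):
        self.calls.append(f"__getitem__({key!r})")
        return key.upper()


def collect(*args, **kwargs):
    # Hypothetical helper: returns whatever it received.
    return args, kwargs


star = Tracer()
star_result = collect(*star)                 # positional unpacking

double_star = Tracer()
double_star_result = collect(**double_star)  # keyword unpacking

print(star.calls)         # only __iter__ is involved
print(double_star.calls)  # keys, then one __getitem__ per key
```

Note that __iter__ is never consulted in the ** case, even though the class defines it.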
As mentioned in the thread, the keys() method is also called by dict.update(d), and the **expression syntax can also be found in {**d}, not restricted to string keys this time.
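A minimal sketch of these three entry points, assuming a hypothetical IntKeyed class that provides only keys() and __getitem__ with integer keys:

```python
class IntKeyed:
    """Hypothetical object exposing only keys() and __getitem__."""

    def keys(self):
        return [1, 2]

    def __getitem__(self, key):
        return key * 10


def accept(**kwargs):
    return kwargs


source = IntKeyed()

# Dict displays accept any hashable keys.
merged = {**source}

# dict.update goes through the same keys()/__getitem__ path.
updated = {}
updated.update(source)

# A function call additionally requires string keys.
try:
    accept(**source)
    call_error = None
except TypeError as exc:
    call_error = exc

print(merged, updated, type(call_error).__name__)
```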
That means keys is treated the same way as reserved method/attribute names such as __iter__, __neg__ or __divmod__ (edited after initial post).
This poses a series of issues. An object with a keys attribute or method that is mistakenly used with ** (or passed to dict.update, although that's probably less likely to happen) raises a different exception than **object() would. Worse, if that method returns an iterable (including a string, since strings are iterable themselves), the interpreter looks up these keys with __getitem__ on the object and only reports an error if that doesn't work. And if the keys method returns an empty iterable, no error is reported at all.
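The cases described above can be reproduced with a few throwaway classes. All class names and the outcome helper are hypothetical, and the exact error messages differ between CPython versions, so the sketch only prints them rather than relying on their wording.

```python
def accept(**kwargs):
    return kwargs


class Plain:
    """No keys at all: the baseline error."""


class KeysAttribute:
    keys = ("x", "y")  # ordinary attribute, not a method


class KeysReturnsString:
    def keys(self):
        return "ab"  # a string is iterable, so "a" and "b" become keys


class KeysReturnsEmpty:
    def keys(self):
        return []  # nothing to look up, so nothing can fail


def outcome(obj):
    """Return the unpacked dict, or a description of the exception raised."""
    try:
        return accept(**obj)
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"


for cls in (Plain, KeysAttribute, KeysReturnsString, KeysReturnsEmpty):
    print(cls.__name__, "->", outcome(cls()))
```

The last case is the most surprising one: the call succeeds with no keyword arguments, although the object is not a mapping in any sense.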
No part of the documentation (to my knowledge) warns about this or treats keys as a reserved name (see below for a more detailed discussion about this).
Some existing code seems to use keys not as a dict method:
(1), (2), (3), which inherit from google.protobuf.Message, which doesn't implement the mapping interface; also, they are properties and not methods;
(4);
(5), (6), which implement some of the dict methods but not __getitem__, which is necessary for ** to work correctly;
(7), which is in the stdlib (!) and doesn't implement __getitem__ either.
I only skimmed over uses of keys as a function, but a class or instance attribute would be just as problematic and is likely even more common.
I see two solutions to this: changing the behavior to use __iter__ (removed after discussion), or only changing the documentation to officially reserve the name.
I imagine maintainers will prefer the low-risk second version, but I think the merits of the first deserve to be discussed. I suppose other solutions could be found; I only worked through this one, but I'm open to alternatives.
What would be the behavior? It would call __iter__, which in the case of mappings is documented as iterating over the keys, and test that they are all strings. It would then test for the presence of __getitem__, and in its absence raise the same exception that it currently raises in the absence of keys, and that it would now raise in the absence of __iter__. It would then query the values using __getitem__, and from there on the behavior is the same.
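The steps above can be modeled in pure Python. This is a rough sketch of the proposal for the function-call case, not actual interpreter code: proposed_unpack is a hypothetical name, the error messages are approximations, and details such as duplicate-keyword detection are left out.

```python
def proposed_unpack(obj):
    """Hypothetical model of the proposed ** behavior in a function call."""
    cls = type(obj)
    not_a_mapping = TypeError(
        f"argument after ** must be a mapping, not {cls.__name__}"
    )

    # Step 1: iterate with __iter__ instead of calling keys().
    if not hasattr(cls, "__iter__"):
        raise not_a_mapping
    keys = list(iter(obj))

    # Step 2: every key must be a string.
    for key in keys:
        if not isinstance(key, str):
            raise TypeError("keywords must be strings")

    # Step 3: __getitem__ must exist, otherwise same error as step 1.
    if not hasattr(cls, "__getitem__"):
        raise not_a_mapping

    # Step 4: fetch the values; any lookup error propagates as-is.
    return {key: obj[key] for key in keys}


print(proposed_unpack({"a": 1, "b": 2}))  # ordinary mapping
print(proposed_unpack(()))                # empty iterable passes through
```

Under this model, an object that merely happens to have a keys attribute gets the same "must be a mapping" error as any other non-mapping, since keys is never looked at.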
What would be the cases where it avoids an error? Those I described earlier on, using keys as an ordinary method or attribute.
What would be the cases where it would result in unintended behavior? Passing an iterable would not result in the exact same behavior as before, as it would try indexing with the resulting values instead of just raising an error. But if even one of the iterated values is not a string, the operation would still fail, presumably before index lookup. And even if all the keys are strings, indexing with them will not work, raising an IndexError (or a TypeError for non-sequence iterables). It would let **() pass, though, as well as other empty iterables. But I find this to be more acceptable behavior than the current one. It would only let unintended behavior pass without an exception in the case of a class with __iter__ and __getitem__ but without mapping behavior, and that doesn't sound like something deserving official support.
We should describe keys as a reserved method name. It should also be flagged by type checkers and linters when defining a method taking any signature other than just (self), and if possible in classes defining keys but not the other mapping methods: __iter__ and __getitem__. But that may be hard to do if and when you don't have a full view of the base types.
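As a rough illustration of what such a check could look for, here is a runtime sketch. It is not how any existing linter or type checker works: keys_warnings and the message strings are hypothetical, and a real tool would work statically, which is where the incomplete view of base types becomes a problem.

```python
import inspect


def keys_warnings(cls):
    """Return hypothetical lint messages about how cls uses the name keys."""
    found = []
    keys = getattr(cls, "keys", None)
    if keys is None:
        return found

    # keys without the rest of the protocol that ** relies on.
    missing = [n for n in ("__iter__", "__getitem__") if not hasattr(cls, n)]
    if missing:
        found.append("keys defined without " + ", ".join(missing))

    if not callable(keys):
        # Covers plain attributes and properties.
        found.append("keys is not callable")
    else:
        try:
            params = list(inspect.signature(keys).parameters)
        except (TypeError, ValueError):
            params = None  # signature unavailable, e.g. for some builtins
        if params is not None and params != ["self"]:
            found.append("keys takes parameters other than self")
    return found


class Message:
    # Hypothetical class using keys as a property, not a mapping method.
    keys = property(lambda self: ["id"])


print(keys_warnings(Message))
```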
I know PSF/PSC doesn't enact type checker changes, but better to say it anyway. And an official statement that "keys" is reserved may help checkers and linters take it into account.