Singledispatch on arguments that are themselves types/classes

smheidrich · December 28, 2022, 10:31pm

functools.singledispatch doesn’t currently allow dispatching on arguments that are themselves types/classes. E.g. if we try to do this:

from functools import singledispatch

@singledispatch
def describe(x) -> str:
  raise TypeError(f"no description for {repr(x)}")

@describe.register(type[int])
def _(x: type[int]) -> str:
  return "the integer type"

print(describe(int))

we only get back an exception:

Traceback (most recent call last):
  File "example.py", line 7, in <module>
    @describe.register(type[int])
  File "/usr/lib/python3.10/functools.py", line 862, in register
    raise TypeError(
TypeError: Invalid first argument to `register()`: type[int]. Use either `@register(some_class)` or plain `@register` on an annotated function.

For completeness’s sake, not much changes if we try to leave out the argument for register and let singledispatch use the annotation alone:

Traceback (most recent call last):
  File "example.py", line 8, in <module>
    def _(x: type[int]) -> str:
  File "/usr/lib/python3.10/functools.py", line 873, in register
    raise TypeError(
TypeError: Invalid annotation for 'x'. type[int] is not a class.

I would like to propose that this should be possible, considering we can use type[X] (or Type[X]) annotations to refer to a subtype of X in other contexts. This would just make the behavior of singledispatch consistent with that.

I hope this consistency argument alone is convincing enough, but if anyone insists, I can post some (admittedly fairly far-fetched) use cases for this.

I also don’t think it would be particularly difficult to implement. A proof-of-concept patch that makes the example above work can be found here, although it would need some work to make it play nicely with Union types and Type instead of type, figure out if we really want to use type[X] for register’s argument as well or come up with a solution that doesn’t require us to put an annotation into “actual” code, optimize it, make it look nicer, and so on. But I’d be willing to look into all that and turn it into a proper PR if there is an agreement that this would be a good feature to have and I didn’t miss any reasons why this can’t actually work.

smheidrich · December 30, 2022, 2:45pm

I’ve gone ahead and created a GitHub issue: singledispatch on arguments that are themselves types/classes · Issue #100623 · python/cpython · GitHub

Because I included one of the use cases I had in mind there, let me also re-state it here in case anyone wants to discuss:

I often use singledispatch to define generic functions that transform (data)class instances to various representations of the contained data, e.g. to_json(obj), to_terminal_output(obj), and so on, the advantage over methods being that the classes themselves can be kept relatively “clean” and not concerned with the details of all these different formats. Naturally the question arises whether we could have similar functions for the inverse case, e.g. a generic function from_json(...) that can transform JSON back into any (data)class instance for which an implementation is provided. But how should we tell the generic function which class we want to deserialize to? If we want to stick with singledispatch, the natural way would be to simply have the class itself as the first argument (from_json(klass, json: str)) which is not currently possible as demonstrated in the example above.
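To make the use case concrete, here is a minimal sketch of both directions. The Point dataclass and the function bodies are hypothetical and only for illustration. Serialization dispatches fine on type(obj), but passing the class itself to from_json falls through to the default, because singledispatch looks at type(Point), which is type:

```python
import json
from dataclasses import asdict, dataclass
from functools import singledispatch


@dataclass
class Point:  # hypothetical example class
    x: int
    y: int


@singledispatch
def to_json(obj) -> str:
    raise TypeError(f"cannot serialize {obj!r}")


@to_json.register
def _(obj: Point) -> str:
    # Dispatch happens on type(obj), which is Point here.
    return json.dumps(asdict(obj))


@singledispatch
def from_json(klass, data: str):
    raise TypeError(f"cannot deserialize to {klass!r}")


@from_json.register
def _(klass: Point, data: str) -> Point:
    # Registered for *instances* of Point, so it is never chosen
    # when the class itself is passed as the first argument.
    return Point(**json.loads(data))


serialized = to_json(Point(1, 2))

try:
    from_json(Point, serialized)
    fell_back = False
except TypeError:
    # type(Point) is type, so the default implementation ran.
    fell_back = True
```

So the serializing half works today, while the deserializing half has no way to register an implementation for "the class Point or a subclass of it".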

storchaka · December 30, 2022, 3:06pm

I think that unless both isinstance(int, type[int]) and issubclass(type(int), type[int]) return True, it will be confusing for users, and the implementation will be too cumbersome.

smheidrich · December 30, 2022, 5:01pm

@storchaka Regarding isinstance, I was going to say “but that doesn’t work on Unions either and they’re supported by singledispatch”, only to find out that it does work since 3.10. Touché. I guess since isinstance(int, type) already holds, this wouldn’t be much of a change, would it? A bit similar to the union case in that it would just extend isinstance to also hold for sensible PEP 483 variants of the 2nd argument.

Regarding issubclass, I see where you’re coming from (singledispatch currently uses the issubclass relationship to determine the “most specific” implementation to call), but that sounds much weirder for types: type(int) is just type, and having issubclass(type, type[int]) be True doesn’t seem sensible, so your idea is probably to have type(int) return type[int] instead, right? But that sounds like it has more potential to break existing code, e.g. if people use type(x) == type to check whether something is a class (without a metaclass) or things like that.

I’m also not sure the implementation would be that much simpler with these changes: The current code uses attributes like __mro__ and __bases__ to find base classes of the type of the supplied argument. But even with isinstance and issubclass changed as described, I don’t think __mro__ and __bases__ would ever be usable for type arguments in this way (e.g. type(some_class).__bases__ returning a tuple of type[base_class]es). So AFAICT we’d still need to distinguish between these two cases, extract the “inner” type from type[inner] to use the inner type’s __bases__/__mro__ in composing the MRO, and so on, just as is done in my draft PR. This is again similar to the Union case, which is also handled specially in singledispatch by explicitly pulling out the types that are part of the Union, for which the MRO composition and so on then work as usual - the only difference is that Union only requires this extra handling at the point of registration, while type[...] also requires a bit of it when dispatching…
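To illustrate the "pull out the inner type and use its MRO" idea in isolation, here is a rough sketch. ClassDispatcher and describe_class are hypothetical names, and this is not the code from the draft PR, just the lookup step done by hand outside of singledispatch:

```python
from typing import Callable


class ClassDispatcher:
    """Hypothetical helper: picks an implementation based on a class argument."""

    def __init__(self, default: Callable):
        self._default = default
        self._registry: dict[type, Callable] = {}

    def register(self, cls: type):
        def decorator(func: Callable) -> Callable:
            self._registry[cls] = func
            return func
        return decorator

    def __call__(self, cls: type, *args, **kwargs):
        # Walk the MRO of the class argument itself, not of type(cls),
        # so the most specific registered base class wins.
        for base in cls.__mro__:
            if base in self._registry:
                return self._registry[base](cls, *args, **kwargs)
        return self._default(cls, *args, **kwargs)


def _default(cls) -> str:
    return "some other type"


describe_class = ClassDispatcher(_default)


@describe_class.register(int)
def _(cls) -> str:
    return "the integer type or a subclass of it"
```

With this, describe_class(bool) finds the int implementation via bool.__mro__, whereas type(bool).__mro__ is just (type, object) and carries no information about int.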

NeilGirdhar · December 31, 2022, 1:24pm

This is just bad code though. You should always be using issubclass or isinstance if possible. Is it really important to make sure people’s bad code continues to work?

smheidrich · December 31, 2022, 4:36pm

@NeilGirdhar Good point. I guess the way these decisions are normally made is to search through existing code to find usages that would break? I can try to have a look in some of the large well-known projects later. But perhaps the entire isinstance and issubclass debate should be split off into a separate thread.

In other news, @sobolevn had some other good objections to the proposal here. The last one especially got me thinking that maybe this is just overkill for the principal use case I had in mind (namely being able to elegantly represent deserialization of dataclasses etc. without making that part of the class itself).

stoneleaf · December 31, 2022, 8:30pm

The if possible there is the key – if it’s not possible, then it’s not bad code.

storchaka · December 31, 2022, 9:37pm

Right. Actually, in 3.11 we fixed a mess caused by isinstance(list[int], type) returning True, while list[int] does not have all the properties of a type. Making type[int] look like (or be) a subclass of type looks like a step in the opposite direction. To do this right we need to take into account a lot of consequences. It would be a large tectonic change in Python, and I am not sure we need it. I agree that it is easier to only change singledispatch(), but I am not sure that it will not conflict with future, more generic changes.

There is yet another thought. Currently singledispatch() dispatches depending only on the type of the first argument. With your proposal it will also depend on its value (if it is a type). And if we allow this for type[int], why not allow this for Literal[1]?
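A small sketch of what "depending only on the type of the first argument" means today when that argument is a class. The describe function here is a simplified variant of the one from the first post: registering an implementation for type catches every class, and the value (int vs. str) plays no part in the lookup:

```python
from functools import singledispatch


@singledispatch
def describe(x) -> str:
    return "some object"


@describe.register(type)
def _(x) -> str:
    # Chosen for every class passed as the argument, since type(int),
    # type(str), ... are all `type` (absent a custom metaclass).
    return f"the class {x.__name__}"


# describe.dispatch() shows which implementation is picked for a given type.
same_impl = describe.dispatch(type(int)) is describe.dispatch(type(str))
```

Any distinction between int and str would have to be made inside the registered function, based on the value.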

steven.daprano · January 2, 2023, 12:55am

Why is it bad code? Aside from the unnecessary use of == instead of is, what’s wrong with it?

If x is a class, type(x) will return the metaclass, which will normally be type. That’s part of the language. Why is it “bad code” to rely on that fact?

So if you want to check whether something is a class, and don’t care about metaclasses, type(x) is type does exactly what you want.

Or have I missed something?

NeilGirdhar · January 2, 2023, 1:14am

It’s better to use isinstance because testing on types breaks polymorphism. Except for some extremely niche cases, you can usually use isinstance.

If you “don’t care”, then it’s better to say isinstance(x, type) since it will work with metaclasses as well. If, for some reason, you want to know if there’s a metaclass, then you’re probably forced to do what you wrote.
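The difference between the two checks can be sketched like this (Meta, Plain and WithMeta are hypothetical names used only for illustration):

```python
class Meta(type):
    """Hypothetical metaclass used only for illustration."""


class Plain:
    pass


class WithMeta(metaclass=Meta):
    pass


# Exact-type check: only true for classes whose metaclass is exactly `type`.
exact_plain = type(Plain) is type
exact_meta = type(WithMeta) is type

# isinstance check: true for any class, with or without a custom metaclass.
inst_plain = isinstance(Plain, type)
inst_meta = isinstance(WithMeta, type)
```

Here only exact_meta is False, so type(x) is type singles out classes without a custom metaclass, while isinstance(x, type) accepts both.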

smheidrich · January 2, 2023, 10:47pm

I ended up only looking at CPython itself and SQLAlchemy, but anyway: in CPython, type(cls) is used to arrive at a metaclass that is then instantiated at least once here, and the aforementioned type(cls) is type check appears at least once in SQLAlchemy here and a bunch of times in the tests of both projects. So it would definitely break some code if type(cls) were ever changed to return type[cls] when applied to types. But I think another, more fundamental problem with such a change would be that, AFAIK, PEP 484 style static typing is meant to be completely optional, and hence typing-related things are kept completely separate from “regular” code wherever possible in the standard library (singledispatch being one exception, and even there it’s optional). Making type() return a parametrized generic would bring PEP 484 types right into the heart of Python’s ordinary, dynamic type system, so from that perspective, similar to what @storchaka wrote about issubclass, it would be a “tectonic change”, and making a singledispatch edge case look a bit more sensible wouldn’t be enough justification for that.

Well, it would only be based on the value from the perspective of the implementation and as far as isinstance is concerned, but from the perspective of static type checking / PEP 484 it would be based on the type. But I guess that is your whole point: these two worlds disagreeing with each other here makes it confusing, and I can see that.