How can I return an instance inheriting from a user type and a library type?

bradleyharden · August 21, 2024, 3:57pm

I can’t seem to figure out how to express this in the Python type system. Perhaps it’s fundamentally impossible? I’ve tried searching, but I can’t turn up any results that deal with my precise question.

Suppose I have some library code where I define a BaseType that is meant to be subclassed by users. Next, suppose I want to create an object that is an instance of a user-defined subclass and also inherits from a different library class. Is that possible to express?

Here’s a minimal example.

Library code:

class BaseType:
    base: str

class ExtensionType:
    ext: str

def add_extension[T: BaseType](user_type: type[T]):
    class CombinedType(user_type, ExtensionType):
        pass
    return CombinedType()

User code:

class UserType(BaseType):
    user: str

user_type = UserType()
user_type.base  # ok
user_type.user  # ok

library_extended = add_extension(UserType)
library_extended.base  # ok
library_extended.ext   # ok
library_extended.user  # error

This first approach doesn’t work. The value returned by add_extension is known to subclass BaseType but not UserType.
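For what it’s worth, the gap is only in what the type checker can infer. A minimal sketch of the same code, assuming default values for the three attributes (not in the original; added so the attributes exist at runtime) and an old-style TypeVar so it runs on interpreters older than 3.12:

```python
from typing import Type, TypeVar


class BaseType:
    base: str = "base"  # default value is an assumption for this sketch


class ExtensionType:
    ext: str = "ext"  # default value is an assumption for this sketch


T = TypeVar("T", bound=BaseType)


def add_extension(user_type: Type[T]):
    # The class is built dynamically from whatever user_type is at call time.
    class CombinedType(user_type, ExtensionType):
        pass

    return CombinedType()


class UserType(BaseType):
    user: str = "user"  # default value is an assumption for this sketch


library_extended = add_extension(UserType)

# At runtime the object really is an instance of both classes.
print(isinstance(library_extended, UserType))
print(isinstance(library_extended, ExtensionType))
print(library_extended.base, library_extended.ext, library_extended.user)
```

So the object behaves as intended when the program runs; what is missing is a way to write the return type down.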

However, if the user creates the class manually, it works.

class ManuallyExtended(UserType, ExtensionType):
    pass

manually_extended = ManuallyExtended()
manually_extended.base  # ok
manually_extended.ext   # ok
manually_extended.user  # ok

That solution would normally be fine, but in my situation, users will define lots of types, and I don’t want them to have to write a bunch of redundant declarations.

It seems like there are two possible solutions. One would be a way to tell the type checker that the return value is a subclass of both T and ExtensionType. Stated differently, I want it to return an object that gives True for both isinstance(value, UserType) and isinstance(value, ExtensionType).

def add_extension[T: BaseType](user_type: type[T]) -> T + ExtensionType:  # Made up syntax
    ...

But I don’t know if it’s possible or how I would express that.

Alternatively, you could inherit from a generic type, i.e.

class CombinedType[T: BaseType](T, ExtensionType): ...

But that gives an error.
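As far as I can tell, this is not only a type-checker restriction: a type variable is an object standing in for a type, not a class, so the interpreter also refuses to use it as a base. A small sketch, written with an old-style TypeVar so it does not depend on the 3.12 syntax:

```python
from typing import TypeVar


class BaseType:
    pass


class ExtensionType:
    pass


T = TypeVar("T", bound=BaseType)

try:
    # A TypeVar instance is not a class, so it cannot appear in the bases list.
    class CombinedType(T, ExtensionType):
        pass
except TypeError as exc:
    error = exc
else:
    error = None

print(type(error).__name__)
```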

Is this possible?

bschubert · August 21, 2024, 4:06pm

This is commonly called “intersection types”. Here are a few previous discussions/proposals:

github.com/python/typing

An Intersection type?

opened 08:53PM - 16 Oct 14 UTC

closed 12:33AM - 14 Jan 15 UTC

gvanrossum

Sometimes you want to say that an argument must follow two or more different protocols. For example, you might want to state that an argument must be an Iterable and a Container (for example, it could be a Set, a Mapping or a Sequence). It would be nice if this could be spelled like this:

```
def assertIn(item: T, thing: Intersection[Iterable[T], Container[T]]) -> None:
    if item not in thing:
        # Debug output
        for it in thing:
            print(it)
```

github.com/python/typing

Introduce an Intersection

opened 09:27AM - 06 May 16 UTC

ilevkivskyi

topic: feature

This question has already been discussed in #18 long time ago, but now I stumble on this in a practical question: How to annotate something that subclasses two ABC's. Currently, a workaround is to introduce a "mix" class:

```python
from typing import Iterable, Container

class IterableContainer(Iterable[int], Container[int]):
    ...

def f(x: IterableContainer) -> None: ...

class Test(IterableContainer):
    def __iter__(self): ...
    def __contains__(self, item: int) -> bool: ...

f(Test())
```

but mypy complains about this

```
error: Argument 1 of "__contains__" incompatible with supertype "Container"
```

But then I have found this code snippet in #18

```python
def assertIn(item: T, thing: Intersection[Iterable[T], Container[T]]) -> None:
    if item not in thing:
        # Debug output
        for it in thing:
            print(it)
```

Which is exactly what I want, and it is also much cleaner than introducing an auxiliary "mix" class. Maybe then introducing `Intersection` is a good idea, @JukkaL is it easy to implement it in mypy?

github.com/python/typing

Uses for Intersection in type hints

opened 12:45AM - 24 Nov 23 UTC

closed 11:13AM - 24 Nov 23 UTC

ViktorSky

topic: feature

The idea in proposing `Intersection` is to create type hints that support type transformations, since it is not possible to support this behavior without using inheritance. Furthermore, it becomes much more complicated when trying to use it in overloads.

```python
from typing import TypeVar, Protocol, TYPE_CHECKING, dataclass_transform

ModelT = TypeVar("ModelT", bound=type)

class CreatedModel(Protocol):
    def dump(self) -> str: ...

@dataclass_transform()
def create_model(cls: ModelT) -> ModelT:
    ...  # adding attributes
    return cls

class CustomerModel:
    id: int
    name: str

if TYPE_CHECKING:
    assert issubclass(CustomerModel, CreatedModel)
    cls_type = CustomerModel  # type[<subclass of CustomerModel and CreatedModel>]
```

In this case, I assume that this tool `typing.Intersection` exists. It should work similar to `typing.Union`, but the actual type meets the arguments of this as protocols

```python
from typing import TypeVar, Protocol, dataclass_transform
from typing import Intersection  # not exist

ModelT = TypeVar("ModelT", bound=type)

class CreatedModel(Protocol):
    def dump(self) -> str: ...

@dataclass_transform()
def create_model(cls: ModelT) -> Intersection[ModelT, type[CreatedModel]]:
    ...
    return cls

@create_model
class CustomerModel:
    id: int
    name: str
```

This implementation would help the more efficient development of dynamic classes by linking with typevar so as not to lose a lot of information during its creation.

The lack of intersection types was also mentioned in the recent 2024 Python Typing Survey.

bradleyharden · August 21, 2024, 6:16pm

@bschubert, thanks for the links.

I guess you’re saying what I want is currently impossible. But, are there perhaps other, related constructions that come close?

This is a bit of an XY problem, so maybe I should explain in more detail what my end-goal was. Suppose I have a library like this:

from typing import overload

class Block:
    pass

class RawBlock:
    pass

class Field:
    # RawBlock comes first so that a class inheriting from both resolves to bytes.
    @overload
    def __get__(self, block: RawBlock, block_type) -> bytes: ...

    @overload
    def __get__(self, block: Block, block_type) -> int: ...

    def __get__(self, block: Block | RawBlock, block_type) -> int | bytes: ...

And some user code like this:

class MyBlock(Block):
    field = Field()

my_block = MyBlock()
my_block.field  # int

I would like to let users create the following my_raw_block object without having to manually define the corresponding MyRawBlock class.

class MyRawBlock(MyBlock, RawBlock):
    pass

my_raw_block = MyRawBlock()
my_raw_block.field  # bytes

Basically, I want there to be a correspondence between two different classes while only defining one class.
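To make the correspondence concrete, here is a minimal runtime sketch of the hand-written version. The `payload` attribute, its default value, and the big-endian decoding are assumptions made up for illustration; the thread does not say how fields are actually stored or parsed.

```python
from typing import overload


class Block:
    def __init__(self, payload: bytes = b"\x00\x2a") -> None:
        # Hypothetical storage for the undecoded field bytes.
        self.payload = payload


class RawBlock:
    pass


class Field:
    # The RawBlock overload is listed first so a class that inherits
    # from both Block and RawBlock is typed as returning bytes.
    @overload
    def __get__(self, block: RawBlock, block_type: object = None) -> bytes: ...

    @overload
    def __get__(self, block: Block, block_type: object = None) -> int: ...

    def __get__(self, block, block_type=None):
        data = block.payload
        if isinstance(block, RawBlock):
            return data  # raw view: hand back the bytes untouched
        return int.from_bytes(data, "big")  # parsed view: decode to int


class MyBlock(Block):
    field = Field()


class MyRawBlock(MyBlock, RawBlock):
    pass


print(MyBlock().field)     # int
print(MyRawBlock().field)  # bytes
```

The part that cannot currently be typed is generating `MyRawBlock` from `MyBlock` automatically; the descriptor itself is straightforward.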

Daverball · August 21, 2024, 8:25pm

Yeah, you can’t really do that at all yet. You could at best get a partial solution that erases the dynamic portion of the intersection and replaces it with Any:

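from typing import TYPE_CHECKING, Any

class BaseType:
    base: str

class ExtensionType:
    ext: str

if TYPE_CHECKING:
    # type_check_only is not importable at runtime, so keep it behind TYPE_CHECKING.
    from typing import type_check_only

    # Inheriting from Any makes unknown attributes (e.g. .user) resolve to Any.
    @type_check_only
    class CombinedType(Any, BaseType, ExtensionType):
        pass


def add_extension[T: BaseType](user_type: type[T]) -> "CombinedType":
    # Renamed so the runtime class does not shadow the checker-only CombinedType.
    class _Combined(user_type, ExtensionType):
        pass
    return _Combined()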

This is quite the ugly hack, though, and I wouldn’t recommend doing it this way. Just manually extend the class. Even once we have type intersections, this is still kind of a bad idea outside of very simple cases: intersections are traditionally unordered, whereas a proper subclass has a defined method resolution order, so there are corner cases where you end up with ambiguous types.

bradleyharden · August 21, 2024, 9:48pm

@Daverball, did you see my follow up reply to @bschubert? That’s what I’m really hoping to achieve. Is there any way to implement that?

bradleyharden · August 22, 2024, 2:43am

This is a bit roundabout, but it does work. Is there any way to improve it?

Library code:

from typing import TypeVar, overload

class Parsed:
    pass

class Raw:
    pass

P = TypeVar("P", bound=Parsed | Raw)

# Type parameter defaults ([P = Parsed]) require Python 3.13+.
class Block[P = Parsed]:

    def raw[R: Block[Raw]](self, return_type: type[R]) -> R: ...

class Field:
    @overload
    def __get__(self, block: Block[Raw], block_type=None) -> bytes: ...

    @overload
    def __get__(self, block: Block[Parsed], block_type=None) -> int: ...

    def __get__(self, block: Block[P], block_type=None) -> int | bytes: ...

User code:

class MyBlock[P = Parsed](Block[P]):
    field = Field()

parsed = MyBlock()
parsed.field  # int

raw = parsed.raw(MyBlock[Raw])
raw.field  # bytes

Daverball · August 22, 2024, 5:48am

Yes, generics would be the only way to achieve this currently. You’d probably want to switch to a constrained TypeVar rather than one with a Union as its upper bound, since the Union itself is not actually a valid choice for P, i.e. P: (Parsed, Raw) = Parsed.
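A short sketch of the difference, using the old-style `TypeVar` spelling so it runs without the 3.12+ syntax (the names `P_bound` and `P_constrained` are made up for this comparison, and the default is left out because type parameter defaults need Python 3.13):

```python
from typing import TypeVar, Union


class Parsed:
    pass


class Raw:
    pass


# Bounded: P may be solved to any subtype of the union,
# including the union Parsed | Raw itself.
P_bound = TypeVar("P_bound", bound=Union[Parsed, Raw])

# Constrained: P must be solved to exactly Parsed or exactly Raw.
P_constrained = TypeVar("P_constrained", Parsed, Raw)

print(P_constrained.__constraints__)
print(P_constrained.__bound__)
print(P_bound.__constraints__)
```

With the constrained form, a type checker should never have to consider `Block[Parsed | Raw]`, which is the case the two `__get__` overloads cannot distinguish.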

But I would still consider this a hack and would not recommend it to be used in production code. I would probably just return something like:

class FieldAccess:
    @property
    def parsed(self) -> int: ...

    @property
    def raw(self) -> bytes: ...

from the Field descriptor [1] instead, so your user code looks like this:

class MyBlock(Block):
    field = Field()

block = MyBlock()
block.field.parsed  # int
block.field.raw  # bytes

Unless it is really important that Block has a flat structure, it does not seem worth the trouble.
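A rough runnable sketch of that pattern, including the caching mentioned in the footnote. The `_payload` attribute, the big-endian decoding, and the use of `__set_name__` plus the instance `__dict__` for caching are all assumptions for illustration, not something spelled out in this thread:

```python
from typing import Optional, Union


class FieldAccess:
    def __init__(self, data: bytes) -> None:
        self._data = data

    @property
    def parsed(self) -> int:
        # Hypothetical decoding: treat the bytes as a big-endian integer.
        return int.from_bytes(self._data, "big")

    @property
    def raw(self) -> bytes:
        return self._data


class Block:
    def __init__(self, payload: bytes) -> None:
        self._payload = payload  # hypothetical storage for the field bytes


class Field:
    def __set_name__(self, owner: type, name: str) -> None:
        # Remember the attribute name so the result can be cached under it.
        self._name = name

    def __get__(
        self, block: Optional[Block], block_type: Optional[type] = None
    ) -> Union["Field", FieldAccess]:
        if block is None:
            return self  # accessed on the class, not on an instance
        access = FieldAccess(block._payload)
        # Field defines no __set__, so it is a non-data descriptor and this
        # instance attribute takes precedence on every later lookup.
        block.__dict__[self._name] = access
        return access


class MyBlock(Block):
    field = Field()


block = MyBlock(b"\x01\x02")
print(block.field.parsed)
print(block.field.raw)
print(block.field is block.field)
```

For precise static types on class access versus instance access, `__get__` would likely need overloads; the union return type here just keeps the sketch short.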

[1] And probably reify it on the instance, so it only needs to be constructed once.

bradleyharden · August 22, 2024, 1:21pm

@Daverball, thanks for the thoughts. I haven’t used a constrained TypeVar before, so that’s a useful tip for me.

And yes, I agree much of this is hacky, and I don’t particularly like it. At this point, I’m just trying to see exactly what is possible and what isn’t.

I’ve thought about the FieldAccess pattern you describe, but I’ve been less than enthused about it. In my particular case, the parsed option is what you want 99+% of the time, and I don’t want to ergonomically tax that case for the 1% raw case.

I’ll have to think about this a bit more, to see if I can come up with something that’s close to my ideal API. I might end up accepting that the raw API is untyped. It’s not something that will be used often.