Protocol-only intersections

ntessore · February 8, 2025, 11:03am

Intersections of protocols are well-defined in the typing spec. Is there any interest in, or perhaps even work towards, making it possible to declare intersections of protocols without subclassing?

I realise this would be a stop-gap until full intersections arrive in the typing system, but it would be a useful one. It wouldn’t need the full machinery of overloading the & operator etc. but could be a specialised construct like ProtocolIntersection or similar. The typing-protocol-intersection package provides exactly that as a mypy plugin, but it would be great to have it in typing-extensions and eventually the spec itself.

Daverball · February 8, 2025, 1:16pm

If we’re going to introduce a stop-gap measure for protocols^[1] it should use whatever special form and/or operator that we think we will end up using for a more general type intersection.

Introducing a specialized ProtocolIntersection seems like a mistake to me, since later on that will add confusion about what the difference between Intersection & and ProtocolIntersection is for protocols.

Besides that, there’s the issue that the semantics are not actually as clear-cut and well-defined as you seem to believe. A subclass relationship has a clear resolution order for overlapping attributes. A type intersection on the other hand is generally unordered, so it’s not the same as defining a new protocol using two existing ones as base classes. Unless you’re suggesting ProtocolIntersection is actually its own thing and ProtocolIntersection[A, B] is equivalent to class _(A, B, Protocol): .... Although that would have an even bigger potential for causing confusion down the line^[2].
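To make the ordering point concrete, here is a minimal sketch of two protocols that declare the same member differently. The names A, B, AB and BA are hypothetical, and a static type checker may well reject these definitions as incompatible; the sketch only shows that the runtime lookup depends on the order of the bases, which an unordered intersection would not have.

```python
from typing import Protocol


class A(Protocol):
    def value(self) -> int: ...


class B(Protocol):
    def value(self) -> str: ...


# Subclassing is ordered: the first base that defines `value` wins the lookup.
# (A static type checker may additionally flag the conflicting declarations.)
class AB(A, B, Protocol): ...


class BA(B, A, Protocol): ...


# The same two bases resolve differently depending on the order they are listed.
print(AB.value is A.value)  # True
print(BA.value is B.value)  # True
```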

I think we’re better off waiting for true intersections. Although progress is slow, people are still actively working on it, and red knot, the type checker Astral^[3] are currently working on, has early support for type intersections. That is very valuable for verifying whether or not they can come up with semantics that hold up once combined with more advanced features like generics.

[1] I’m not necessarily convinced that we should
[2] There is an ongoing alternative proposal for an ordered intersection, which would end up being essentially exactly that, syntactic sugar for an anonymous subclass
[3] the guys behind ruff and uv

JamesParrott · February 8, 2025, 1:37pm

without subclassing

What’s the advantage of avoiding inheritance, other than aligning with certain coders’ preferences? Why is this sufficient motivation to undertake all the necessary work?

Subclassing and CMI are an intrinsic part of the Python language, and defining a new protocol that is each of its parent protocols seems like a great use of multiple inheritance to me, and equivalent enough to intersection.

ntessore · February 8, 2025, 1:52pm

Yes, I was merely imagining the obvious thing here, syntactic sugar for the equivalent subclass.

Similarly, it’s not about the avoidance of subclassing, it’s only about quality of life. If you have a situation where you have a lot of protocols such as

class SupportsFoo(Protocol):
    def foo(self) -> None: ...

class SupportsBar(Protocol):
    def bar(self) -> None: ...

...

the combinatorial explosion makes it awkward to define all possible protocol subclasses explicitly.

JamesParrott · February 8, 2025, 2:18pm

Oh I see, the power of it (and how the combinatorial explosion is avoided) is how it combines with generics. If all the bugs have been sorted out, it looks to me like typing-protocol-intersection’d be a good experimental feature for typing-extensions.

mikeshardmind · February 9, 2025, 3:09am

I’m negative on & being implemented on Protocol. I’m lukewarm, but still positive, on a means of merging protocols programmatically. If we do it, I’d actually go in the opposite direction of what @Daverball has suggested: just state that it’s only valid for disjoint protocols, define what it means for a set of protocols to be disjoint, and leave the question of whether it is later equivalent to intersections up to how we later define those. (The edge cases come up for non-disjoint intersections. CC: @Liz, as she had some ideas on this over in the intersection work.)

Liz · February 9, 2025, 5:16am

I don’t remember if I made a strong public comment explaining that limited option; if I did, it’s buried in the way GitHub collapses long discussions. There’s a proof, either in the GitHub discussion or in the Discord, that a limited intersection between two protocols that do not define any overlapping methods differently is equivalent to every definition of a full intersection that has been considered so far and not ruled out, so that’s a safe starting point. I think not explicitly stating that equivalence in user-facing documentation is fine, but it should be part of the argument for creating this.
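One possible reading of that restriction can be sketched at runtime. This is an assumption for illustration, not the definition from the intersection discussions: it is stricter than "do not define any overlapping methods differently", since it rejects any shared member name, and it only looks at members declared directly on each class. The helpers public_members and are_disjoint are hypothetical.

```python
from typing import Protocol


def public_members(proto: type) -> set[str]:
    """Public names declared directly on a class (methods and annotations)."""
    names = set(vars(proto)) | set(getattr(proto, "__annotations__", {}))
    return {name for name in names if not name.startswith("_")}


def are_disjoint(first: type, second: type) -> bool:
    """Conservative check: the two protocols share no member names at all."""
    return public_members(first).isdisjoint(public_members(second))


class SupportsFoo(Protocol):
    def foo(self) -> None: ...


class SupportsBar(Protocol):
    def bar(self) -> None: ...


class AlsoFoo(Protocol):  # overlaps with SupportsFoo and declares foo differently
    def foo(self) -> int: ...


print(are_disjoint(SupportsFoo, SupportsBar))  # True
print(are_disjoint(SupportsFoo, AlsoFoo))      # False
```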

You have to handle the case where protocols are generic carefully, I would expect that you need to do it in a way that allows distinguishing these two things, as they have different meanings:

type ProtoAB[TAB] = ProtoA[TAB] & ProtoB[TAB]
type ProtoAB[TA, TB] = ProtoA[TA] & ProtoB[TB]

This may make it awkward syntactically if we can’t expect to be able to use type alias statements.

ntessore · February 9, 2025, 8:50am

Elizabeth King:

You have to handle the case where protocols are generic carefully, I would expect that you need to do it in a way that allows distinguishing these two things, as they have different meanings:
type ProtoAB[TAB] = ProtoA[TAB] & ProtoB[TAB]
type ProtoAB[TA, TB] = ProtoA[TA] & ProtoB[TB]
This may make it awkward syntactically if we can’t expect to be able to use type alias statements.

I’m not sure it would be awkward: Assuming ProtocolIntersection[PA, PB, ...] is exactly equivalent to an anonymous class _(PA, PB, ..., Protocol) then spelling it out with TypeVars seems to work as expected:

from typing import Protocol, TypeVar

class ProtoA[T](Protocol):
    a: T

class ProtoB[T](Protocol):
    b: T

# ---

TA = TypeVar("TA")
TB = TypeVar("TB")

class ProtoAB_1(ProtoA[TA], ProtoB[TB], Protocol): ...

def f1(x: ProtoAB_1[TA, TB]) -> tuple[TA, TB]:
    return x.a, x.b

# ---

TAB = TypeVar("TAB")

class ProtoAB_2(ProtoA[TAB], ProtoB[TAB], Protocol): ...

def f2(x: ProtoAB_2[TAB]) -> tuple[TAB, TAB]:
    return x.a, x.b

# ---

class Concrete_1:
    a: int
    b: str

class Concrete_2:
    a: float
    b: float

reveal_type(f1(Concrete_1()))
reveal_type(f2(Concrete_2()))

jamesdow21 · February 9, 2025, 6:00pm

I don’t remember exactly where I saw it on one of the previous threads (or it might have been in an issue on GitHub), but there was a suggestion to optionally allow one concrete class to be included in the intersection with Protocols.

That feature would then allow this to do more than just being a shorthand for the current ability to combine protocols through making a subclass, since a new Protocol can only have other Protocols as base classes.

My own particular use-case for this is to annotate an instance of a class where an optional attribute is present, without having to make a subclass, e.g.

@dataclass
class Foo:
    bar: int | None

class HasBar(Protocol):
    bar: int

def foobar(foo: Foo & HasBar) -> int:
    return foo.bar
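The restriction mentioned above, that a new Protocol can only have other Protocols as base classes, can be observed at runtime. The sketch below reuses Foo and HasBar; FooWithBar and error_message are hypothetical names, and the exact wording of the error may vary between Python versions.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class Foo:
    bar: Optional[int]


class HasBar(Protocol):
    bar: int


# Trying to build the combined type by subclassing fails at class creation
# time, because Foo is a concrete class and not a protocol.
try:
    class FooWithBar(Foo, HasBar, Protocol): ...
except TypeError as exc:
    error_message = str(exc)
else:
    error_message = None

print(error_message)
```

So the subclass spelling cannot express Foo & HasBar today, which is why this use case would need more than syntactic sugar.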

ntessore · February 10, 2025, 12:52am

Introducing new behaviour, no matter how tiny, is precisely what I do not want to do here. According to the typing spec, this is well-defined:

class _(ProtoA, ProtoB, ..., Protocol): ...

There may, in truth, be unresolved corner cases, but I am allowed and encouraged to write this today. Introducing an anonymous inline variant of the exact same thing would be an immediate quality-of-life improvement (YMMV) with zero deviations from current behaviour.^[1]

[1] If there is a worry about the interplay with future behaviour of intersections, call the new construct MergedProtocol, say, to remove all doubt.

Liz · February 11, 2025, 6:31pm

I would spell out the intended equivalences in your proposal, even if you think it’s already spelled out. There have been other cases with generic protocols, type variables with bounds, and intersections (via TypeIs), and with mypy introducing Any where no gradual typing was present because it gave up on finding a solution for the type variables.

ntessore · February 12, 2025, 9:07am

Thanks @Liz. I propose the following:

Add a special form X (e.g. ProtocolIntersection or MergedProtocol) such that
X[First, *Rest]
where First is a type form, and Rest is any number of further type forms, is identical to declaring an anonymous protocol
class _UniqueName(First, *Rest, Protocol): pass
with _UniqueName replaced by a unique name, and using said anonymous protocol in the place where the new special form was written.

The special form should not return the equivalent subclass at runtime.