Hi all,
This idea is motivated by my earlier question here.
At a high level, the idea is to make something like this “work”: the case statements should match, and type checkers should be able to infer that the match is exhaustive.
Currently, the example below is broken (it works at runtime, but static type checkers cannot understand it): nothing in the program suggests that the two @dataclasses defined inside Shape are supposed to be Shape’s variants.
    from dataclasses import dataclass

    class Shape:
        @dataclass
        class Square:
            center: tuple[float, float]
            length: float

        @dataclass
        class Circle:
            center: tuple[float, float]
            radius: float

    def area(s: Shape):
        match s:
            case Shape.Square(center, length):  # type checkers claim that
                return length * length          # these patterns will never match!
            case Shape.Circle(center, radius):
                return 3.14 * radius * radius
We can get pretty close to the desired behavior with a union type alias: the match statement works, and (most?) type checkers are able to verify that the match is exhaustive.
    from dataclasses import dataclass

    @dataclass
    class Square:
        center: tuple[float, float]
        length: float

    @dataclass
    class Circle:
        center: tuple[float, float]
        radius: float

    type Shape = Square | Circle

    def area(s: Shape):
        match s:
            case Square(center, length):
                return length * length
            case Circle(center, radius):
                return 3.14 * radius * radius
The issue with this approach (which is what I’m proposing to fix) is that Square and Circle are not well encapsulated.
In the following example, Square is redefined for PaymentEndpoint, which makes the first case statement in the area(s) function incorrect: the name Square now refers to the payment class.
    from dataclasses import dataclass

    @dataclass
    class Square:
        center: tuple[float, float]
        length: float

    @dataclass
    class Circle:
        center: tuple[float, float]
        radius: float

    type Shape = Square | Circle

    @dataclass
    class Square:
        token: ...

    @dataclass
    class Stripe:
        token: ...

    type PaymentEndpoint = Square | Stripe

    def area(s: Shape):
        match s:
            case Square(center, length):  # <-- uh oh!
                return length * length
            case Circle(center, radius):
                return 3.14 * radius * radius
This example is a little contrived: the problem goes away if Shape and PaymentEndpoint are placed in different files. And they probably should be! But it’s not hard to imagine(*) cases where it’s convenient to have multiple union types defined in the same file with the same variant names; these use cases are not covered right now.
Another, more abstract downside: the construct type U = X | Y only tells U that it is related to X and Y; it doesn’t tell X (or Y) that they are related to U.
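A quick way to see this one-directional relationship at runtime (my own sketch; it uses typing.Union rather than the type statement so that it also runs on Python versions before 3.12, and the classes are shortened):

```python
from dataclasses import dataclass
from typing import Union, get_args


@dataclass
class Square:
    length: float


@dataclass
class Circle:
    radius: float


Shape = Union[Square, Circle]

# The union knows its members...
members = get_args(Shape)

# ...but the members carry no trace of the union: Square's MRO is just
# (Square, object), and nothing in its namespace points back at Shape.
square_mro = Square.__mro__
square_mentions_shape = any(value is Shape for value in vars(Square).values())
```

So any tool that starts from Square has no way to discover that it is meant to be one of Shape's variants.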
Proposal. The concrete proposal is to introduce a new marker decorator (kind of like dataclass): with the decorator applied, the classes nested inside the decorated class become its variants. Thus, a one- or two-line change to the first example would make it work:
    from dataclasses import dataclass
    from dataclasses import marker  # <- new!

    @marker  # <- new!
    class Shape:
        @dataclass
        class Square:
            center: tuple[float, float]
            length: float

        @dataclass
        class Circle:
            center: tuple[float, float]
            radius: float

    # And now this works:
    def area(s: Shape):
        match s:
            case Shape.Square(center, length):  # type checkers would now accept these patterns
                return length * length
            case Shape.Circle(center, radius):
                return 3.14 * radius * radius
(marker is deliberately a poor name; I just want to focus on the idea/interface rather than the proper naming.)
Curious to hear what others think. Thanks all!