Draft PEP: Sealed decorator for static typing

As you and the draft PEP point out, you can make some of inheritance work by splitting the base class into a private base class and a public union type. But some things cannot be made to work like that, at least as far as I could figure out. It’s not the constructor I am interested in. The biggest problem comes in the class methods. SomeSumType.foo() will never work. You can’t pass SomeSumType to Pydantic or FastAPI or Typer and expect it to ever work, at least not if you want to override any of the serialization/deserialization defaults.

You have to lose some Python functionality of the base class when publicly exposing a Union. It just isn’t the same thing as a base class. A sealed decorator (or alternative syntax) on the base class adds exhaustiveness checking to the base class without compromising its other capabilities.
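To make the limitation concrete, here is a minimal sketch of the private-base-plus-public-union pattern. The names `_ShapeBase`, `Circle`, `Square`, `Shape`, and `from_dict` are invented for illustration and do not come from the draft PEP. The class method is reachable through each concrete class, but not through the union alias.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class _ShapeBase:
    """Private base class holding the shared behaviour (hypothetical name)."""
    x: float
    y: float

    @classmethod
    def from_dict(cls, data: dict):
        # `cls` is whichever concrete class the method was called on.
        return cls(**data)


@dataclass
class Circle(_ShapeBase):
    radius: float


@dataclass
class Square(_ShapeBase):
    side: float


# Public name: a union alias, not a class.
Shape = Union[Circle, Square]

# Works: the class method is looked up on a real class.
circle = Circle.from_dict({"x": 1.0, "y": 2.0, "radius": 3.0})

# Fails: the union alias has no `from_dict` attribute.
try:
    Shape.from_dict({"x": 1.0, "y": 2.0, "radius": 3.0})
    union_has_classmethod = True
except AttributeError:
    union_has_classmethod = False
```

With a real (sealed) base class, `Shape.from_dict` would at least be resolvable, which is the capability the post says is lost.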

I don’t think you need these to work. There’s no reasonable interpretation for what class to bind in the class method. Write a function not a method.

I don’t use pydantic. Unions work fine with cattrs and msgspec, so it sounds like this is an issue that pydantic can solve without sealed.


That feels like a limitation of those frameworks that they could overcome.


I’m curious if this is a case where an ordered intersection could help (specifically of the base class and the union), if anyone working on that is in this discussion. Does the resulting type have different behaviour from the simple union in a way that might help?

If the list of allowed subclasses (Circle in the example below) is given when declaring the base class (Shape in the example), the implementation is modest, and therefore acceptable to add to core since it is not a maintenance burden.

from dataclasses import dataclass


def sealed(allowed):
    def sealed_wrapper(sealed_cls):
        def new(sub_cls, *args, **kwargs):
            full_name = sub_cls.__module__ + '.' + sub_cls.__qualname__
            assert full_name in allowed, f"'{full_name}' not in allowed class list: {allowed}."
            return super(sealed_cls, sub_cls).__new__(sub_cls)

        sealed_cls.__new__ = new
        return sealed_cls

    return sealed_wrapper


@sealed(['__main__.Circle'])  # List allowed sub-classes for sealed base-class. 
@dataclass
class Shape:
    x: float
    y: float


@dataclass
class Circle(Shape):  # Allowed.
    radius: float


@dataclass
class Square(Shape):  # Rejected, not on list.
    side: float


if __name__ == '__main__':
    c = Circle(1, 2, 3)
    print(c)
    s = Square(4, 5, 6)

Circle works as expected and prints Circle(x=1, y=2, radius=3).

Whereas Square is rejected with message AssertionError: '__main__.Square' not in allowed class list: ['__main__.Circle'].

Is this an acceptable solution?


What happens if I use your example as some_library and add something like this to my code:

from some_library import Node

# opt A
class Declaration(Node): ...

# opt B
@sealed
class Comment(Node): ...

Will I get a run-time error?
A type error in a static type checker?
Will the Node union automatically get extended?

Not very beautiful and the semantics are not… great but if TYPE_CHECKING: Node = Leaf | Branch ?

I think that what you bring up for enum.Enum is more interesting in a way. Enum offers the sealed nature and the shared method (and you can get class methods “for free”).

Current enum.Enum’s big problem is not supporting the inner classes, but a lot of Enum functionality doesn’t make sense in that model anyways! You can’t enumerate over all the variants of your Message class, for example.

Having said that, enum.ADT? Kinda rolls off the tongue, would let us use auto, could build off of some of enum.Enum, while turning off features that don’t make sense like enumeration. And the typecheckers build off of that no problem.

Perhaps the abc module would also make sense for it. The indentation naturally makes the restriction that the classes must be defined in the same module immediately clear.

Neither declaration would pass type checking because Node is declared in some_library and neither of these subclasses of Node is defined in the same file as Node.

I was wondering what sort of cost that would push onto library developers.

On one hand, many libraries do list all of their types in package/types.py or package/models.py, for example requests.

On the other, generated sdk bindings are typically hairy tree structures, e.g. aws.

And then there’s a bunch in between, e.g. httpx/_transports/base.py that’s inherited by individual transports in own files.


Other:

  • I too find the scoping to “the same file” a little odd. I can’t think of a precedent.
  • I feel that the proposed PEP conflates ADTs and class hierarchies:

There’s an established product type: namedtuple('EmployeeRecord', 'name, age, title').
There’s a sum type, which you want to improve: foo: Expression|Statement.

Note that both of these are orthogonal to classes and inheritance.
I feel that a successful PEP would be too.
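A small sketch of that separation, using only the standard library. `EmployeeRecord` follows the example above; `Expression`, `Statement`, `Node`, and `describe` are hypothetical stand-ins. The two member classes share no base class, yet the union still serves as a sum type.

```python
from collections import namedtuple
from dataclasses import dataclass
from typing import Union

# Product type: all of the fields at once.
EmployeeRecord = namedtuple("EmployeeRecord", "name, age, title")


# Two unrelated classes; neither inherits from a common base.
@dataclass
class Expression:
    text: str


@dataclass
class Statement:
    text: str


# Sum type: exactly one of the alternatives.
Node = Union[Expression, Statement]


def describe(foo: Node) -> str:
    if isinstance(foo, Expression):
        return f"expression: {foo.text}"
    return f"statement: {foo.text}"


record = EmployeeRecord("Ada", 36, "Engineer")
```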


If you take the code I posted above, immediately before your post, you can do what I think you want with:

from some_library import Node

@sealed(['__main__.Declaration', '__main__.Comment'])
class MyNode(Node): ...

class Declaration(MyNode): ...

class Comment(MyNode): ...

This restricts which classes can inherit from MyNode, but not from Node. It assumes that Node isn’t sealed in the library; if it is, then only the classes listed with Node’s declaration can inherit from Node.

@hlovatt I don’t think that’s what Dima wants, because their question assumes that Node is sealed.

^ Sorry, I accidentally sent that before finishing the reply, so I deleted it and posted a new one here.

What if we force a sealed class to explicitly list all of its subclasses? Then we could get around the restriction that all subclasses must be in the same module.

# ----- foo.py ------
from typing import TYPE_CHECKING, sealed

if TYPE_CHECKING:
    from bar import C

@sealed("A", "B", "C")
class Root: ...

class A(Root): ...
class B(Root): ...

# ----- bar.py ------
from foo import Root

class C(Root): ...

I think it’s good to have the freedom of moving class definitions around, otherwise it would be a maintenance burden. The tradeoff is it takes a few more characters to list out the subclasses. And I don’t see that as a bad thing anyway, I think it’s actually more readable.

Not all inheritance trees should be sealed. Looking at BaseTransport in httpx, that would be a bad candidate to make sealed, because there is no reason to match on those subtypes or restrict inheritance by other projects.

I think this is a good discussion point. Python has tuple[str, int, str], a perfectly functional anonymous product type. So why did Python add namedtuple? And given the existence of namedtuple, why did Python add @dataclass? Because Python has a rich set of features through its classes and inheritance that could work very nicely with ADTs. Conflating (as you say) product types and class hierarchies through the @dataclass decorator was one of the best features Python ever added. Conflating (as you say) sum types and class hierarchies is unashamedly the intention of @sealed.

I think we have a different understanding of what it means for programming language features to be orthogonal. I always took it to mean that each feature could be used whether or not the other was used. Under that definition, Union is clearly not orthogonal to class methods. It’s ok, tuple and namedtuple have the same limitation; that’s why (among other reasons) we added dataclass.

Because using ordering when you should have used names is a horrible API decision and a constant source of bugs, but people still wanted a cheap data type that worked like a tuple.

Just take your example: what are those two strings? What is determining their order? Why is there an int between them? Without names this type is a landmine.

Orthogonal means solving different problems in a way that can be composed. A union solves “I have one of these things”, a named tuple solves “I have all of these things”.

If class methods cannot be composed with unions, the solution isn’t to reinvent unions, it’s to use a composable alternative to class methods.
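One possible reading of "a composable alternative to class methods" is a module-level constructor function that returns the union. This is only a sketch: `shape_from_dict`, the `_SHAPES_BY_KIND` table, and the `"kind"` tag key are assumptions made for illustration, not something any poster or library in this thread specifies.

```python
from dataclasses import dataclass
from typing import Any, Dict, Union


@dataclass
class Circle:
    radius: float


@dataclass
class Square:
    side: float


Shape = Union[Circle, Square]

# Hypothetical tag-to-class table; the "kind" key is an assumption of this sketch.
_SHAPES_BY_KIND = {"circle": Circle, "square": Square}


def shape_from_dict(data: Dict[str, Any]) -> Shape:
    # Plays the role that a `Shape.from_dict` class method would play on a base class.
    fields = dict(data)  # copy so the caller's dict is not mutated
    kind = fields.pop("kind")
    return _SHAPES_BY_KIND[kind](**fields)


shape = shape_from_dict({"kind": "square", "side": 2.0})
```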

The @sealed I posted above (better version below) does what you suggest. Assume sealed.py, shape.py, and triangle.py are modules on the import path.

shape.py:

from dataclasses import dataclass
from sealed import sealed


@sealed('shape.Circle', 'triangle.Triangle')
@dataclass(frozen=True)
class Shape:
    x: float
    y: float

@dataclass(frozen=True, slots=True)
class Circle(Shape):
    radius: float


@dataclass(frozen=True)
class Square(Shape):
    side: float

triangle.py:

from dataclasses import dataclass

from shape import Shape


@dataclass(frozen=True)
class Triangle(Shape):
    side: float

Then:

from triangle import Triangle
from shape import Circle, Square

c = Circle(1, 2, 3)
print(c)
t = Triangle(4, 5, 6)
print(t)
s = Square(7, 8, 9)

Gives (as expected):

Circle(x=1, y=2, radius=3)
Triangle(x=4, y=5, side=6)
AssertionError: `shape.Square` not in allowed class list: ('shape.Circle', 'triangle.Triangle').

Which is what you wanted, I believe.

I actually used an improved version of sealed.py:

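def sealed(*allowed):
    """Class decorator: only classes whose full name is in `allowed` may be instantiated."""
    def sealed_wrapper(sealed_cls):
        def check(sub_cls):
            full_name = sub_cls.__module__ + '.' + sub_cls.__qualname__
            assert full_name in allowed, f"`{full_name}` not in allowed class list: {allowed}."

        def new_1_arg(sub_cls, *args, **kwargs):
            # object.__new__ rejects extra arguments once __new__ is overridden,
            # so accept the constructor arguments here but do not forward them.
            check(sub_cls)
            return sealed_cls.__sealed_old_new__(sub_cls)

        def new_3_arg(sub_cls, *args, **kwargs):
            # A user-defined __new__ receives the constructor arguments unchanged.
            check(sub_cls)
            return sealed_cls.__sealed_old_new__(sub_cls, *args, **kwargs)

        assert not hasattr(sealed_cls, '__sealed_old_new__'), \
            f"@sealed cannot be applied twice, already applied to or inherited by `{sealed_cls.__name__}`."
        sealed_cls.__sealed_old_new__ = sealed_cls.__new__
        # Pick the wrapper by checking whether the original __new__ is the default one.
        sealed_cls.__new__ = new_1_arg \
            if sealed_cls.__sealed_old_new__ is object.__new__ \
            else new_3_arg
        return sealed_cls

    return sealed_wrapper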

This whole idea is based around supporting algebraic data types, but it only lets you support one algebraic data type for any given class, and worse, you have to own that class! What if you want to reuse that class in another type with a different set of members? You can’t.

Unions, on the other hand, can be used with any type, and a type can participate in any number of unions. They keep the algebra orthogonal to your class hierarchy.
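A short sketch of that point using only standard-library classes, none of which this code owns or could mark as sealed. The alias names `Exact` and `JsonScalar` and the helper `is_exact` are hypothetical.

```python
from decimal import Decimal
from fractions import Fraction
from typing import Union, get_args

# None of these classes are defined here, and none of them can be modified.
Exact = Union[int, Fraction, Decimal]
JsonScalar = Union[None, bool, int, float, str]

# `int` participates in both unions at once.
int_in_both = int in get_args(Exact) and int in get_args(JsonScalar)


def is_exact(value: object) -> bool:
    # The union's members double as the runtime membership test.
    return isinstance(value, get_args(Exact))
```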

There are languages like OCaml which are considered “role models” for ADTs; they do not use sealed class hierarchies at all. Java did not introduce “sealed” for ADT support, but for security and data modelling reasons, and notably as a language it lacks unions.

On another thread about solving some issues with unions raised here, someone said (paraphrasing) “we shouldn’t do this, we should focus on sealed class hierarchies”. I strongly disagree with this. We should focus on adding value to existing language elements, not introducing competing ones from other languages.


I don’t think so (CC: @mikeshardmind I guess?). The intersection of the base with the union of subclasses would have the same shape as the union of subclasses, and intersections only describe valid types, they aren’t a container for values. You would need dependent types as a type system addition to describe ADTs from just Unions.


Ordered intersections should be seen as completely separate from ADTs. Python’s union type is not a sum type, and even if it were, intersections would not help with what is proposed here. I agree with the assessment that dependent types (which have other uses beyond this) would solve this for static use without any other additions, but I think it would require a new class-type or other runtime-enforceable way of expressing it to have value beyond typing, where this is already solved with just unions.

If people have further questions about intersections and what I anticipate they can solve, I would invite them to come read the work in progress or ask in the related workgroup on github rather than end up too far off-topic in other threads.

I think this is a good way to think about programming language features. That’s really what I am trying to achieve with this proposal: adding the sum type capabilities of Union to classes, something that I frequently need. But I don’t think this proposal deprecates Union any more than dataclass deprecates Tuple. The relationship between these ideas can be summed up nicely with this table.

| theory | anonymous type | nominal type |
|---|---|---|
| product type | Tuple[A, B] | @dataclass |
| sum type | Union[A, B] | @sealed |

Obviously, there are alternative designs to @sealed that could fulfill the role of a nominal sum type in Python (e.g. @enum, __sealed__ = A|B, class X() of A|B: etc.), but there is a hole here to be filled that complements Union rather than supplants it.