Treatment of `Final` attributes in dataclass-likes

The very first parts of the docs for data classes


from dataclasses import dataclass

@dataclass
class InventoryItem:
    """Class for keeping track of an item in inventory."""
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand

will add, among other things, a __init__() that looks like:

def __init__(self, name: str, unit_price: float, quantity_on_hand: int = 0):
    self.name = name
    self.unit_price = unit_price
    self.quantity_on_hand = quantity_on_hand

The original PEP abstract has:

A class decorator is provided which inspects a class definition for variables with type annotations as defined in PEP 526, “Syntax for Variable Annotations”. In this document, such variables are called fields. Using these fields, the decorator adds generated method definitions to the class to support instance initialization

emphasis mine

and the living spec:

Examples of each approach are shown in the following sections. Each example creates a CustomerModel class with dataclass-like semantics. The implementation of the decorated objects is omitted for brevity, but we assume that they modify classes in the following ways:

  • They synthesize an __init__ method using data fields declared within the class and its parent classes.

Each of these spells out the intent: the generated methods exist to initialize instances. It follows that Final should carry the meaning it has on an instance variable. Dataclasses are the special case here, so Final needs to be read in the proper context, that is, after the special behavior that dataclass_transform applies.
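For reference, here is a minimal sketch of what the standard library does with these annotations at runtime. The class name Data and its attribute names are made up for illustration; only dataclasses.dataclass and dataclasses.fields are real API.

```python
import dataclasses
import typing as t


@dataclasses.dataclass
class Data:
    a: int
    b: t.Final[int]          # Final, no default
    c: t.ClassVar[int] = 0   # ClassVar: excluded from the fields
    d: t.Final[int] = 0      # Final with a default


# Collect the names that the decorator registered as fields.
field_names = [f.name for f in dataclasses.fields(Data)]
print(field_names)

# The Final attribute with a default is still an __init__ parameter.
item = Data(1, 2, d=3)
print(item)
```

On the CPython versions I have checked, field_names comes out as ['a', 'b', 'd']: only the ClassVar is skipped, and both Final attributes are treated as ordinary fields.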


Out of curiosity, I checked the runtime behaviour (under Python 3.11) of other dataclass-like constructs widely used in the Python ecosystem (typing.NamedTuple, attrs, and pydantic). The inclusion of attrs here might be questionable, given that the standard library’s dataclasses originated as a slimmed-down implementation of attrs. attrs and pydantic both employ typing.dataclass_transform.

  • typing.NamedTuple: typing.ClassVar is treated the same as typing.Final - both are invalid annotations, and will crash at runtime (without __future__.annotations):
    import typing as _t
    
    class Data(_t.NamedTuple):
        a: _t.ClassVar[int]  # TypeError: typing.ClassVar[int] is not valid as type argument
        b: _t.Final[int]  # TypeError: typing.Final[int] is not valid as type argument
        c: _t.Final = 0  # TypeError: Plain typing.Final is not valid as type argument
        d: _t.Final[int] = 0  # TypeError: typing.Final[int] is not valid as type argument
    
  • attrs: AFAICT behaves identically to @dataclasses.dataclass at runtime:
    import typing as _t
    import attrs
    
    @attrs.define
    class Data:
        a: int
        b: _t.Final[int]
        c: _t.Final[int] = attrs.field()
        d: _t.ClassVar[int]
        e: int = 0
        f: _t.Final[int] = 0
        g: _t.Final[int] = attrs.field(default=0)
        h: _t.ClassVar[int] = 0
    
    >>> Data(0, 1, 2, 3, 4, 5)
    Data(a=0, b=1, c=2, e=3, f=4, g=5)
    >>> Data(a=0, b=1, c=2, e=3, f=4, g=5)
    Data(a=0, b=1, c=2, e=3, f=4, g=5)
    
  • pydantic:
    • An annotation including typing.Final without a default is treated as a field / instance variable
    • An annotation including typing.Final with a default is treated as a class variable
    import typing as _t  
    import pydantic
      
    class Data(pydantic.BaseModel):
        a: int                                        # Field
        b: _t.Final[int]                              # Field
        c: _t.Final[int] = pydantic.Field()           # Field
        d: _t.ClassVar[int]                           # Not a field
        e: int = 0                                    # Field
        f: _t.Final[int] = 0                          # Not a field
        g: _t.Final[int] = pydantic.Field(default=0)  # Not a field
        h: _t.ClassVar[int] = 0                       # Not a field
    
    >>> Data(a=0, b=1, c=2, d=3, e=4, f=5, g=6)
    Data(a=0, b=1, c=2, e=4)
    

My take on this is that it is nowhere near status quo that a Final on an annotation in a data-like class means precisely a data field or instance variable on a data structure:

  • typing.NamedTuple is consistent in treating both typing.ClassVar and typing.Final as invalid data field annotations, making an explicit check for both of these. This is fully compliant with the spec as-is - there is no concept of using typing.Final as a field annotation to indicate, for example, that a subclass’s field’s type can or cannot be overridden.

  • pydantic seems to be compliant with the spec in another way - by following exactly what PEP 591 states already,

    Type checkers should infer a final attribute that is initialized in a class body as being a class variable.

    and treating the presence of an implicit __init__ as delayed instance variable initialisation, so that a Final with no default means an instance variable.
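To make the PEP 591 rule quoted above concrete, here is a small sketch using an ordinary (non-dataclass) class. Plain, LIMIT and ident are hypothetical names chosen for the example.

```python
from typing import Final


class Plain:
    LIMIT: Final[int] = 10   # initialized in the class body: a class variable
    ident: Final[int]        # no value here: must be assigned in __init__

    def __init__(self, ident: int) -> None:
        self.ident = ident   # the single permitted assignment


p = Plain(7)
print("LIMIT" in vars(Plain), "ident" in vars(Plain))
print(vars(p))
```

LIMIT lives on the class, while ident exists only on instances. The pydantic behaviour described above maps a Final with a default to the first case and a Final without a default to the second.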


EDIT: As for actual opinions (not just behaviour) on the matter:

attrs:

  • 3 users think that typing.Final can be used as a per-dataclass-field equivalent of frozen from the newly-introduced PEP 591, but this is an incomplete reading of the PEP:
    • There was no discussion of instance versus class variable when Final is set with a default value - just that the Final qualifier could be re-used to indicate a frozen field.
    • The belief that Final can be a substitute for a per-field frozen is incorrect, as dataclass(frozen=True) does not prevent re-declaration of fields in subclasses, which is useful for changing default values in fields.
  • 1 user proposes that Final with a default be changed to mean a ClassVar, recognising this as a breaking change. The BDFL (hynek) doesn’t have an opinion either way.
  • 2 or 3 users think that it’s a bug for Final with a default value to indicate a field, based on their reading of PEP 591. The BDFL doesn’t have an opinion either way. See “Incorrect treatment of Final class variables when auto_attribs=True”, Issue #784, python-attrs/attrs on GitHub.
  • Like the standard library’s dataclasses, there is no part in the code of attrs which handles or mentions typing.Final.
  • My takeaway: The current behaviour of attrs (like the standard library’s dataclasses) is an oversight in not specifying the behaviour of typing.Final with @attrs.defined-classes when it was introduced. Most people giving feedback to attrs after reading PEP 591, specifically the part which mentions class variables with a default, think that a Final with a default value being an instance variable / field at runtime is surprising behaviour.
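The point above about frozen not preventing re-declaration in subclasses can be sketched with the standard library alone. Base, Child and retries are hypothetical names.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class Base:
    retries: int = 3


@dataclasses.dataclass(frozen=True)
class Child(Base):
    retries: int = 5  # re-declared in the subclass with a different default


child = Child()

try:
    child.retries = 9  # frozen: assignment on an instance is rejected
    assignment_rejected = False
except dataclasses.FrozenInstanceError:
    assignment_rejected = True

print(Base().retries, child.retries, assignment_rejected)
```

Here frozen blocks assignment on instances but leaves the subclass free to change the default. If retries were annotated as Final instead, I would expect a type checker following PEP 591 to reject the re-declaration in Child, so the two are not interchangeable.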

pydantic:

Two excerpts from the typing spec:

Except where stated otherwise, classes impacted by dataclass_transform , either by inheriting from a class that is decorated with dataclass_transform or by being decorated with a function decorated with dataclass_transform , are assumed to behave like stdlib dataclass .

One of the few places where dataclass() actually inspects the type of a field is to determine if a field is a class variable as defined in PEP 526. It does this by checking if the type of the field is typing.ClassVar . If a field is a ClassVar , it is excluded from consideration as a field and is ignored by the dataclass mechanisms.

The spec says when a field should be treated as a class variable: when it is actually annotated with ClassVar. It also says that other libraries using dataclass_transform (such as pydantic) are signing up to match the standard library behavior.

And we’re only looking at the behavior of dataclasses. Other libraries (such as attrs and pydantic) that use dataclass_transform should be assumed to match the behavior of standard library by the contract of dataclass_transform.

This wording is still under-specified. For the proposal in this thread to make sense, it needs to spell out that annotated assignment statements that were later defined as class variables (à la PEP 591: var: typing.Final = <default value> in classes, which came after PEP 526) do not count as class variables for the purposes of registering dataclass fields.

PEP 681 was written for Pydantic, SQLAlchemy, and Django. Yes, I suppose that approval of Final with a default value meaning a ClassVar in Pydantic (August 2022) after the acceptance of PEP 681 (June 2022) is Pydantic going against the spec. However, I don’t think it’s productive to split hairs over dates of acceptance of features, especially when those features have under-specified wording.

I really don’t see a world in which this stance makes sense.

  • dataclasses says that all annotated variables are fields, with exceptions, then lays out the specific exceptions. Final did not amend these exceptions.
  • It is specifically a feature designed to allow type checkers to be aware of and match runtime behavior, which was not changed by the addition of Final.

The peril of special-casing things in the type system is that people have to think about how they compose, including with future cases. In this particular case, though, I think the behavior is not only exceptionally well specified and clear, but also simply correct. If Final were treated as a ClassVar without dataclasses specifying so, you’d need to write your own __init__ in a dataclass to get reasonable behavior, which removes the benefit of using one.

I’ve submitted a pull request to the typing spec.


The Typing Council has approved the change. Thanks, Carl!
