Class-scoped `type` statement that references outer-scoped `TypeVar`

erictraut · December 1, 2023, 4:47am

Using traditional ways to define a type alias, it has been illegal for a class-scoped type alias definition to reference an outer-scoped TypeVar. This makes sense because these traditional type alias mechanisms implicitly bind type variables to their scope, so the use of an already-bound TypeVar is ambiguous.

from typing import Generic, TypeAlias, TypeVar
T = TypeVar("T")

class A(Generic[T]):
    X1 = list[T]  # pyright & mypy error: T is already bound
    X2: TypeAlias = list[T]  # pyright & mypy error: T is already bound

With the new type statement introduced in PEP 695, type parameters for generic type aliases are explicit, eliminating the ambiguity in this situation.

from typing import TypeAliasType

class A[T]:
    # Pyright doesn't currently generate errors for these cases. Should it?
    type X1 = list[T]
    X2 = TypeAliasType("X2", list[T])

Should we be consistent and enforce this rule for new-style type aliases? Or should we lift this restriction?

I personally don’t see a strong argument in favor of retaining this limitation. I’m therefore in favor of allowing this even though it is arguably inconsistent with older type alias mechanisms.

Jelle · December 1, 2023, 5:39am

I would also lean towards allowing this because it is unambiguous and such class-scoped type aliases are useful.

It’s worth noting that if you specialize the class, you won’t see the specialized value of the TypeVar if you access the alias’s value at runtime:

>>> A[int].X1.__value__
list[T]

However, that’s no different from the behavior in other contexts:

>>> class A[T]:
...     def f(self) -> T: ...
... 
>>> A[int].f.__annotations__
{'return': T}

So I don’t think it’s a reason to disallow class-scoped aliases that reference a type parameter.
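One way to see why the value is not specialized: at runtime, subscripting a generic class does not create a new class. The following is a minimal sketch, assuming CPython's current `typing` behavior, in which `A[int]` is a lightweight alias object that forwards ordinary attribute lookups to `A`. It uses the older `Generic[T]` spelling so that it also runs on Python versions before 3.12; the `marker` attribute is a made-up name for illustration.

```python
from typing import Generic, TypeVar, get_args, get_origin

T = TypeVar("T")


class A(Generic[T]):
    marker = "defined once, on A"  # hypothetical class attribute

    def f(self) -> T: ...


specialized = A[int]

# A[int] is not a class of its own; it only records the origin class
# and the type arguments that were supplied.
print(isinstance(specialized, type))
print(get_origin(specialized) is A)
print(get_args(specialized))

# Non-dunder attribute lookups are forwarded to the origin class, so the
# very same objects come back, with T left unsubstituted.
print(specialized.marker is A.marker)
print(specialized.f is A.f)
print(specialized.f.__annotations__)
```

Under that assumption, anything stored on the class body, including a class-scoped alias, is shared by every specialization, so there is no place where a substituted value could live.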

Daverball · December 1, 2023, 8:55am

I wonder if this decision has any implications for potential future enhancements to runtime access of type params, as was proposed in this topic. Although I can’t immediately think of a concrete issue this would cause.

But with the current status quo, the type syntax and its backport via TypeAliasType definitely seem fine to allow, since there’s no longer any ambiguity as to whether the type var is bound to the type alias itself or to the class.

maltevesper · December 1, 2023, 8:57am

For what it’s worth, coming from C++, it was kind of confusing to come across these error messages.
While I understand that the old syntax had the scoping issues, I think there are more benefits to allowing this new behaviour than downsides:

Since there is no ambiguity about what T is bound to, I would personally value the notion that T can be used anywhere in the scope more highly than artificially limiting the new syntax for consistency's sake.
It is easier to grasp for people coming from other languages like C++.

As a novice to typing, I got a bit frustrated by the errors in @erictraut's first example. I assumed that since T is bound by Generic[T] in the class definition, I should be able to use it to define a non-generic type alias.

Since I am the author of the pyright bug report, I might as well outline my current use case, so that there is at least one real-world example:
I have to collect data from paged tables on a webpage, and would like autocompletion to work when accessing the fields of each row. Without the described behaviour I would have to repeat list[ROWTYPE] everywhere. While that is just a minor inconvenience, I am sure there are more complex real-world examples where the benefits are larger.

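class Table[ROWTYPE]:
    type DataType = list[ROWTYPE]

    def __init__(self):
        # Name lookup inside a method body skips the class scope, so qualify the alias.
        self._data: Table.DataType = []

    def iter_rows(self) -> ROWTYPE:
        ...

    @property
    def data(self) -> DataType:
        return self._data

    def load_table_page(self) -> DataType:
        loaded_slice: Table.DataType = []  # placeholder for the rows parsed from one page
        self._data.extend(loaded_slice)
        return loaded_slice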

TL;DR: I would prefer the greater expressiveness over consistency. Otherwise, it might end up like the jumping flea experiment: once the PEP 695 syntax is the norm and the current syntax has been forgotten, it will be hard to explain why T cannot be used in all places inside the type scope.

Daverball · December 1, 2023, 9:14am

This motivating example actually pulls me slightly in the opposite direction on this. I think I would prefer being able to immediately see that a parameter/return value on a method in a generic class depends on one of its type params, so I would prefer to see DataType[T] over DataType.

Type aliases ideally should convey enough meaning with their name and use that you don’t have to go look at their definition.

That being said I am sure there are some more convincing examples out there, so I’m still +0.5 on this.

maltevesper · December 1, 2023, 9:56am

I had not considered that I could do Datatype[T]. I like the idea. The only downside I can see of that is if the type alias is used outside the class:
My C++ mind says I would have to write Table[T].Datatype[T], while I guess I could get away with Table.Datatype[T].

That said, I thought about examples and kept revolving around C++ examples of containers which define iterators and pointers based on the type to be contained, but these are either not meaningful (Python’s pointer equivalent) or are probably an argument against, following your logic. Iterators would be defined as functions, for which you said that you prefer an explicit Datatype[T].

While writing this I realized one more argument:

class Table[T]:
    type Datatype = list[T]

class IntTable(Table[int]):
    ...

X: IntTable.Datatype = [1,2]

Versus

Y: IntTable.Datatype[int] = [1,2]  # being required to know the type parameter here is not so good

Also, the new syntax makes it easy to explicitly export the underlying type:

class Table[T]:
    type Underlying = T

Note: since I wrote this on a phone I didn’t verify the examples.

maltevesper · December 1, 2023, 6:27pm

I realised I ignored the notes @Jelle left, since I did not care about introspection but about autocompletion/inlay hints, which do work.

Nevertheless, in the current state it renders a significant part of my previous post moot.
Given that the behaviour described by Jelle Zijlstra is rather unintuitive, is it a bug?
I am not sure how to fix it: should the use of Table[int] (rather than the declaration class Table[T]) cause the creation of a new class object in which type aliases are specialized?

Code for inlay hints example

from typing import NamedTuple


class Table[T]:
    type DataType = list[T]

    def __init__(self, initial_data: DataType | None = None):
        self._data: Table.DataType = initial_data or []

    @property
    def data(self) -> DataType:
        return self._data

    def get_row(self, name: str) -> T:
        return self._data[0]


NameAge = NamedTuple("NameAge", [("name", str), ("age", int)])


class NameAgeTable(Table[NameAge]):
    pass


ages = NameAgeTable([NameAge("tom", 42)])

x = ages.get_row("tom")
y = x.age

grievejia · December 4, 2023, 6:55pm

From a consistency perspective I also slightly prefer no restriction on class-scoped type alias definitions.

Pyre historically does not even recognize type aliases defined inside class bodies. We don’t consistently reject those definitions, which needs to be fixed, but if we did, I’d be OK with that behavior as well. In other words, either we don’t support type aliases at the top level of a class, or we fully support them with no restrictions. The in-between state feels less than ideal to me.

And one of the reasons we did not want to get into the problem of class-level type aliases is that the semantics can be tricky here: how do we differentiate between class-level type alias definitions and class-level attribute definitions (or should there be any difference at all)? E.g.

class A[T]:
  X1 = list[T]
  X2: TypeAlias = list[T]
  type X3 = list[T]

# Should type checkers treat X1, X2, and X3 equivalently? If not, how to justify/explain the behavior?
# Here are some examples where we need to determine what should happen. `X` here is just a placeholder for `X1`, `X2` or `X3`:

y: A.X = ...  # is this an error? what kinds of list can be assigned to y if it's not?
y = A.X  # is this an error?
reveal_type(A.X)  # what should this be? is it legal?

Here’s another example w.r.t. inheritance:

# Same definition of class `A` as before

# Would type checker allow these "aliases" to be "overridden" or not?
class B[T](A[T]):
  X1 = tuple[T]
  X2: TypeAlias = tuple[T]
  type X3 = tuple[T]

# Same as before, except here we only "override" the element type
class C(A[int]):
  X1 = list[str]
  X2: TypeAlias = list[str]
  type X3 = list[str]

erictraut · December 4, 2023, 7:23pm

Hey Jia, good to see you in this forum!

After reading your response, I realized that my original post wasn’t as clear as it could have been. I think you’re proposing a third option that I hadn’t considered. Let me enumerate the three options to make sure we’re on the same page.

I’m going to use the term “traditional alias mechanisms” to refer to the two mechanisms that existed prior to PEP 695. Traditional mechanisms are not allowed to use type parameters bound to outer scopes in their definitions today — by the rules of PEP 484 and PEP 613.

Here are the three options:

1. The limitation is retained for traditional alias mechanisms, and the same limitation is imposed on new type statements.
2. The limitation is retained for traditional alias mechanisms, but we remove the limitation for type statements.
3. The limitation is removed for both traditional alias mechanisms and new type statements.

If I understand you correctly, you’re in favor of option 3. I hadn’t even considered that option because I think the limitation placed on traditional alias mechanisms is there for a good reason. I don’t think they should be allowed to use type parameters that are already bound to an outer scope because that leads to ambiguities and bugs.

I’m suggesting that we should choose between option 1 or 2. I slightly prefer option 2 because there’s no good reason to retain the limitation for type statements other than consistency with the older mechanisms.

Let me know if I misinterpreted your response.

I’m curious if you have a preference between option 1 and 2.

Daverball · December 4, 2023, 9:21pm

While I agree that the third option should probably not be chosen, you could still make a decent case for it if you defined a consistent rule that all type checkers had to follow. [1]

E.g. we could specify that a type var used within the definition of an old-style alias is always bound to said alias, i.e. foo = T1 | T2 | ... is always equivalent to type foo[T1, T2, ...] = T1 | T2 | ..., even if some of the type vars are also bound to an enclosing generic class. That would resolve the ambiguity. It would mean you couldn’t pick which one you meant with the old style, but at least you could still use them. It would also be consistent with how old-style type aliases work everywhere else.
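The implicit binding this rule would build on can already be observed at runtime for module-level aliases. The sketch below assumes Python 3.9+ and uses illustrative names (`Foo`, `T1`, `T2`) with `tuple` instead of a union; it shows the runtime analogue only and says nothing about how a type checker would treat the class-scoped case.

```python
from typing import TypeVar

T1 = TypeVar("T1")
T2 = TypeVar("T2")

# Old-style alias: the free type variables implicitly become the alias's
# own parameters, in order of first appearance.
Foo = tuple[T1, T2]

print(Foo.__parameters__)

# Subscripting the alias substitutes those parameters positionally.
print(Foo[int, str])
```

Read this way, `Foo = tuple[T1, T2]` behaves like an alias declared with explicit parameters `[T1, T2]`, which is the equivalence the proposed rule would make mandatory even inside a generic class.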

[1] In addition, type checkers should probably emit a warning so people realize they’re using something that might not give them what they wanted.

grievejia · December 5, 2023, 10:45pm

Thanks for the clarification, Eric!

Earlier I was under the (wrong) impression that the PEP 695 type statement was meant to be merely a new syntactic alternative to the traditional aliasing mechanisms. I actually had not considered the option of having the new type statement semantics diverge from the traditional aliases. Now that I think about it, as long as the type statement imposes the same or fewer restrictions compared to the traditional aliasing mechanisms, I don’t have strong opinions either way (more restrictions would be a problem, though, as that would create a hassle for migration).

But say we want to lift this restriction for the type statement. I believe there still needs to be an answer to my earlier questions for the behavior to be fully specified: is it allowed to have these aliases referenced from the containing class without specifying the type parameter (e.g. is A.X3 still a valid type annotation, and if so, should we just treat it as list[Any]?), and how do these aliases interact with inheritance (e.g. are subclasses supposed to “inherit” aliases from their parent class, and if so, to what extent are they allowed to re-bind those aliases)?

And apologies for my absence from the forum! I did not realize the typing-sig activities were moved to this platform until Steven reminded me yesterday. I will definitely try to get more involved in the discussions going forward.

stroxler · December 5, 2023, 11:56pm

I don’t see a problem with allowing non-generic type aliases in a class body to refer to a type parameter bound in the class; that seems more consistent to me than banning it.

The only question I think should be made explicit is whether the type alias may be used as an instance attribute. @Jelle suggested that we should be able to use

# class access
x: A[int].TypeAlias = ...

but I think we also want to explicitly state whether it’s legal to write something like:

# instance access
a: A[int] = A(15)
x: a.TypeAlias

which I think some users might expect to behave similarly if we don’t state otherwise.

I would much prefer not to support instance attributes as type aliases because it can introduce dependencies between type discovery and type inference, which in some type checkers (including Pyre) are separate stages. Supporting code like the instance-access example in global scope would require major changes to Pyre.

I do agree with @Daverball that the use of non-generic aliases making use of class-level type parameters looks a little funny / is less explicit. A generic type alias might actually be nicer, but that can always be the business of a style guide and linter.

It doesn’t seem like the type system should have non-obvious scoping restrictions just because some of us think the code looks awkward / would prefer explicit type parameters.

erictraut · December 6, 2023, 12:27am

stroxler wrote: “The only question I think should be made explicit is whether the type alias may be used as an instance attribute. … I think we also want to explicitly state whether it’s legal to write something like… x: a.TypeAlias. … I would much prefer not to support instance attributes as type aliases…”

I don’t think that’s an issue because a.TypeAlias isn’t a valid type annotation. Variables aren’t allowed in type annotation expressions, so this should already be flagged as an error by a type checker. Likewise, call expressions aren’t allowed in type annotations, so x: A[int]().TypeAlias should result in an error.

stroxler · December 6, 2023, 12:32am

This makes sense.

I guess it wasn’t immediately obvious to me that accessing a type alias through a variable is necessarily disallowed, but agreed that it should be and that this rules out the case I was worried about.
