Runtime access to type parameters

Gobot1234 · October 30, 2023, 1:36pm

I’d like to propose a new API for accessing type parameter values at runtime for classes using PEP 695 generic syntax. I’m not sure if it’s a bit too early to be proposing this, because it requires PEP 718 (subscripting functions at runtime) to be accepted, and optionally PEP 696 (type parameter defaults), but I think it’s still an interesting idea either way.

The full list of things I’d like to do:

Adding an __args__ attribute that returns the specialised parameters to the class
Making any type parameters available directly on the class as overwritable instance variables
Substitution of default type parameters at runtime (assuming PEP 696 is accepted)
Automatically adding __orig_class__ to a class’s slots if it’s subscriptable (even if it’s defined in C)

The following change to TypeVarLikes:

Adding __value__ as a way to compute the specialised value of a type parameter after subscription (if it has non-default parameters)

The following change to GenericAliases:

Hooking __getattr__ to handle accessing __args__ by name on the instance.

Motivation

Currently, getting the specialised types for generic types is unintuitive and unreliable:

class Foo[T]: ...

Foo[int]()  # How do I get `int` inside Foo?

>>> Foo[int]().__orig_class__.__args__
(int,)

This however doesn’t work inside __new__/__init__ or any methods called from them as GenericAlias.__call__(*args, **kwargs) only sets __orig_class__ after self.__origin__(*args, **kwargs) returns.

class Bar[T]:
    def __init__(self):
        self.__orig_class__

>>> Bar[int]()  # AttributeError: 'Bar' object has no attribute '__orig_class__'
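
A minimal sketch of that timing issue follows. It uses typing.Generic rather than PEP 695 syntax so that it also runs on older Python versions; the runtime behaviour should be the same. The `Probe` class and its `seen_in_init` attribute are hypothetical names used only for illustration.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class Probe(Generic[T]):
    def __init__(self):
        # The alias assigns __orig_class__ only after __init__ has returned,
        # so the attribute is not visible here yet.
        self.seen_in_init = hasattr(self, "__orig_class__")


probe = Probe[int]()
print(probe.seen_in_init)             # False
print(probe.__orig_class__.__args__)  # (<class 'int'>,)
```

Any method called from `__init__` sees the same missing attribute, which is why the lookup only works once construction has finished.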

Now what about if I subclass a generic?

class Bar(Foo[str]): ...  # how do I now get `str`?

>>> types.get_original_bases(Bar)[0].__args__
(str,)
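
For illustration, the subclass lookup can be wrapped in a small helper. This is only a sketch of today's workaround, not part of the proposal: `base_args` is a hypothetical name, and the code reads `__orig_bases__` directly (the attribute that `types.get_original_bases` exposes on 3.12+) so it also runs on older versions.

```python
from typing import Generic, TypeVar, get_args, get_origin

T = TypeVar("T")


class Foo(Generic[T]): ...


class Bar(Foo[str]): ...


def base_args(cls, generic):
    """Hypothetical helper: type arguments `cls` passed to its `generic` base."""
    for base in getattr(cls, "__orig_bases__", ()):
        # Each subscripted base is an alias whose origin is the generic class.
        if get_origin(base) is generic:
            return get_args(base)
    return ()


print(base_args(Bar, Foo))  # (<class 'str'>,)
print(base_args(Foo, Foo))  # ()
```

Note that this only inspects direct bases; a deeper hierarchy would need to walk the MRO and substitute type variables, which is part of what makes the current interface awkward.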

And what about a type parameter inside a generic function?

def foo[T](): ...

>>> foo[int]()

This isn’t even possible without using implementation details/frame hacks.

With the new roots of runtime type checking beginning to sprout, I think it’s unacceptable to have this kind of hard-to-use interface which is full of edge cases.

e.g.

class Slotted[T]:
    __slots__ = ()


Slotted[int]().__orig_class__  # AttributeError: 'Slotted' object has no attribute '__orig_class__'
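
The slots edge case, and the manual workaround that the proposal would automate, can be sketched as follows. This assumes typing.Generic behaves like the PEP 695 syntax here; `SlottedWithRoom` is a hypothetical name.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class Slotted(Generic[T]):
    __slots__ = ()  # no __dict__, so there is nowhere to store __orig_class__


class SlottedWithRoom(Generic[T]):
    __slots__ = ("__orig_class__",)  # manual workaround: reserve a slot for it


plain = Slotted[int]()
roomy = SlottedWithRoom[int]()

# The failed assignment is silently swallowed for the slotted class.
print(hasattr(plain, "__orig_class__"))  # False
print(roomy.__orig_class__.__args__)     # (<class 'int'>,)
```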

I propose a new interface design which solves all of the above problems by being easy to use and much more reliable:

>>> Foo[int]().__args__
(int,)
>>> Foo[int]().T.__value__
int

>>> Bar.__args__
(str,)
>>> Bar.T.__value__
str

def foo[T]():
    return T.__value__

>>> foo[bool]()
bool
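
The `foo[bool]()` call above depends on PEP 718 and does not work today. As a rough approximation in current Python, and only as a sketch, a wrapper can forward the subscripted type as an explicit first argument. The `subscriptable` decorator and `make_default` function are hypothetical and are not part of the proposal or of any library.

```python
import functools


class subscriptable:
    """Hypothetical decorator: allows func[type_arg](...) on a plain function."""

    def __init__(self, func):
        self.func = func
        functools.update_wrapper(self, func)

    def __getitem__(self, type_arg):
        # Bind the type argument as the first positional parameter.
        return functools.partial(self.func, type_arg)


@subscriptable
def make_default(tp):
    # `tp` plays the role that T.__value__ would play under the proposal.
    return tp()


print(make_default[bool]())  # False
print(make_default[list]())  # []
```

Unlike the proposed `T.__value__`, this forces the function to take the type as a real parameter, which is the duplication the proposal is trying to avoid.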

Anecdotally I’ve seen many requests for such a feature and I’ve needed it multiple times when writing typed code to get type parameters without duplicating values throughout code.

Prior discussion:

I can send a more complete draft of this once I have a better idea of how to implement this in CPython.

Thanks!

davidism · October 31, 2023, 5:59pm

5 posts were split to a new topic: Dealing with forward refs at runtime

Gobot1234 · October 31, 2023, 5:30pm

Sorry to interrupt but please could you discuss this in another more relevant discourse thread?

davidism · October 31, 2023, 6:00pm

I’ve moved that discussion to a new topic. You can always flag a post as well to let the mods know something needs attention.

Daverball · November 1, 2023, 9:32am

I like the idea of turning TypeVarLikes into essentially a descriptor on the generic and making the bound value easily accessible, although I think the change would have to be on the original generic base class and not just the GenericAlias.

I think it would be very surprising to get a runtime error in a class if you didn’t subscript it, because the implementation was internally trying to access the value of the type vars. Maybe that’s why you are referring to PEP 696? But PEP 696 would not solve that issue, because you still will not get a GenericAlias unless you actually subscript the class.

Have you given any thought as to how you would deal with forward references? Is it the individual class implementation’s responsibility to deal with them? How about special form types, such as Union and Literal? I think we’re currently quite limited when it comes to actually making use of the bound type of a TypeVarLike at runtime, because we can’t just go self.T.__value__() and expect that to work. The only really simple case we can deal with is when there is no forward reference and the value is just a simple subclass of type. Even then you would probably want to set something like bound=Callable[[], object] on the TypeVar if you wanted to be able to create an instance of that type, but static analysis would still allow you to pass things like Any/Union etc., which kind of defeats the purpose of static analysis.
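
To make the concern concrete, here is a small sketch of what the type arguments look like at runtime today. `Box` and `is_plain_class` are hypothetical names; the check is just one possible way a class could decide whether an argument is safe to instantiate.

```python
from typing import ForwardRef, Generic, TypeVar, Union, get_args

T = TypeVar("T")


class Box(Generic[T]): ...


def is_plain_class(arg):
    """Hypothetical check: is the type argument an ordinary class?"""
    return isinstance(arg, type)


(forward,) = get_args(Box["Later"])         # a string becomes a ForwardRef
(union,) = get_args(Box[Union[int, str]])   # a special form, not a class
(concrete,) = get_args(Box[int])

print(isinstance(forward, ForwardRef))  # True
print(is_plain_class(forward))          # False
print(is_plain_class(union))            # False
print(is_plain_class(concrete))         # True
```

Only the last case could be called directly, and a static type checker accepts all three subscriptions.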

So I think it would be very helpful at this point if you could set a flag on a TypeVarLike so type checkers will emit an error if you try to create an instance of a class when it’s bound to a special form type, rather than a real type. So you can catch improper use of these runtime-evaluating generics in static analysis, rather than at runtime.

Gobot1234 · November 1, 2023, 11:00am

I agree that it’s annoying that there are runtime errors but I can’t really do anything if you aren’t using the mechanism. In an ideal world there would be a way to opt-in to this as well so your type checker can tell you that you’re going to encounter type errors because you aren’t using the type at runtime.

On this topic maybe this is something that could be added as a new soft keyword or something on specific type parameters like runtime?

class Foo[runtime T: int]: ...

though I’m not a huge fan. The other option is setting a flag in your type checker, in your own project, that enforces this for all parameters.

In 3.13+ it shouldn’t be too much of an issue because things in general shouldn’t need inner strings and hence strings shouldn’t be sneaking into places unintentionally. Because there is a chance that strings change their meaning in annotations to be literal strings at some point I just wanted to have .__value__ return the string. This should be the case with everything, __value__ should return everything unchanged, if the user wants to call the return, they can, but the user can choose to do whatever they want with them.

I didn’t make this clear in the original post but I don’t think there’s any chance this should be actually checking types at runtime, so checking for something being compatible with the bound isn’t part of this PEP and is still left to type checkers.

Daverball · November 1, 2023, 12:43pm

From what I’ve seen so far, things like PEP 649 would only apply to actual annotations. I don’t think there’s really a backwards compatible way to defer evaluation of binding a generic to a type: since it’s just a regular subscript operation, it would have to be its own special operation if you wanted to be able to defer the evaluation and prevent string forward references from sneaking in. Unless there are plans to add explicit forward references in 3.13?

In most runtime use-cases it probably doesn’t matter, since the forward reference probably cannot be resolved anyways if you immediately create an instance of the type, but this could still potentially be problematic when subclassing generics, since it’s a little more common to have circular dependencies between classes that can only be resolved once the class body has finished executing.

I suppose one potential workaround would be a 3.12 style type alias, but then your implementation would need to be able to unwrap the type alias. Either way I think it’s at least worth thinking about how to overcome these limitations, to make runtime use of TypeVarLikes as intuitive as possible and avoid potential surprises compared to code that doesn’t make use of it.

Yes I agree, that’s why I think it is important to think about how this interacts with static type checkers and where they currently can and cannot help us write correct programs with generics that want to use their bound types at runtime.

We already run into problems with things like Pydantic and SQLAlchemy emitting runtime errors due to annotations that aren’t currently caught by type checkers. PEP 649 should improve things by making unresolved forward references less likely, but we still have no way to mandate that a type must be available at runtime.

Gobot1234 · November 2, 2023, 10:35pm

That’s an interesting idea, though it would probably require too much static analysis for the compiler. It should be more feasible for subclasses, as you’ve said, since those are more knowable at compile time/we can do some symtable magic.
Me from the future: after coming back to this, it is more likely than not that the cases that aren’t entirely knowable aren’t going to be using type checking (imagine dynamically created types) so this actually might prove very useful if possible.

Yeah, I have an undecided section for this but the problem is more that they currently aren’t callable which breaks symmetry with old type aliases and IMO is the best behaviour.

Daverball · November 3, 2023, 6:49am

With explicit forward references I was thinking more of new syntax, something like the new type alias syntax where the evaluation is deferred until you actually access the __value__ attribute of the resulting TypeAliasType object, just for a single name (or expression) that replaces its deferred value with itself once it’s accessed for the first time, rather than needing to be explicitly unpacked.

That being said, deferred expressions is a thing that has come up in contexts other than typing as well, so it probably makes more sense as a general construct, rather than being only used as an explicit forward reference, although it would probably be weird to look at things like:

class Foo(Bar[defer Baz]): ...

And it would require that __class_getitem__ only directly stores the deferred expression, without doing anything with its value.
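
Without new syntax, the idea can only be approximated with an explicit wrapper object. The sketch below is hypothetical throughout: `Deferred` is not an existing API, and the lambda stands in for what a `defer` expression might do. It relies on typing.Generic storing the argument untouched, which is the requirement described above.

```python
from typing import Generic, TypeVar, get_args

T = TypeVar("T")


class Deferred:
    """Hypothetical stand-in for a deferred expression (not an existing API)."""

    _UNSET = object()

    def __init__(self, thunk):
        self._thunk = thunk
        self._value = Deferred._UNSET

    def __call__(self):
        # Evaluate on first use and cache the result. Being callable also lets
        # older typing versions accept the object as a type argument.
        if self._value is Deferred._UNSET:
            self._value = self._thunk()
        return self._value


class Node(Generic[T]): ...


class Tree(Node[Deferred(lambda: Leaf)]): ...  # Leaf does not exist yet


class Leaf(Tree): ...


(deferred,) = get_args(Tree.__orig_bases__[0])
print(deferred() is Leaf)  # True
```

The wrapper still has to be unpacked explicitly by whoever reads the type argument, which is the inconvenience that self-replacing deferred expressions would remove.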

I don’t think it’s statically knowable which runtime subscript operations could be treated differently, even with subclassing it would be perfectly legal to use something other than a generic as long as the subscript operation returns a type. Imagine e.g. a dictionary that contains types. Or also just any class that uses __class_getitem__ to do something other than binding a type var. While either would be a fairly esoteric thing to do, it is legal and it shouldn’t break. And remember that anyone can monkeypatch these symbols at runtime to do something different, so it needs to remain fully dynamic.
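
Both cases are easy to write down. In this sketch (all names hypothetical), the subscript in each base-class list is syntactically identical to binding a generic, yet neither involves a type variable.

```python
class Base: ...


# A plain dictionary that happens to contain types.
registry = {"base": Base}


class Child(registry["base"]): ...


class Factory:
    # __class_getitem__ used for something other than binding a type var:
    # it builds and returns a brand-new class named after the key.
    def __class_getitem__(cls, name):
        return type(name, (), {})


class Other(Factory["Dynamic"]): ...


print(Child.__mro__[1] is Base)    # True
print(Other.__mro__[1].__name__)   # Dynamic
```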

NCPlayz · November 4, 2023, 3:44pm

Currently when debugging it’s a bit hard to tell which class a TypeVar comes from if you’re subclassing, since the TypeVars are typically named the same. Is there any plan to change the display name to be <cls>.T as well? I think pyright currently shows T@cls but since that would be pretty weird to implement in Python and this PEP plans to implement access to <cls>.T I think it makes sense for the display names to be so as well.
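
A short sketch of the ambiguity, using two separately created TypeVars that share the name T (the variable and class names are hypothetical):

```python
from typing import Generic, TypeVar

T_parent = TypeVar("T")
T_child = TypeVar("T")


class Parent(Generic[T_parent]): ...


class Child(Parent[T_child]): ...


parent_param = Parent.__parameters__[0]
child_param = Child.__parameters__[0]

# Two distinct type variables that print identically.
print(parent_param is child_param)              # False
print(repr(parent_param) == repr(child_param))  # True
```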

Gobot1234 · November 5, 2023, 3:07pm

In my mind I wouldn’t want it to be an explicit bit of syntax; I’d like it to just be automatically handled for you (type aliases don’t need to include these things, so it’d be nice if it just worked). I’m not sure how this would work, but I do think it should be possible with PEP 649.

I also completely agree on this but I don’t think those cases would need to support deferred evaluation in a similar way so I don’t think it’s the end of the world if they don’t work. I’d like to support this in a small subset of the possible bits of syntax for class definitions that actually can be useful at type checking time because this feature wouldn’t be useful outside of then.

Just going to drop how Kotlin handles needing to know if a type param should be used at runtime: Inline functions | Kotlin Documentation. It has a reified keyword (which would probably need to be soft) which could indicate that a type/function requires subscription before being called.

Daverball · November 5, 2023, 3:39pm

PEP 649 only changes __annotations__, so it would not affect binding a generic, since that is implemented using __class_getitem__. It would also not work for old-style type aliases, since those can’t be deferred in the same way, which is part of the rationale for the new type alias syntax: that can be unambiguously deferred. The same is true for the PEP 695 type parameter syntax: the bound/default expression can be deferred on those type vars, since they’re completely new AST nodes without any backwards compatibility baggage.

I don’t think it’s an issue that can be solved to a reasonable degree without new syntax. PEP 649 will improve things a lot, but it will not help as much with generics at runtime by comparison. I.e. it’ll help with expressions like:

def foo[T](bar: T) -> T: ...
class Foo[T]:
    x: T
type Bar[T] = list[T]
y: Foo[int]

but not with expressions like

a = foo[int](5)
b = Foo[int]()
c = Bar[int]

class Baz(Foo[int]): ...

Since they’re statically indistinguishable from regular runtime subscript operations. That’s where introducing new syntax like <> instead of reusing [] and runtime hacks would have made designing things to be both useful for static analysis and runtime introspection a lot simpler, but at least for now we’re stuck with this. So deferred expressions or piggybacking on a new-style type alias seems to be the only somewhat reliable way to do this right now, even after PEP 649.
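
The difference can be observed today with a string annotation standing in for a deferred one. This is a sketch only: `NotYetDefined` is a deliberately undefined name, and typing.Generic is used so that the code runs on older versions.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class Foo(Generic[T]): ...


# Inside an annotation the expression can stay unevaluated.
def annotated(x: "Foo[NotYetDefined]") -> None: ...


# As an ordinary subscript it is evaluated immediately and fails.
try:
    Foo[NotYetDefined]
except NameError as exc:
    eager_error = exc
else:
    eager_error = None

print(annotated.__annotations__["x"])  # Foo[NotYetDefined]
print(type(eager_error).__name__)      # NameError
```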