PEP 695: Type Parameter Syntax

tmk · February 4, 2023, 3:04pm

I’m a bit concerned that the proposed syntax seems to preclude the possibility of ever adding something like a TypeVarDict to Python (because the syntax **Tdict is being taken by ParamSpec). I think you said before that there doesn’t seem to be a use case for TypeVarDict, but what about something like a pandas dataframe? Each column doesn’t only have its own type – it also has its own name:

>>> df = pd.DataFrame(data={"col1": [1.5, 2.3], "col2": [True, False]})
>>> df.dtypes
col1    float64
col2       bool
dtype: object
>>> df.loc[:, "col1"]
0    1.5
1    2.3
Name: col1, dtype: float64

So, it seems to me like if you really want to model this with types, you need a TypeVarDict. It might look something like this (using the Map operator that was originally part of PEP 646; Map[List, (int, bool)]==(List[int], List[bool])):

Tdict = TypeVarDict("Tdict")

class DataFrame[**Tdict]:
    def __init__(self, data: Map[List, Tdict]): ...
    @property
    def loc(self) -> Map[Series, Tdict]: ...

Cols = TypedDict("Cols", {"col1": numpy.float64, "col2": bool})

df: DataFrame[Cols] = DataFrame(
        data={"col1": [1.5, 2.3], "col2": [True, False]})
col1: Series[numpy.float64] = df.loc["col1"]

I’m not necessarily saying it’s a good idea, but why not allow the future possibility?

AlexWaygood · February 4, 2023, 3:24pm

This speaks to a general concern I have about this PEP. I like the syntax proposals overall, and I’m supportive of Python eventually adopting syntax like this for its typing system. But Python’s typing system is still evolving rapidly — I don’t feel like I have any ability to predict which new features might be proposed and/or adopted for Python 3.13. It’s currently pretty easy for us to add new features to the typing system, but adopting this PEP raises the possibility that a lot more typing features might require syntax changes in order to be well integrated into the existing typing system. We’ve already seen examples of this with PEP 696, where one of the questions asked has been “What would the proposed syntax look like if PEP 695 were adopted?”

I feel like I’d prefer to see Python’s typing system stabilise somewhat before we enshrine the current system (or something like it) in a series of syntax changes that will be very hard to reverse or change in the future.

a-reich · February 4, 2023, 4:48pm

It would be amazing to properly support column-level typing for pandas, which is one of the most heavily used Python libraries (and there are several others with similar data structures, e.g. pyarrow, cuDF, dask, polars…).
That said, IMO it’s not clear yet that the approach you describe is the only natural or best one. There might be others that work well without the specific **TypeVarDict syntax. I can think of simple workarounds, like subtyping the DataFrame class with one that has a TypedDict to keep track of types for each key.
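A minimal sketch of the workaround described above, assuming a plain class as a stand-in for a DataFrame subclass. The names `Cols` and `ColsFrame` are hypothetical and nothing here is a pandas API; the point is only that a TypedDict can carry per-column types that a checker resolves by key.

```python
from typing import List, TypedDict


class Cols(TypedDict):
    """Column names mapped to the element lists they hold."""
    col1: List[float]
    col2: List[bool]


class ColsFrame:
    """Hypothetical stand-in for a DataFrame subclass tied to one schema."""

    def __init__(self, data: Cols) -> None:
        self.data = data

    def column_names(self) -> List[str]:
        # Iterating a dict yields its keys in insertion order.
        return list(self.data)


frame = ColsFrame({"col1": [1.5, 2.3], "col2": [True, False]})
# Indexing the TypedDict with a literal key lets a checker see List[float].
col1 = frame.data["col1"]
```

The cost of this approach is one subclass per schema, which is what a TypeVarDict-style generic would avoid.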

erictraut · February 4, 2023, 5:31pm

Early proposals for this PEP were discussed in a series of “typing meetups”, and the topic of future extensibility was brought up during those discussions. The proposal in the PEP attempts to retain flexibility for future expansion.

The PEP also includes a survey of generic support in other programming languages. This was done in part to anticipate potential future extensions in the Python type system. For example, PEP 696 (default TypeVar values) was anticipated through this exercise, and the proposed PEP 695 syntax accommodates default TypeVar values in a natural way.

It’s also important to note that this PEP doesn’t deprecate the current mechanisms for defining TypeVars or generic classes, functions, and type aliases. Those mechanisms can live side by side with the new syntax. If you find that those mechanisms are more flexible for exploring extensions to the type system, PEP 695 does not prevent you from leveraging these existing mechanisms.

Alex raised a general concern about waiting for the Python type system to stabilize. I think we can probably agree that the type functionality used for the most common use cases is pretty well baked at this time. Newer PEPs (like 646 and 696) are filling in gaps for increasingly esoteric and specialized use cases. New explorations and innovations should continue, but there are parts of the type system that are well-baked and have widespread usage. I think we should strive to make these parts easier to use and accessible to more Python developers. That’s what PEP 695 is trying to achieve.

sterliakov · March 12, 2023, 2:00am

I understand that I’m very late to the party, but some time ago Alex Waygood told me that all opinions count for PEPs like this, so please let me try.

To begin with, my attitude towards this proposal is very negative. The main reason “for myself” is that I hate overloading python with typing-specific syntax. It’s very typescript-ish, and I personally dislike how TypeScript programs look and feel. Both type declarations and generic functions syntax look very unpleasant to my eyes. Again, this is just my opinion, please don’t treat it as a sign of disrespect or something like that - and this doesn’t count anyway, I guess.

This PEP doesn’t violate PEP-484’s promise not to make typing “mandatory, even by convention” - but goes in that direction. The more typing-specific syntax we add, the higher the chance that people will treat annotations as a necessity, even in one-time throwaway scripts that don’t benefit from typing at all.

However, I have also noticed a few possible usability issues:

- Difficulties for newcomers: currently the typing-specific syntax is limited to annotations, a few dunders (like __class_getitem__ and __annotations__ - but dunders are already part of Python’s core design) and an asterisk for Unpack in Python 3.11+ (which fits existing syntax very well). Everything else still fits the overall language model, and MyType: TypeAlias = list[str | int] is at least understandable for newcomers as “create some variable with a value of some type” - no matter how it can be used later. type MyType = list[str | int] is not that easy, especially for those who learn Python as their first language. def fn[_T](x: _T) -> _T will be even more surprising if one meets it while preparing for an algorithms exam. It will become harder to learn by reading real production code because the required entry level will be higher.
- Loss of TypeVar semantic meaning. In some cases modules can define a type variable with a descriptive name and reuse it (see django-stubs). Such a type variable can be documented (even with a docstring in a common format supported by Sphinx+Napoleon and other doc processors), explaining its usage and meaning within the package. If that type variable is referred to by some mypy plugin, documentation is even more important: it can explain what substitutions happen on behalf of that plugin. Such usage is supported by the ecosystem: here’s a sphinx plugin targeting specifically type variable documentation. I see that the current TypeVar is not being deprecated, but I’d suppose that projects will prefer to stick to one of the possible spellings to avoid compatibility issues (see last bullet).
- type usage. type is already a builtin with several context-dependent meanings. It can be type(smth) to get a __class__, type(name, bases, attrs) to create a new type, and something: type or something: type[SomeThing] in annotations (the distinction between bare and parametrized type is a separate problem). Adding a fourth role of a soft keyword to it can become a large source of confusion. Also, despite it being a builtin, many (many!) projects are using it as a variable name, including mypy itself.
- Infrastructure support. PEP 484 support and PEP 3107 annotations were wanted by the whole community, because there was no reasonable way to annotate variables and arguments before - and thus they got quick tooling support. PEP 695 doesn’t seem to attract similar attention (please correct me if I’m wrong here), and this makes me suspect that tooling support will not come very soon. Type checkers will have to support this, of course, but I’m sure that most flake8 plugins or sphinx extensions won’t catch up quickly. This means that many early adopters will give up without the ability to rely on the infrastructure they’re used to.
- Extending “generic function” concept (this looks a bit scary to me). Many languages have a concept of a generic function such that it can be called as fn[int](x). In Rust, for example, such functions are used a lot for conversions (like std::convert::Into) and some other scenarios. However, the type system after PEP 695 still forbids this - and I bet that many people will begin using def fn[T](x: int) -> T to achieve something similar. Currently functions do not have a special indication of being generic, and such a mistake is more difficult to make.
- There should be one-- and preferably only one --obvious way - and now both TypeVar and [T] will coexist, and people will have to know that these forms are not fully compatible. I still see people use NoReturn as a return type annotation for functions without an explicit return statement - will the presence of “explicit” and “implicit” type variables be easier to understand? I don’t think so.
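To make the "type usage" point concrete, here is a small sketch of the existing roles of `type`, plus its use as an ordinary name. The class and function names (`Animal`, `Dog`, `make`, `describe`) are made up for illustration, and the annotation form assumes Python 3.9 or later.

```python
class Animal:
    pass


# Role 1: look up the class of an object.
kind = type(Animal())

# Role 2: build a new class dynamically from (name, bases, attrs).
Dog = type("Dog", (Animal,), {})


# Role 3: use `type[...]` in an annotation to mean "a class object".
def make(cls: type[Animal]) -> Animal:
    return cls()


# `type` is not reserved, so it can also be an ordinary parameter name,
# which shadows the builtin inside this function only.
def describe(type: str) -> str:
    return f"kind={type}"
```

Because PEP 695 proposes `type` as a soft keyword, code like `describe` above would be expected to keep working; the concern raised is about readability, not breakage.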

Thank you for your time.

jcgoble3 · March 27, 2023, 1:29pm

As a mostly-lurker (brought here by Thomas’s call for input), here’s my two cents at a high level: it’s clean and simple, and the proposed syntax is unlikely to ever be used for anything else.

As for the “one obvious way to do it” complaint immediately above, I would argue this proposal is the obvious way – it closely parallels how type variables and generics are done in many other languages, and frankly when I started doing static typing in my own code I was surprised that this wasn’t supported, and found the idea of subclassing Generic and manually instantiating TypeVars to be a kludge.

Basically, it just makes sense to me. I haven’t drilled down into details, but the overall syntax idea is a +1 from me.

dumbpotato · March 27, 2023, 2:17pm

This is what matters to me the most here. Having used TypeScript (and I know the comparison isn’t fair), the code you have to write to get to even a simple generic function has a lot of boilerplate, and I always found it unnecessarily complex and verbose.

thomas · March 27, 2023, 2:19pm

(Posting as myself, not for the rest of the Steering Council.)

From the “Runtime Type Alias Class” section:

At runtime, a type statement will generate an instance of typing.TypeAliasType. This class represents the type. Its attributes include:

__name__ is a str representing the name of the type alias

__parameters__ is a tuple of TypeVar, TypeVarTuple, or ParamSpec objects that parameterize the type alias if it is generic

__value__ is the evaluated value of the type alias

Why are these dunder attributes? They aren’t going to collide with anything else on the object, and they’re not special hooks for the interpreter. They’re just normal data for TypeAlias objects. (The same goes for __infer_variance__ on TypeVar proposed later in the PEP, but not the proposed __type_variables__ attribute on classes/functions/typealiases.)
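For readers who want to inspect these attributes, here is a hedged sketch. It assumes an interpreter where `typing.TypeAliasType` exists (Python 3.12 or later, which is outside what this thread could rely on), so the check is guarded by a version test and only touches `__name__` and `__value__`.

```python
import sys

alias_info = None
if sys.version_info >= (3, 12):
    # Assumption: typing.TypeAliasType is available (Python 3.12+).
    from typing import TypeAliasType

    IntOrStr = TypeAliasType("IntOrStr", int | str)
    # Both values are read through dunder attributes, as the PEP describes.
    alias_info = (IntOrStr.__name__, IntOrStr.__value__)
```

On older interpreters `alias_info` simply stays `None`.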

zware · March 27, 2023, 2:22pm

Just to give my $0.02 as an experienced Python developer who has not yet even tried to get into typing: the given examples barely look like Python to me, and at first glance I have no idea what they do. That said, they also look more legible than the “before” examples in the PEP. I suppose that means that if this is a significant need in the typing world the proposed solution is probably fine, but I’m not looking forward to having to dive into some thoroughly-typed codebase to try to make sense of what it’s doing.

steve.dower · March 27, 2023, 3:02pm

I have much the same position as @zware, with the additional background that I have dived into typing a bit, and work with a lot of people who assume that it is required to be correct (or idiomatic) Python.

I’m not looking forward to having to work through code that tries to use these extensively, and I am fully expectant that I will have to work through such code in order to provide feedback on readability, maintainability, and correctness. Since there are different approaches available to achieve this goal, authors will get to choose the balance, but I can’t see a way that anyone finds these easier to read without also being fairly experienced at typing, and that feels like a loss. [1]

[1] Compared to explicitly named type aliases, which are definitely harder to write and maintain, but I think serve the reader better than integrating the type straight into the signature.

Jelle · March 27, 2023, 3:24pm

Using dunders for the new TypeVar attribute is consistent with existing TypeVar attributes like __covariant__. I don’t really care which way we go, but it’s good to remain consistent within one object, and removing the underscores would cause needless churn.
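The existing dunder attributes on TypeVar can be checked directly. A short sketch using only the standard `typing` module; the variable names are illustrative.

```python
from typing import TypeVar

T = TypeVar("T")
T_co = TypeVar("T_co", covariant=True)
T_contra = TypeVar("T_contra", contravariant=True)

# Variance is already exposed through dunder attributes on each TypeVar.
flags = {
    tv.__name__: (tv.__covariant__, tv.__contravariant__)
    for tv in (T, T_co, T_contra)
}
```

A non-dunder spelling for the new inference flag would sit next to these, which is the inconsistency being pointed out.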

Jelle · March 27, 2023, 3:27pm

@thomas I just saw Community consensus on PEP 695. I feel the post would be better if it also showed the existing syntax to do the same thing (with T = TypeVar("T")). Some people here are writing that this syntax would look unfamiliar to them, which is fair, but I wonder if they would have the same reaction to the current Generic[T] syntax.
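For comparison, a sketch of the existing spelling referred to here, using the `ClassA` and `func` names from the PEP’s examples. The method bodies are placeholders added only so the code runs.

```python
from typing import Generic, TypeVar

# The type variable is declared separately, at module level.
T = TypeVar("T")


class ClassA(Generic[T]):
    def __init__(self, value: T) -> None:
        self.value = value

    def method1(self) -> T:
        return self.value


def func(a: T, b: T) -> T:
    # Placeholder body: returns the first argument.
    return a


# Under PEP 695 the same declarations would be spelled:
#     class ClassA[T]: ...
#     def func[T](a: T, b: T) -> T: ...
```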

malemburg · March 27, 2023, 3:35pm

I second @zware and @steve.dower in that I’m not really looking forward to having to read such code, but when I need to, I’d much prefer the angular bracket notation used in many other languages to the proposed square bracket notation which looks too much like the subscript notation in Python.

With the angular brackets, it would be immediately clear, that whatever is in those brackets has a special meaning, which is definitely new to Python and requires special knowledge to be understood.

Examples:

class ClassA<T>:
    def method1(self) -> T:
        ...

def func<T>(a: T, b: T) -> T:
    ...

The angular bracket notation has long meant “template” in other languages (IIRC, C++ was the first to introduce this concept) and will also be better understood by programmers coming from those languages to Python.

I don’t quite follow the reasoning in the PEP on why not to use angular brackets…

- The parser issue can surely be resolved by running .strip() on whatever is defined in the angular brackets.
- The argument about it being “confusing” to use angular brackets for declarations vs. square brackets for specialization seems to be missing some context - at least for me to understand it. A declaration (declaring something to be used later) is something completely different than a specialization (narrowing down choices), so it’s quite natural and indeed useful that the syntax used for these two concepts is different as well.

pf_moore · March 27, 2023, 8:20pm

I agree with pretty much everything @zware @steve.dower and @malemburg have said.

In addition, I find the PEP itself to be highly technical and extremely difficult for someone who’s not a typing specialist to follow (which probably explains at least some of the lack of feedback). I note that there’s no “how to teach this” section in the PEP - I think it would be very informative to see what the user-facing documentation of this feature is expected to look like. If it can’t be explained in terms the typical Python programmer will understand, that suggests it needs something doing to it to make it achieve the benefits it claims - according to the PEP itself, this is potentially going to be used in 14% of modules using typing.

ambv · March 28, 2023, 3:18pm

I refrained from posting so far because despite being a big static typing enthusiast, I’m not crazy about the chosen syntax here. I really don’t want to derail progress on typing, and this issue in particular, but I’m afraid the overloading of square brackets has gone too far in this case.

Using triangle brackets, suggested by @malemburg, makes it read better to me as the new construct at least makes it evident that this is special “templating” syntax. However, templating in C++ is effectively context-dependent code generation to handle multiple types. That’s not what type variables are for in Python. So, the analogy only goes so far there, too, and in the end, I don’t think using triangle brackets solves the issue.

The core issue for me, I have to conclude, is the rather unprecedented density of information that would end up in a function and/or class signature. For more complex signatures this would inevitably have to be split over many lines, and at that point, it’s no better than using a separate type variable declaration. My intuition here is that sparse is better than dense.

However, I do love the type soft keyword in this PEP. I think it’s comparatively uncontroversial and could be used for more than just unions, as presented in the PEP:

# A non-generic type alias
type IntOrStr = int | str

# A generic type alias
type ListOrSet[T] = list[T] | set[T]

# A type alias that includes a forward reference
type AnimalOrVegetable = Animal | "Vegetable"

# A generic self-referential type alias
type RecursiveList[T] = T | list[RecursiveList[T]]

I wonder if it wouldn’t be enough if this syntax could also handle defining all TypeVarLikes. Would that constitute enough of formal syntax for generics that the PEP is calling for?
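For reference, a sketch of how the first two aliases above can be written without the type statement, using a separately declared type variable. It uses `Union` and `List`/`Set` from `typing` so that it also runs on older interpreters.

```python
from typing import List, Set, TypeVar, Union

T = TypeVar("T")

# A non-generic type alias is a plain assignment.
IntOrStr = Union[int, str]

# A generic type alias: T is left free and filled in by subscripting.
ListOrSet = Union[List[T], Set[T]]

# Specializing the generic alias substitutes T.
IntListOrSet = ListOrSet[int]
```

The self-referential `RecursiveList` alias is left out because a plain assignment cannot refer to a name that is still being defined.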

ajoino · March 28, 2023, 7:39pm

I think the way the soft keyword type is described in the PEP looks out of place in Python.
All other occurrences of keyword statement/expression that I can think of don’t have a =, like

def foo(...):
    ...

class Bar:
    ...

try:
    ...
except BazError as baz_err:
    ...

While the first two examples are dissimilar to the proposed syntax, the last example is perhaps more relevant.
Furthermore, in the PEP it’s written that the new type soft keyword is “[s]imilar to class and def statements […]”, which is only half true IMO since neither of those involve =.
I think it would make the proposal more in-line with other Python statements if the = is dropped and some other syntax is used. Two alternative syntaxes, using @ambv’s examples

type IntOrStr: int | str

type list[T] | set[T] as ListOrSet[T]

(I went over the thread again and saw the second example of mine was proposed by @tmk here.)

I haven’t thought too hard about it, and I’m very sure I would come to accept the proposed syntax quickly, but I didn’t see the choice of syntax discussed much in the PEP nor in this thread beyond the post I mentioned earlier and one response to it.

To the PEP author @erictraut is there a particular reason you went with using =, and likewise is there a particular reason why you decided not to use a =-free syntax?

Jelle · March 28, 2023, 7:45pm

A type alias definition like type X = int | str is naturally read as “the type X is int or str”, which maps nicely to PEP 695’s syntax. I would read type X: int | str as “type X is of type int or str”, which isn’t quite correct.

Other languages such as TypeScript and Haskell also use this syntax, reinforcing the point that it’s natural to use = here.

ntessore · March 28, 2023, 7:51pm

Seeing the type X = Y syntax, I just had the most amazing idea; what if we also wrote all other type annotations as int X = Y …

PS: On a more serious note, it should be

X: type = int | str

ajoino · March 28, 2023, 7:56pm

Yeah, I agree that the first example syntax I used is strange, but I think the second one with as looks nicer in Python than the proposed syntax in the PEP. Do you have any thoughts about using as instead of =?

ambv · March 28, 2023, 8:21pm

I don’t think using a = in the same expression as a keyword is somehow unpythonic. That ship has sailed with assignment expressions. But even before that we’ve had things like:

a = yield b
x = lambda: None

We can definitely bikeshed the type statement syntax but I’m more interested in hearing a retort to my (and others’) reservations.