Pre-PEP Considerations and Feedback: Type Transformations on Variadic Generics

Preface

Before I begin, please note that this is the first time I have considered and researched the process of proposing a PEP. The purpose of this post is to get feedback from the community on a PEP I plan to propose. The format of this post does not necessarily represent the final format of the PEP and, compared with other PEPs I have viewed, this one (at least in its current state) is quite bare-bones. Please read through the whole post before asking questions or giving feedback. The work-in-progress title of this PEP is: Type Transformations on Variadic Generics [1].

Abstract

This PEP, which is a follow-up to PEP 646, would add specifications to allow for type transformations/mappings on variadic generics (typing.TypeVarTuple).

Motivation

This is something that I had originally believed to already be a part of PEP 646. It was only recently that I found out that, as noted by Pradeep Srinivasan (one of the authors of PEP 646), this was excluded from that PEP due to the level of complexity it had already reached. In the same note, Srinivasan mentioned plans to release a follow-up PEP incorporating these changes by late 2022, although it appears that this never happened [2].

This PEP would allow for transformations similar to those available to typing.TypeVar, such as below:

def foo[T](x: Type[T]) -> T: ...

to be available for variadic generics as well:

def foo[*Ts](*x: *Type[Ts]) -> Tuple[*Ts]: ...

Examples

NOTE: The syntax used in these examples is not concrete. I will go over the proposed syntax later.

Below is a condensed version of the builtins.map stub in Python 3.12:

class map(Iterator[_S]):
    @overload
    def __new__(
        cls,
        func: Callable[[_T1], _S],
        iter1: Iterable[_T1],
        /
    ) -> Self: ...
    @overload
    def __new__(
        cls,
        func: Callable[[_T1, _T2], _S],
        iter1: Iterable[_T1],
        iter2: Iterable[_T2],
        /
    ) -> Self: ...
    # ...and so on
    @overload
    def __new__(
        cls,
        func: Callable[..., _S],
        iter1: Iterable[Any],
        iter2: Iterable[Any],
        iter3: Iterable[Any],
        iter4: Iterable[Any],
        iter5: Iterable[Any],
        iter6: Iterable[Any],
        /,
        *iterables: Iterable[Any],
    ) -> Self: ...
    def __iter__(self) -> Self: ...
    def __next__(self) -> _S: ...

This could, as mentioned by Randolf Scholz, be more accurately (and concisely) typed using the following:

class map[F: Callable[[Ts], S], *Ts]:
    @overload
    def __new__(cls, func: F, /) -> Never: ...  # need at least 1 iterable.
    @overload
    def __new__(cls, func: F, /, *iterables: *Iterable[Ts]) -> Self: ...
    def __iter__(self) -> Self: ...
    def __next__(self) -> S: ...

Not only is this much less cumbersome than before, but it also allows for any number of types to be used. That is, it is not limited by the number of __new__ overloads you are willing to write.
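As a quick runtime illustration of what the variadic stub is trying to describe (a sketch only; the describe function and the sample data are made up for this example): map accepts any number of iterables, and the callable receives one argument per iterable. With six or more iterables, the condensed stub above falls back to Callable[..., _S] and Iterable[Any], so the element types are no longer checked.

```python
# Six iterables with different element types; the values are arbitrary sample data.
names = ["a", "b"]
counts = [1, 2]
ratios = [0.5, 1.5]
flags = [True, False]
tags = [b"x", b"y"]
extras = [None, None]


def describe(name, count, ratio, flag, tag, extra):
    # One positional argument arrives per iterable, in the order they were passed.
    return (type(name), type(count), type(ratio), type(flag), type(tag), type(extra))


rows = list(map(describe, names, counts, ratios, flags, tags, extras))
first_row_types = rows[0]
```

Under the proposed stub, a type checker could in principle verify that describe accepts exactly these six element types, in this order.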

This is just one of many examples. Links to more examples are listed below:

Specification

Similar to how type transformations can be applied to non-variadic type variables, type transformations will now be applicable to variadic generics, so long as all transformations are performed before the unpack operation. For example, the following function using variadic type transformations:

def foo[*Ts](*args: *Type[Ts]) -> Tuple[*Ts]: ...

when called with a number of parameters (for this example, str, int, and float):

foo(str, int, float)

would be understood to have a return type of Tuple[str, int, float]. Without this type transformation, i.e.:

def foo[*Ts](*args: *Ts) -> Tuple[*Ts]: ...

when called with the same values as before, the return type would instead be understood to be Tuple[Type[str], Type[int], Type[float]], as expected given the above function definition.
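To make the difference concrete at runtime, here is a minimal sketch. The helper names instantiate_all and passthrough are hypothetical, and the signatures quoted in the comments use the proposed syntax, which no current type checker is assumed to accept.

```python
def instantiate_all(*types):
    # Proposed annotation: def instantiate_all[*Ts](*types: *Type[Ts]) -> Tuple[*Ts]
    # Each class is called with no arguments, so the result holds instances.
    return tuple(t() for t in types)


def passthrough(*args):
    # Expressible today: def passthrough[*Ts](*args: *Ts) -> Tuple[*Ts]
    # The classes themselves come back unchanged.
    return args


instances = instantiate_all(str, int, float)
classes = passthrough(str, int, float)
```

The first function returns a str, an int, and a float, which is why Tuple[str, int, float] is the return type one would want to express; the second returns the three classes, matching Tuple[Type[str], Type[int], Type[float]].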

More formally, for any parameterized type T, the following holds true:

  1. If T accepts exactly one generic parameter, then T[Ps] (where Ps is a variadic generic) can be understood to contain the types T[Ps₁], T[Ps₂], …, T[Psₙ].

  2. If T accepts more than one generic parameter, then T[As, Bs, ..., Ns] (where As, Bs, and Ns are variadic generics) can be understood to contain the types T[As₁, Bs₁, ..., Ns₁], T[As₂, Bs₂, ..., Ns₂], …, T[Asₙ, Bsₙ, ..., Nsₙ]. Additionally, the length of all passed variadic generics (As, Bs, …, Ns) must be equal.

  3. When both generics and variadic generics are mixed, all non-variadic generics will be reused as many times as needed, such that any given non-variadic generic X is equivalent to a variadic generic Xs that contains X as many times as is needed. That is, if T accepts both variadic generics and non-variadic generics, then T[As, Bs, ..., Ns, X, ..., Z] (where As, Bs, and Ns are variadic generics, and X and Z are non-variadic generics) can be understood to contain the types T[As₁, Bs₁, ..., Ns₁, X, ..., Z], T[As₂, Bs₂, ..., Ns₂, X, ..., Z], …, T[Asₙ, Bsₙ, ..., Nsₙ, X, ..., Z].
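The three rules above can be modelled at runtime with a small sketch. The expand helper is hypothetical and not part of the proposal; it assumes that a plain tuple stands for the types bound to a variadic generic and that anything else is a non-variadic generic.

```python
def expand(generic, *params):
    """Expand generic[params...] position by position.

    A tuple stands for the types bound to a variadic generic; anything else
    is treated as a non-variadic generic and reused in every position.
    """
    lengths = {len(p) for p in params if isinstance(p, tuple)}
    if not lengths:
        raise TypeError("at least one variadic generic is required")
    if len(lengths) > 1:
        raise TypeError("all variadic generics must have the same length")
    n = lengths.pop()
    # Rule 3: repeat each non-variadic parameter n times.
    columns = [p if isinstance(p, tuple) else (p,) * n for p in params]
    # Rules 1 and 2: subscript the generic once per position.
    return tuple(
        generic[row[0]] if len(row) == 1 else generic[row]
        for row in zip(*columns)
    )


single = expand(list, (int, str))                  # rule 1
paired = expand(dict, (str, int), (bytes, float))  # rule 2
mixed = expand(dict, (str, int), bool)             # rule 3
```

Here single corresponds to list[int], list[str]; paired to dict[str, bytes], dict[int, float]; and mixed to dict[str, bool], dict[int, bool]. Passing variadic bindings of different lengths raises TypeError, mirroring the equal-length requirement in rule 2.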

Backwards Compatibility

For backwards compatibility, the Map [3] object will be added to the typing_extensions library. The structure of how this type will be used is not concrete, so here are three options that I have seen in other people's examples:

NOTE: in all examples below, T is a generic class that accepts the required number of types, Us and Vs are both variadic generics, and X is a non-variadic generic.

  1.
    • Single variadic generic usage: Map[T[Us]] == T[Us]
    • Multiple variadic generic usage: Map[T[Us, Vs]] == T[Us, Vs]
    • Mixed generic usage: Map[T[Us, Vs, X]] == T[Us, Vs, X]
  2.
    • Single variadic generic usage: Map[T[Us], Us] == T[Us]
    • Multiple variadic generic usage: Map[T[Us, Vs], Us, Vs] == T[Us, Vs]
    • Mixed generic usage: Map[T[Us, Vs, X], Us, Vs, X] == T[Us, Vs, X]
  3.
    • Single variadic generic usage: Map[T[u], u in Us] == T[Us]
    • Multiple variadic generic usage: Map[T[u, v], u in Us, v in Vs] == T[Us, Vs]
    • Mixed generic usage: Map[T[u, v, X], u in Us, v in Vs] == T[Us, Vs, X]

Notes

Final Questions and Feedback

As this is a draft PEP discussion, all feedback and criticism is welcome. Please be kind when discussing your ideas with others. Below is a list of aspects I would particularly like feedback on. Additional feedback is welcome as well.

  1. Notes 1, 2 and 3.
  2. In a sub-section of the PEP 646 Specification, it is noted that "[...]type variable tuples must always be used unpacked (that is, prefixed by the star operator)." Does this, by definition, make the specification of this PEP (at least when given the proposed syntax) impossible?
  3. In the Backwards Compatibility section, which listed (or unlisted) option do you believe is the most viable and/or easy to use and understand?
  4. Is the formal specification at the end of the Specification section understandable?

Thank you for reading this far!


  1. As noted, this is a work-in-progress title; feedback is welcome. I have also seen transformations and mappings used interchangeably.

  2. If I am mistaken, and this follow-up PEP does already exist, please let me know.

  3. This name is not concrete. Other options are welcome.


"TypeVarTuple Map PEP draft" (Google Docs) is a proto-PEP for Map written a few years ago. It has not been driven forward in a long time, so I think it's reasonable to revive it, but if you are writing a new PEP, I'd still recommend reviewing the old draft to see how yours differs.

edit: My main feedback is: should this also include ParamSpecs? Just as there is a need to Map a type operator over a TypeVarTuple, ParamSpecs have similar patterns that need a Map, and these have come up occasionally over the years.

Interesting; I didn't realize this existed, so I will take a look! I will also look into how this might apply to ParamSpecs.

Your proposed syntax is definitely more elegant than the Map[Iterable, Ts] syntax that was previously proposed. I wonder if your syntax is sometimes ambiguous or less powerful? I can't think of an example though.

Well, I guess the following is an example where you have to wrap the new syntax into an additional tuple[...]:

# previously proposed map syntax
def f[*Ts](*args: *Ts) -> Map[list, Ts]:
    return tuple([e] for e in args)

# star syntax
def f[*Ts](*args: *Ts) -> tuple[*list[Ts]]:
    return tuple([e] for e in args)

but that seems fine.

Although the former is more compact, there are three primary reasons I decided not to go with that format:

  1. I wanted this to work similarly to Type[T], as it will help people get used to it.
  2. Returning a typing.TypeVarTuple itself is not possible in the first place (by specification); it must be unpacked (e.g., Tuple[*Ts]), so it wouldn't make sense to change that format for this.
  3. Binding the types through Map works fine for single-parameter generics (such as List), but for more complex types it can cause problems. For example, with *Callable[[Ts, Us], Vs], using the previous syntax (Map[Callable, Ts, Us, Vs], or however the exact syntax was proposed, I forget) is quite hard to read, as it is unclear where those generics apply to the type (Callable). This could also possibly cause actual issues for certain types (e.g. those with multiple arguments that accept variadic generics). This is the reason why all of the potential formats for the backwards compatibility type require the target type to be subscripted.

As I understand it, unpacking like this rather than using Map[...] won't work for introspection at runtime.


Thank you for writing up this proposal. I was toying around with something similar in my head and I'm happy to see that we mostly came up with the same rules for mapping multiple variadic generics and mixed cases.

I like the terseness of the proposed syntax, although I do have some concerns about runtime introspection if we pick 1. or 3. for backwards compatibility. I think there's also a fourth option for backwards compatibility that looks like 2., except that it doesn't have the redundant parametrization of the generic. You can still support things like Callable, which takes a ParamSpec, by requiring a list of type parameters for any ParamSpec parameter. The same rules that govern a valid parametrization of the generic itself then apply to parametrizing Map: Map[Callable, [Ts, Us], Vs] is valid only if Callable[[T, U], V] is valid.

So that would be my preferred, and most natural, way to spell this both for backwards compatibility and runtime introspection. Generator expressions are unfortunately pretty difficult to inspect at runtime and would also require a syntax extension that will probably be difficult to push through[1] if it was supposed to do something typing-specific, because then T[u in Us] and T[(u in Us)] would do completely different things, which I don't think is a good place to be.

I would also like to see a CPython proof-of-concept implementation of the new syntax to validate that it's actually possible to unambiguously generate Map instead of Unpack everywhere. I suppose the parser will need to check whether or not the starred expression is subscripted: if it's not, then it should be Unpack, and if it is, then it should be Map. I would be a little concerned, however, about how this would behave for the in-between versions that support * for Unpack. Is it possible those versions would accept some map expressions but interpret them as Unpack? That would be pretty bad if it were the case. I'm also not sure if that rule would still work if higher-kinded types ever became a thing.


As for generalizing this to be usable with ParamSpec: I have already thought about this to some degree and there are a couple of rules I came up with, that appear to be consistent:

  1. You can map over ParamSpecArgs or ParamSpecKwargs in addition to ParamSpec, but the expressions for *args and **kwargs need to be identical, save for their respective use of .args and .kwargs. It's also only allowed inside a function that either returns a generic using P or where there is already a bound P in scope, so the same rules apply for where it's valid to use P.args and P.kwargs, just with the addition that you are allowed to map over them as long as both mapping expressions are the same[2].
  2. You can use Concatenate to support a mixed use-case between ParamSpec and a regular TypeVar, but you are not allowed to mix with TypeVarTuple because there's no unambiguous way to iterate over the two in conjunction.
  3. You cannot use more than one ParamSpec inside a single Map expression.[3]
  4. A ParamSpec[4] can only be provided in a position where the generic itself would already accept a ParamSpec, whereas ParamSpecArgs and ParamSpecKwargs can be used freely, save for the restrictions in 1.

Maybe these rules, as half-baked as they currently are, will help you with spelling out some rules of your own.


  1. T[u in Us] is not valid syntax and it is even less so inside a tuple literal

  2. it feels a bit clumsy having to repeat the same mapping expression twice, but the only other way I could think of doesn't seem much better. The other thing I considered was adding args and kwargs attributes to Map so you could move the expression to a type alias, this would be a bit more clumsy, but syntactically it doesn't work as nicely

  3. This rule is mostly there to keep things simple. You could allow multiple ParamSpec as long as they share the exact same number and kinds of arguments, but I can't really think of a use-case where this would be valuable

  4. potentially wrapped with Concatenate


You could solve the runtime introspection concern by implementing __iter__ for GenericAlias, but this is impossible to backport. I'd rather this be implemented as Map, which can exist in typing_extensions, and gain syntax sugar later if it's found to be worth it.

I'm looking forward to progress on this feature being picked up by someone again, but I share the concerns of the people above about how it happens.


This would be pretty great!

Over the years I've had a bunch of use cases that would have benefited from something like this.

A few examples that come to mind:

gather()

The gather() implementation (both in asyncio and quattro) needs 6 overloads to cover base cases. Being able to take *Coroutine[Any, Any, Ts] and return an Awaitable[*Ts] would be great.
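A minimal runtime sketch of why per-position typing matters for gather() (the two coroutines are made up for this example): the result keeps the order of the awaitables passed in, so each position has its own type.

```python
import asyncio


async def fetch_name():
    return "spam"


async def fetch_count():
    return 3


async def main():
    # gather() preserves argument order, so position i of the result holds
    # the value produced by awaitable i.
    return await asyncio.gather(fetch_name(), fetch_count())


results = asyncio.run(main())
```

With the proposed mapping, a single signature could describe this for any number of coroutines instead of one overload per arity.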

attrs.Attribute[T]

Over the years I've implemented a number of database object document mappers using attrs (I'd argue that's one of its main use cases, apart from reducing boilerplate).

Usually there are functions for fetching data that take *attrs.Attribute[Ts] and return tuple[*Ts]. Being able to type these without many overloads would, again, be great.

Thanks for the reply! Regarding the backwards compatibility format, I think yours actually makes a lot of sense. Personally, I think that adding ParamSpec support would be interesting, but is overall outside of the scope of this PEP. It would make more sense to me if we left room for a follow-up PEP that adds support for ParamSpec, but waited until the specification of this PEP has been accepted.

I am unfortunately not very well-versed in CPython, but if I have time I can look into it. I imagine for someone new to this (as I am) it may be pretty involved, especially if you don't already know the ins and outs of it.

Indeed, there are a number of places even within Python's own standard library where this could be quite useful.

Thanks for starting this thread. I think this is a promising start.

I much prefer this proposal to the earlier Map proposals. This is more readable, more flexible, and more composable.

Putting on my "type checker author" hat for a moment, I have to say that this won't be an easy or cheap feature to implement, but I think it's doable if there's sufficient value and demand.

One small concern is that error reporting will be challenging when an unpack operator is omitted. This is a common mistake. Today, such errors are straightforward to explain to developers because the unpack operator must immediately precede the TypeVarTuple. With this proposal, an unpack operator can appear quite far away from the TypeVarTuple(s) in an expression, so telling the developer how to address an error may be tricky.

Speaking of unpack operators, the proto spec doesn't provide any updated rules about where unpack operators must appear, and where they cannot appear. These rules were pretty straightforward in PEP 646, but they get a bit more complicated with this proposal. I think this should be spelled out in the spec.

The proto spec doesn't say anything about the use of generic type aliases. I think the proposed syntax works fine with generic type aliases, but I recommend that you go through some examples and convince yourself that there are no problems here.

I agree that this PEP shouldn't attempt to tackle the ParamSpec case, but it's worth spending some cycles now to consider if/how the proposed mechanism could be extended for ParamSpec. I'm somewhat skeptical that it can (or should) be, but I'd be happy to be proven wrong here.

I'm unclear on why there's a need for a backward compatibility mechanism. I think the proposed syntax is backward compatible (that is, it doesn't require any grammar changes), and it doesn't result in any runtime exceptions when I run the samples. A few folks mentioned issues with introspection, but I'm not sure what problem they are referring to. Perhaps one of you can elaborate? Is the concern about runtime type checkers? They will need to be modified to know about the new mechanism anyway, so I don't think that's an issue. If we can avoid adding a new Map special form for backward compatibility, that's definitely preferable. If there is a need for a backward compatibility mechanism (for reasons I don't yet understand), then let's try to reuse Unpack rather than introducing a new Map special form.

Your first example (the one where you redefine the class map using the new mechanism) is problematic because it defines a type parameter F in terms of other type parameters *Ts and S that are not yet defined. That won't work. However, this example can be fixed with a small change. Here's how I think it should look:

class map[*Ts, S]:
    @overload
    def __new__(cls, func: Callable[[], S], /) -> Never: ...
    @overload
    def __new__(cls, func: Callable[[*Ts], S], /, *iterables: *Iterable[Ts]) -> Self: ...
    def __iter__(self) -> Self: ...
    def __next__(self) -> S: ...

I'll note that in this example, the first overload is problematic because a return type of Never does not imply that the overload is invalid. It simply means that if it's called, it won't return control back to the caller. There have been some discussions about providing a way for an overload to be marked "illegal", but there has been no resolution to these discussions so far. Anyway, that's outside the scope of this proposal.

You provided links to several other example use cases. If you haven't already done so, I recommend going through all of those examples and convincing yourself that the proposed mechanism addresses them all. If not, it would be good to understand why not.

My sense is that this PEP's acceptance will ultimately hinge on whether we can collectively convince ourselves that it provides sufficient value. It won't be a cheap feature to add to existing type checkers, and it will likely take one to two years before it is implemented in a majority of the major type checkers. To justify this cost, it should add a commensurate value to the type system. If the main intent is to simply clean up the typeshed definitions for map and zip, then it's not a good investment. If the feature is applicable to a much wider range of use cases, then it helps tip the scale in favor of acceptance. The PEP must make a compelling case for the addition of this feature. I don't think the current draft makes a very strong case, so think about ways you can bolster it.


It looks like it indeed already does work[1], but that's only for Python 3.11 and newer. 3.9 will still be around for another year and 3.10 for another two. The general process for typing changes, from what I understand, is that they first go into typing_extensions, so they can be used with any currently supported Python version as soon as the PEP is accepted.

It's also worth noting that what you get back is a types.GenericAlias for the unpacked type with __unpacked__ set to True. This is in contrast to *Ts, which gives you back Unpack[Ts]. While this may be good enough at first glance[2], it does still pose backwards-compatibility challenges the other way, because we can't construct the same object layout in pure Python, so now runtime analysis tools need to support both the backwards-compatibility object layout and what the parser currently generates. Not a blocking issue, but still worth taking into consideration.
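The object layout described above can be inspected with a short sketch. The starred_form helper is hypothetical; it relies on star-unpacking iterating the alias, and the TypeError fallback for interpreters older than 3.11 is an assumption about how those versions behave.

```python
from typing import get_args, get_origin


def starred_form(alias):
    """Return what `*alias` evaluates to, or None where that is unsupported."""
    try:
        (item,) = alias  # star-unpacking iterates the alias, yielding one item
    except TypeError:
        return None  # assumed behaviour before Python 3.11
    return item


starred = starred_form(list[int])
if starred is not None:
    origin = get_origin(starred)  # same origin as the plain alias
    args = get_args(starred)      # same args as the plain alias
    is_unpacked = getattr(starred, "__unpacked__", False)
```

On 3.11 and newer, origin and args look the same as for plain list[int]; only the __unpacked__ flag marks the difference, which is the "easy to miss" concern raised in the footnote.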


  1. at least if you don't want to support ParamSpec too, since it is not allowed to unpack for **kwargs

  2. although it is a little bit too easy to miss for my taste, since the standard typing.get_origin / typing.get_args won't clue you into anything special going on, if you get back an explicit Map or Unpack along the way that's definitely more user-friendly

Thanks for the response!

Speaking of unpack operators, the proto spec doesn't provide any updated rules about where unpack operators must appear, and where they cannot appear. These rules were pretty straightforward in PEP 646, but they get a bit more complicated with this proposal. I think this should be spelled out in the spec.

Agreed.

I agree that this PEP shouldn't attempt to tackle the ParamSpec case, but it's worth spending some cycles now to consider if/how the proposed mechanism could be extended for ParamSpec. I'm somewhat skeptical that it can (or should) be, but I'd be happy to be proven wrong here.

Agreed. I am also unsure whether or not it should be possible to transform ParamSpecs, but I will definitely look into ways in which it could be realistically specified, as leaving room for it in the future is a good idea regardless.

I'm unclear on why there's a need for a backward compatibility mechanism.

Maybe I don't quite understand, but how else could the mechanism be added to earlier versions (e.g. 3.9) without updating them?

If we can avoid adding a new Map special form for backward compatibility, that's definitely preferable. If there is a need for a backward compatibility mechanism (for reasons I don't yet understand), then let's try to reuse Unpack rather than introducing a new Map special form.

Agreed. I didn't even consider simply adding this capability to Unpack, but since all TypeVarTuples must be unpacked anyway, it would make sense.

Your first example (the one where you redefine the class map using the new mechanism) is problematic because it defines a type parameter F in terms of other type parameters *Ts and S that are not yet defined.

Good catch. I generally write Python code for compatibility between all supported versions (i.e. 3.8+ as of the writing of this post), so I am still getting used to the new generic declaration syntax.

You provided links to several other example use cases. If you haven't already done so, I recommend going through all of those examples and convincing yourself that the proposed mechanism addresses them all.

Good idea.

The PEP must make a compelling case for the addition of this feature. I don't think the current draft makes a very strong case, so think about ways you can bolster it.

Agreed. I threw this together in a couple of days and I haven't yet had time to research other use cases, but it's definitely on my radar.

I'm fine with the choice to leave ParamSpec out of scope, but I will add some examples of why this mechanism would be helpful for ParamSpec. This question is a very good example and I'll review it.

There are several libraries that have decorators that take a user-provided function that works on some Python objects and lift it to work on wrapped objects related to that library. The example in question comes from Ray, a library for distributing functions across processes/machines.

The key code idea is,

import ray

@ray.remote
def do_things(x: str, y: int):
    return x * y

The decorated do_things can now be used not only with the original argument types but also with remote versions, so ideally its signature would look like,

do_things.remote(x: str | Remote[str], y: int | Remote[int])

Here I'd want to be able to type hint ray.remote like,

type Lifted[T] = T | Remote[T]

def remote[**P, R](f: Callable[P, R]) -> Callable[Map[P, Lifted], R]: ...

tensorflow has a similar concept here, which also takes a function and runs it with argument types lifted to the original types plus DistributedValues variants of them. There are a few more examples of signature transformations in tensorflow, but on review the transformation looks tricky without other related type system extensions (some way to refer to the type of another function).

One simpler case I've also used is a decorator that replaces a function with one that works on config file/string versions of its arguments. The rough idea looks like,

def serialize_func[**P, R](func: Callable[P, R]) -> Callable[Map[P, str], R]:
  def _new_func(*args: str, **kwargs: str):
    # If curious, the actual way this works is runtime type
    # inspection to determine how to reconstruct each argument.
    deserialized_args = deserialize(args)
    deserialized_kwargs = deserialize(kwargs)
    return func(*deserialized_args, **deserialized_kwargs)

  return _new_func

At the moment that decorator can't maintain the original signature at all and has to just drop it.
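For readers who want something runnable, here is a minimal untyped sketch of such a decorator. It is not the actual implementation: it replaces the undefined deserialize helper with inspect.signature, and it assumes every parameter annotation is a class that can be called on a single string and that no parameter is positional-only.

```python
import functools
import inspect


def serialize_func(func):
    sig = inspect.signature(func)

    @functools.wraps(func)
    def _new_func(*args: str, **kwargs: str):
        # Match the string arguments to the original parameter names.
        bound = sig.bind(*args, **kwargs)
        # Assumption: each annotation is a class callable on one string.
        converted = {
            name: sig.parameters[name].annotation(text)
            for name, text in bound.arguments.items()
        }
        return func(**converted)

    return _new_func


@serialize_func
def repeat(word: str, times: int) -> str:
    return word * times


result = repeat("ab", times="3")
```

The wrapper accepts only strings, yet a type checker sees nothing of the original parameter names or count, which is the information a mapped ParamSpec would preserve.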

edit: My examples use Map mainly because I'm more used to thinking of it that way. There's also no clear unpack-like operator for ParamSpecs, unlike TypeVarTuple, so I'm unsure what,

def serialize_func[**P, R](func: Callable[P, R]) -> Callable[Map[P, str], R]:

could look like outside Map.
