PEP 827: Type Manipulation

This is a discussion thread on PEP 827 – Type Manipulation.

The motivation for this PEP comes from a gap between Python and TypeScript: Python has an incredibly dynamic and powerful runtime, while TypeScript has an incredibly expressive and powerful type system. I’ve repeatedly found myself wishing I could lean more heavily on Python’s metaprogramming and overall expressiveness while still keeping APIs fully type-safe. Figuring out how to bridge this gap has been a year-long journey for @msullivan, @dnwpark, and me.

Please let us know what you think! The rendered PEP is here.

I love this! It will enable so many features in things I’ve built or wanted to build. :rocket:

I think it could greatly improve the developer experience for users of libraries that take advantage of this.

I can see several areas where I could use it in features for FastAPI, SQLModel, Typer… even things I haven’t been able to build because there was no way to give a good UX.

It also enables other libraries and ideas I’ve had, e.g. for ML libraries, distributed task systems (like Dask, Ray, Spark).

I’ve been contemplating an “enum transformation” PEP to have a dataclass_transform equivalent for enum typing support. But my suspicion is this PEP won’t negate the need for such a PEP due to:

  • Members[T]: produces a tuple of Member types describing the members (attributes and methods) of class or typed dict T.

In order to allow typechecking time and runtime evaluation to coincide more closely, only members with explicit type annotations are included.

Is that correct?
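
To make the question concrete, here is a small runtime sketch of the distinction the quoted rule draws. It is not the PEP’s Members operator; it only uses the existing typing.get_type_hints to show which class attributes carry explicit annotations today. The Color and Point classes and the annotated_names helper are made up for illustration.

```python
import enum
import typing


class Color(enum.Enum):
    # Enum members are plain assignments, so they carry no annotations.
    RED = 1
    GREEN = 2


class Point:
    x: int        # annotated, no value
    y: int = 0    # annotated, with a default
    z = 0         # assigned but not annotated


def annotated_names(cls: type) -> set[str]:
    """Return the attribute names that have explicit annotations on cls."""
    return set(typing.get_type_hints(cls))


print(sorted(annotated_names(Point)))   # ['x', 'y']
print("RED" in annotated_names(Color))  # False
```

If Members[T] follows the same rule, enum members such as RED would presumably be invisible to it, which is the concern raised above.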

We use lots of TypeScript at work, and ‘man, I wish this existed in Python’ is not the impression TypeScript’s box of tricks has given me.

In my experience, if your logic is dynamic enough to need, for instance, conditional types, then it’s probably not worth trying to type it statically.

I’m concerned that these sorts of tools add more ways to do the same thing, in usually a worse way.

For instance,

from dataclasses import dataclass

@dataclass
class A:
    dogs: list[str]

@dataclass
class B:
    dogs: int

def f(y: A | B):
    ...

is the sort of function that might like to use conditional types in its implementation, but it should generally instead be written

from dataclasses import dataclass

@dataclass
class Base:
    dogs: list[str] | int

@dataclass
class A(Base):
    dogs: list[str]

@dataclass
class B(Base):
    dogs: int

def f(y: Base):
    ...

(and then you take a quick look at Base, realise that dogs are different things, and fix your dodgy structure.)

TypeScript code often ends up like this, defining abstractions as unions of disparate classes rather than using inheritance hierarchies, because it gives you a large tool-set for working with the former. However, it’s generally poor architecture because, contravening the dependency inversion principle (DIP), it makes the general case depend on the specifics, coupling them all together.

In TypeScript, that’s somewhat inevitable, because it has to work with (a) JavaScript’s funny ideas about Objects, and (b) the general tendency of JavaScript code to end up as a bit of a tangled mess, needing type system super-weapons to navigate. Python doesn’t have the same issues, and so I’d be cautious of adopting ideas from there.

Static typing is inherently conservative: there will always be programs that won’t type check, yet are logically correct. At some point, the expressiveness gained from new type system intricacies is not worth the complication they bring. The language might not be quite at that point yet, but I don’t think it’s far off, and I’d worry this proposal takes it over that line.

Unfortunately that is the case, yeah, at least as specified.

There is probably some wiggle room in the design space, though, if we really wanted to support that. Possibly things declared in the class body could make it into Members with something indicating no type annotation?

We could also have an operator that gets inferred stuff and won’t work the same (at all?) at runtime, but I’d prefer to avoid that.

As much as I’d like to have more capabilities in the Python typing system to express complex/dynamic/derived types (like TypeScript does), the PEP feels really ambitious to me, and introduces many (around 20!) special forms (that would presumably be added to the typing module). It is trying to solve many current typing limitations that are presumably independent of each other and could be solved in separate PEPs.

In most examples, the syntax looks quite complex to follow, mainly because it relies on the existing Python syntax (subscripting with [...]). If PEP 9999 – AST Format for Annotation Functions [1] were to be tackled first, maybe it would allow for a much simpler, more idiomatic syntax?

I’d be happy to review the PEP in more detail; I just wanted to raise some concerns, as I feel an extreme number of edge cases will have to be addressed.


  1. See also the related discussions on this forum. ↩︎

The scale of the change is also a bit scary to me, but we can also see it as a chance to make the type system radically more powerful. The PEP process is slow, and this lets us batch a lot of improvements at once.

To move this forward, you should make sure the PEP gets implemented in typing-extensions and at least one type checker so people can play with it and we can get a sense for the edge cases.

The PEP currently targets Python 3.15, which has a feature freeze in about two months. I think that’s unlikely to happen; there’s simply too much here to digest and implement in two months.

The PEP currently targets Python 3.15, which has a feature freeze in about two months. I think that’s unlikely to happen; there’s simply too much here to digest and implement in two months.

FWIW I think this can be decoupled from Python release cycle. We can add the types/APIs to typing_extensions asynchronously, and basically the PEP can be “shipped” when that’s on PyPI and the main typecheckers agree that that’s the way forward. @msullivan has a mypy implementation that’s pretty far along as a POC.

I don’t want to come across any more negatively on this than the default, especially as I’d appreciate the features proposed here. But looking at the scope and certain interactions with the existing unsound foundation of the type system, along with my experience reconciling such inconsistencies in a smaller proposal, this seems overly ambitious as something we can extend the type system with currently.

In particular, the part about intentionally being less strict than TypeScript, due to other things we don’t currently have, concerns me, as I find TypeScript to already be too permissive by default for my needs.

I like the idea behind the two prerequisites: typevars in Unpack and extended callable syntax. I have wished for both previously.

For the extended callable syntax, my first impression is that it seems overly verbose, but maybe this could just be the initial way to represent the concepts before we introduce something like PEP 677 to have a nicer syntax.

For the typevars with TypedDict bounds, I personally would get more out of it if the KeyOf and GetMemberType operators that are mentioned as future work were supported. I’d use them in something like this:

# Proposed names only: none of these exist in typing today.
from typing import BaseTypedDict, GetMemberType, KeyOf

class Series[T]: ...

class DataFrame[TD: BaseTypedDict]:
    def __getitem__[K: KeyOf[TD]](self, col: K) -> Series[GetMemberType[TD, K]]: ...

which would allow type annotating a DataFrame-like object.
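
For contrast, here is a rough sketch of what this takes today without such operators: one overload per column, keyed on Literal strings. Series and PeopleFrame are hypothetical stand-ins with a hard-coded schema, not a real DataFrame API.

```python
from typing import Generic, Literal, TypeVar, overload

T = TypeVar("T")


class Series(Generic[T]):
    """Minimal stand-in for a typed column."""

    def __init__(self, values: list[T]) -> None:
        self.values = values


class PeopleFrame:
    """Hypothetical frame with a fixed schema: name -> str, age -> int."""

    def __init__(self, name: list[str], age: list[int]) -> None:
        self._columns: dict[str, list] = {"name": name, "age": age}

    # One overload per column: the repetition that KeyOf/GetMemberType
    # would presumably let a generic DataFrame avoid.
    @overload
    def __getitem__(self, col: Literal["name"]) -> Series[str]: ...
    @overload
    def __getitem__(self, col: Literal["age"]) -> Series[int]: ...

    def __getitem__(self, col: str) -> Series:
        return Series(self._columns[col])


frame = PeopleFrame(name=["Ada", "Guido"], age=[36, 41])
ages = frame["age"]  # a checker can infer Series[int] from the overloads
print(ages.values)   # [36, 41]
```

This works, but the overloads have to be written by hand for every schema, so it does not scale to a DataFrame that is generic over an arbitrary TypedDict.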


I have to say though that I never encountered the need for the other typing concepts proposed here, though I admit I never used TypeScript so maybe I was simply not aware what I was missing out on.

This seems like something that could be published first as an extension to a type checker (presumably mypy) and gain some real world use before it gets committed into a standard?

Not going to lie, this is not a compelling example to me. I don’t want to see this in code, scroll past it, or implement a static evaluator in a type checker to calculate the result (that last one is probably the most valid feedback, and yes, my job used to involve implementing static evaluators in type analysis engines for Python :wink: ).

Perhaps what we really need is a better extension model for type checkers, so that libraries that need this level of complexity can provide something separate from their code but still in the repository/package and accessible?

This seems like something that could be published first as an extension to a type checker (presumably mypy) and gain some real world use before it gets committed into a standard?

You can’t get any real “real world” use out of this – sure, you might be able to check typing in CI with a plugin but then your IDE experience is all red underlines. You can’t ask users to participate in an experimental project “hey, here’s a new API I made, please use it with all these restrictions, I need to gather data for Python core devs”. This is just impractical and will not give you any actionable data.

Perhaps what we really need is a better extension model for type checkers, so that libraries that need this level of complexity can provide something separate from their code but still in the repository/package and accessible?

Typechecker extensions are 100% not the way to go. We will not arrive at a future where all typecheckers implement one plugin API; this just won’t happen (mypy is in Python, ty is in Rust, pyright is in TypeScript). But let’s assume it did:

  • Now your library requires a plugin
  • As a library author your job is tough – things go wrong and you have to debug your code base AND your plugin. Not to mention the author will now have to maintain two disjoint pieces of code: the library and the plugin that works with it. So much friction.
  • As a library user your life is tough – things go wrong and it’s unlikely you will be able to debug and test the plugin
  • Plugins/extensions will not reduce the complexity, they will in fact greatly increase it for everyone involved (library authors, typechecker maintainers, users).
  • Going down this path basically pronounces that we are giving some of the control over how the Python language evolves to typechecker maintainers. Python will become severely tied to typecheckers’ capabilities and extension APIs. This will add friction and confusion.

Last but not least… typecheckers already have an “extension API” they all have to support: that’s the typing module! Let’s enhance the existing API instead of inventing a new super low-level one.

Not going to lie, this is not a compelling example to me. I don’t want to see this in code, scroll past it, or implement a static evaluator in a type checker to calculate the result (that last one is probably the most valid feedback, and yes, my job used to involve implementing static evaluators in type analysis engines for Python :wink: ).

In all fairness it’s highly unlikely you’d see that code! You will likely see something like this:

def select[ModelT, K: typing.BaseTypedDict](
    typ: type[ModelT],
    /,
    **kwargs: Unpack[K],
) -> ComputeSelectReturnType[ModelT, K]:
    raise NotImplementedError

And if you’re curious how typing is plugged in here, you’d ctrl+click on the ComputeSelectReturnType type and learn. For ORMs specifically, the complexity of dealing with databases, generating SQL, generating migrations and reflecting tables to Python classes absolutely dwarfs the complexity of the little typing code we propose to add. I’m saying this as someone who has spent their career messing with dynamic code like this.


I’d like to steer this discussion to a place where we don’t judge this proposal from the position of feelings of complexity. Complexity is there, for sure, but in reality this PEP isn’t making an unimaginable leap of faith. Instead it does two things: it allows if/for in type expressions, and it adds a number of primitives to the typing module. I’m not saying this is trivial, it’s far from it, but this is nowhere near the complexity of Python’s metaclass machinery or of adding something like yield/async/await to the language. All I’m saying is that gut feelings and “I don’t want to see this in code” will not get us far. Not trying to pick on your comment Steve (I like you a lot!) but I already see this kind of sentiment in some discussions and I’d like to see less of it, as it’s not actionable.

I agree with not adding plugins, but I think the argument about the syntax concerns shouldn’t be out of scope here either.

If we need new syntax, or if we need to have special forms that take quoted input to allow typing constructs to not be bound by Python syntax (much like TS types aren’t bound by JS syntax) to ensure good ergonomics, I think either of those is going to be preferable to the syntax proposed here.

“Feelings of complexity” should be a valid basis for judging proposals when the goal here is improving developer ergonomics. If it feels like “syntax soup”, as I’ve heard it called elsewhere, people are going to be averse to using it, and it will artificially raise the barrier to learning it compared to syntax that better reflects the type concepts rather than just being shoved into what is currently accepted in an annotation context.

There’s also significant actual complexity in the way this set of features here will interact with what is already in the type system, syntax aside. When experienced developers who have worked with types and type systems have “feelings of complexity”, it’s often just a mental heuristic of experience reflecting the fact that there are hidden dragons.

I think using normal Python functions to do these kinds of type transformations should be reconsidered, as I think that’s definitely the best option. Yes, it is quite complex to implement, although I am unsure why it would be significantly more complex than the current proposal. And it would, with a bit of care, make the code readable in a way that the current proposal never will be.

This is e.g. how the select example could look.

def compute_select_return_type(BaseModel, UsedMembers: typing.BaseTypedDict):
    result = typing.NewProtocol()
    for key in typing.Attrs(UsedMembers):
        member_type = typing.GetMemberType(BaseModel, key.name)
        field_type = typing.ConvertField(member_type)
        result.add_member(key.name, field_type)
    return result

def select[ModelT, K: typing.BaseTypedDict](
    typ: type[ModelT],
    /,
    **kwargs: Unpack[K],
) -> list[compute_select_return_type(ModelT, K)]: ...

This could look even better and more Pythonic if some of the special forms/functions were renamed and if the typing. prefix were dropped, but I think it’s already easier to understand what is going on.
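
For comparison, a function-style transformation like this already works at runtime with today’s typing module; what is missing is a way for static checkers to follow it. The sketch below is only a runtime analogue, not the proposal above: User and pick_fields are hypothetical, while the functional TypedDict syntax and get_type_hints are existing APIs.

```python
from typing import TypedDict, get_type_hints


class User:
    id: int
    name: str
    email: str


def pick_fields(model: type, *names: str) -> type:
    """Build a TypedDict type containing only the named fields of model."""
    hints = get_type_hints(model)
    return TypedDict(f"{model.__name__}Selection", {n: hints[n] for n in names})


UserSelection = pick_fields(User, "id", "name")
print(UserSelection.__name__)         # UserSelection
print(get_type_hints(UserSelection))  # {'id': <class 'int'>, 'name': <class 'str'>}
```

Runtime tools can consume UserSelection directly, but a static checker generally cannot evaluate pick_fields, which is the gap either design would have to close.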


My opinion on this is that the mini language currently in place is well suited for declarative operations - but “if/for” as used here is definitely imperative, and I think this difference is enough to consider moving out of the mini language.

IMO:

  • Complexity for type checking implementers is less important than
  • Complexity for people writing type expressions is less important than
  • Complexity for people using a typed library.

I agree that this proposal successfully reduces complexity for the last group, but I believe that we can reduce the complexity for the second group significantly by moving some of it into the first group.

Python gets lots of praise for readability. A core part of this is the indentation-based syntax, forcing separation of ideas onto separate lines. I think it would be a good idea to add some of this readability to typing instead of adding more and more special cases into the inline syntax. At some point it’s going to break. I don’t think this PEP is quite it, but I think this is a good place to jump off and do something better long term.

If we need new syntax, or if we need to have special forms that take quoted input to allow typing constructs to not be bound by Python syntax (much like TS types aren’t bound by JS syntax) to ensure good ergonomics, I think either of those is going to be preferable to the syntax proposed here.

Keep in mind that, unlike TypeScript, annotations in Python must be runnable at runtime for frameworks like Pydantic, FastAPI, and others to make sense of them. This adds a big constraint on any new syntax we add to annotations or anything else, really.

This would require typecheckers to implement a Python runtime too, and that’s highly non-trivial. This shifts the complexity and makes it someone else’s problem, which in reality will mean that we’re just not solving this problem at all and stalling.

I’m well aware. To be clear, I don’t mean quoting invalid syntax; I mean something more like:

SomeForm["Something that wouldn't be python syntax and is only for the benefit of types"]

This would evaluate just fine at runtime, though anything of this nature would likely also necessitate that the form accepting special quoted syntax like this has a method on the SpecialForm to canonically parse it into something structured for uniform use by runtime tools.
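
A minimal sketch of that shape, under loud assumptions: SomeForm, its parse method, and the pipe-separated mini-syntax are all invented here for illustration, and a real design would need an actual grammar.

```python
class SomeForm:
    """Hypothetical special form that accepts quoted, non-Python type syntax."""

    def __init__(self, source: str) -> None:
        self.source = source

    def __class_getitem__(cls, source: str) -> "SomeForm":
        # Subscripting only stores the string, so it always evaluates at runtime.
        return cls(source)

    def parse(self) -> list[str]:
        # Toy "canonical parser": split a pipe-separated list of type names.
        return [part.strip() for part in self.source.split("|")]


def describe(x: SomeForm["int | str"]) -> None: ...


# A runtime tool reads the annotation and asks the form to parse itself.
form = describe.__annotations__["x"]
print(form.parse())  # ['int', 'str']
```

The point of the parse method is that every runtime consumer gets the same structured result instead of re-implementing the string syntax.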

I do wish we had kept the original annotations future feature and had all annotations just be strings, as this also would have allowed a better path for things like this. We could have had a function in typing that just parses any string as types rather than as a Python expression.

It would require implementing some parts of Python’s runtime, as does the current proposal. Yes, it would be a larger part, but I think we can define a reasonable subset and expand it later if the need arises.

Type checkers already need to implement parsing and scoping rules, as well as having a decent understanding of what each statement does.

True, which I explicitly stated.

I think adding a bad solution to a problem is worse than not adding a solution. I am not sure if the current proposal is bad enough for this to apply, but I do think using normal Python functions would be better.

Just noting: typically we refrain from stamping proposals this way in our discussions. I think @msullivan will reply to your comment in more detail. But please be more careful with how you express your personal opinion here.

This discussion, the related draft PEP, and a similar draft PEP I had all express a similar idea.