A more useful and less divisive future for typing?

One of the reasons I don’t report such annoyances to VS Code is that even though it’s open source, my experience with their tracker is that it feels very much like a more commercial bug database[1] - discussions on principles, or “it would be nice if” feature requests tend not to get very far, or get bogged down in questions I can’t answer.

Also, this technically isn’t a VS Code issue, it’s something to do with the Python extension, but I don’t know if it’s the actual extension, or pylance (if that’s different), or the Python language server (if that’s even still a thing) or what. I don’t have enough motivation to spend the time needed to disentangle the web of inter-related projects involved. And as I’m not a Javascript/Typescript developer, I don’t know where to start looking at code for ideas as to what’s going on.

  1. This isn’t a complaint, I suspect it’s more of a necessity when dealing with a project of that size and nature. ↩︎


To my understanding, VS Code is open source, pyright (the Microsoft typechecker) is open source, but pylance (the VS Code extension bridging VS Code and pyright) is NOT open source. Additionally, it’s somewhat difficult to know which of these components is actually responsible for what functionality.


So, you may find it interesting that something that came up from the perspective of “correctness” in the ongoing intersections discussions would in fact have made this easier.

One idea that came up during the (still ongoing) discussions about adding intersections to Python, in the context of correctness, was to add a way to mark that something wasn’t to be treated as a type, but as if it were a protocol, without having to do all the legwork of writing a protocol yourself, along with adding default protocols for all of the builtin dunders in the data model. Only the latter would be needed for this.

This would give you list[SupportsDivmod] without defining any of those protocols yourself. If you needed more than divmod, it might look like list[SupportsDivmod & SupportsAdd]. Yes, this is still verbose, but all of it would be provided for you, and it is very clear about what it is and what something needs in order to be considered supported. If you wanted to support more than lists, Sequence[...] or (with the first of the two related ideas) AsProtocol[list[...]] would each work under this idea.
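To make that concrete, here is a hedged sketch in today's Python: SupportsDivmod is not actually in typing, and the `&` intersection operator does not exist, so the combination has to be spelled as one combined protocol via inheritance. All names here are illustrative only.

```python
from typing import Protocol, runtime_checkable

# Hypothetical: a SupportsDivmod protocol like the ones proposed for
# all the builtin dunders. Not part of typing today.
@runtime_checkable
class SupportsDivmod(Protocol):
    def __divmod__(self, other, /): ...

# Until an intersection operator exists, "SupportsDivmod & SupportsAdd"
# has to be spelled as a single combined protocol:
@runtime_checkable
class SupportsDivmodAndAdd(SupportsDivmod, Protocol):
    def __add__(self, other, /): ...

def halve_and_double(values: list[SupportsDivmodAndAdd]):
    # uses exactly the two behaviors the protocol declares
    return [(divmod(v, 2), v + v) for v in values]
```

Because the protocols are `runtime_checkable`, `isinstance(3, SupportsDivmod)` is true: any object with the right dunders quacks correctly.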

You also wouldn’t need to write as many protocols yourself.

# defining and documenting a type alias here:

#: A collection of methods we need
SupportsBasicMath = SupportsAdd & SupportsSub & SupportsMul & SupportsTruediv

def fn(x: SupportsBasicMath, y: SupportsBasicMath) -> SupportsBasicMath: ...
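For comparison, here is a runnable approximation in today's Python. The SupportsAdd-style protocols and the `&` operator are hypothetical, so the closest current spelling of such an alias is a single combined protocol:

```python
from typing import Protocol

# Hypothetical equivalent of the intersection alias above, written as
# one combined protocol since `&` doesn't exist yet.
class SupportsBasicMath(Protocol):
    def __add__(self, other, /): ...
    def __sub__(self, other, /): ...
    def __mul__(self, other, /): ...
    def __truediv__(self, other, /): ...

def fn(x: SupportsBasicMath, y: SupportsBasicMath):
    # any mix of the four declared operations is fair game
    return (x + y) * x - y
```

This works, but note the composability cost: the four pieces can't be reused separately the way named intersection operands could.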

Many of the people who want stronger typing are very aware that type hints are not obvious and require a lot of boilerplate from users and especially library authors currently, and want to help get them to where doing the right thing is easy. I count myself in this camp, and my experience with typing in other languages leads me to think that it’s extremely possible for both a notion of stronger soundness and better ergonomics to happen and that they actually support each other as goals.

I’m editing in another example for the “AsProtocol” bit here. Note that this doesn’t exist; it’s something that came up as making correctness easier while discussing the (also not yet accepted) intersections.

class MyClass:
    """Has a bunch of methods and properties."""

def my_fn(x: AsProtocol[MyClass]):
    """Duck typing works on this: it just needs to quack like MyClass,
    without me defining the exact same interfaces in a protocol too.
    That holds even if I later add an optional speedups extra that
    pulls in a native implementation I don't want to make mandatory,
    due to a reliance on a Rust compiler that may not be available
    on as many platforms."""

For me, “doing the right thing is easy” means inferring a protocol based on usage. So, for example,

def f(x):
    x = x + 1
    print(x)

should infer the type of x to be “something that supports the + operator, returning a value that can be printed”. Yes, this is basically duck typing. That’s the principle that Python was built on, and IMO it should be the starting point here. I accept that it’s not always feasible to do this much inference in the face of how dynamic Python can be, but it should at least be a target to aim for.

And even if it’s not viable in some cases to infer a suitable type for x meaningfully, working from the “other direction” and ensuring that the inferred types of everything that gets passed to f support + and print would still be useful.
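A runnable illustration of the duck typing such inference would have to capture, with `str()` standing in for printing so the result can be checked. The `Tally` class is hypothetical; the point is that anything whose `__add__` and `__str__` cooperate is accepted at runtime today:

```python
# `str()` stands in for print() here so the behavior is observable.
def f(x):
    x = x + 1
    return str(x)

class Tally:
    """A user-defined type that quacks just enough for f."""
    def __init__(self, n):
        self.n = n
    def __add__(self, other):
        # supports + with an int, returning something f can keep using
        return Tally(self.n + other)
    def __str__(self):
        return f"Tally({self.n})"
```

Both `f(41)` and `f(Tally(2))` work, which is exactly the set of callers an inferred “supports + with int, result printable” protocol would admit.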


To me, this looks possible in theory, but not possible without more rigor in the type system.

Edit: and for the record, the proofs of this being possible were not published until after the initial work on Python’s type system, so nobody here should be going “why couldn’t we do this from the start then?”

Here’s the biggest place I see an issue with it:

What happens if print is replaced (given that Python is dynamic, and there is no compiler seeing everything, determining types, and then spitting out a program that doesn’t change)? Should type checkers assume that any “dynamic” replacements like this will either only happen compatibly or aren’t their problem? (For the record, my answer to this would be “not the type checker’s problem, infer away”.)

If so, then we still need more rigor in the type system than we have now. This sort of duck-typed inference for untyped code is provably only safe with either the set-theoretic model of typing, a model of typing that doesn’t have nominal subtyping, or a model of typing that is allowed to error if a type isn’t determinable. As Python has nominal subtyping (subclasses exist and are supposed to be safe replacements), we would need to adopt a set-theoretic model of typing officially for this to be a safe inference or just accept that this still wouldn’t work some of the time. I think accepting a set-theoretic model would be preferable to just shrugging and telling users “idk, it’s not determinable”. (as discussed in the Intersections stuff, the set-theoretic model is already largely compatible with the status quo, but may require minor definition (but not behavior) changes to Any, Never, and Union)

If we did, untyped code could, instead of being treated as Any, be treated as an intersection of protocols automatically determined by use, erroring out only if those intersections were detected to be impossible due to conflicting usages (or resulting in Never, as an empty set, though this may be less useful and more inscrutable).

It would also require type checkers to then support this, but more academic rigor in the type system here actually opens the door to that as an option as well.


To be clear, I don’t think it’s practical to do this (at least not in any realistic sense). But I think it’s important to remember when thinking about making the right behaviour easy, that inferring everything so the programmer can essentially ignore any need to declare types is the ideal here. That’s what strongly typed languages like Haskell and Rust[1] aspire to, and I don’t think it should be any different for Python - we may not be able to get as close to that ideal in a dynamic language, but that’s OK.

Let me ask that question in reverse. If I’m fine with not worrying about print being replaced, what type annotation should I write to say “I want x to be anything that can be output with the builtin print function”? That would be more practically useful than worrying about making it easier to express “must be an arbitrary sequence of objects with the following 3 methods”.

Or taking the divmod example again - how do I say that my function accepts two arguments that I can validly pass to the builtin divmod without having to copy the monstrosity that’s the builtin divmod’s type annotation?
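For reference, a hedged, partial answer in today's typing might look like the sketch below: declare only the `__divmod__` side and ignore the `__rdivmod__` fallback (which is part of why the real builtin's stub is so complicated). The protocol name is invented for illustration.

```python
from typing import Protocol, TypeVar

R_co = TypeVar("R_co", covariant=True)

# Hypothetical: "a valid left operand of divmod, producing R_co".
# Deliberately incomplete; __rdivmod__ dispatch is not modeled.
class SupportsDivmodOf(Protocol[R_co]):
    def __divmod__(self, other, /) -> R_co: ...

def split(a: SupportsDivmodOf[tuple[int, int]], b: int) -> tuple[int, int]:
    # the annotation promises divmod(a, b) yields a (quotient, remainder) pair
    return divmod(a, b)
```

Even this sketch only handles one argument properly, which illustrates the complaint: there is no concise spelling of “two arguments I can validly pass to divmod”.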

  1. I’m less sure about Rust, which seems to lean more heavily on generics than on inference, but the net result is similar - you can often just “let the compiler get on with it”. ↩︎


Ideally? The lack of a type annotation should result in exactly the behavior you want here, and more rigor would enable this without significantly breaking what already exists. Explicit typing would then only be needed on base primitives, and on an exported (public) interface in a library to avoid issues of Hyrum’s law or the interface constantly changing; even there, the inference could be copied by the library author or just used as a starting point. Currently, you’d need a single protocol that defines all the behavior you need. In a future with intersections accepted, you could at least intersect the existing protocols.

If we take it that intersection will eventually be accepted (which I would not take as a given at this point in time), then eventually you could write this as

# covers x = x + 1
class AddableReturningSelf(Protocol):
    def __add__(self: Self, other: int) -> Self: ...

# could exist in typing given discussions
class SupportsStr(Protocol):
    def __str__(self) -> str: ...

def f(x: AddableReturningSelf & SupportsStr): ...


class WorksForF(Protocol):
    def __add__(self: Self, other: int) -> Self: ...
    def __str__(self) -> str: ...

def f(x: WorksForF): ...

The current solution may look more concise than the intersection one, but it isn’t composable, and some of what it defines might be provided by typing instead.
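The composability point can be shown in today's Python: without intersections, each reusable piece is its own protocol, and every combination needs a new named protocol composed via inheritance. All names here are illustrative.

```python
from typing import Protocol

class SupportsStr(Protocol):
    def __str__(self) -> str: ...

class SupportsLen(Protocol):
    def __len__(self) -> int: ...

# Today's stand-in for the hypothetical "SupportsStr & SupportsLen":
# a new named protocol per combination, built by inheritance.
class StrAndLen(SupportsStr, SupportsLen, Protocol):
    pass

def describe(x: StrAndLen) -> str:
    # only uses the two declared behaviors
    return f"{x} has length {len(x)}"
```

With an intersection operator, `StrAndLen` would not need to exist at all; the operands could be combined inline at each use site.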

This isn’t actually fully correctly typable currently while remaining open to all possible implementations of __divmod__ and __rdivmod__; even the current monstrosity referenced actually leaves out some possibilities.

Might I ask why you don’t think this is practical? If more rigor can (and it can) result in better untyped inference, shouldn’t it meet both the goals of those who want more rigor from typing and those who want to interact with typing less?

In the case of “adopt set-theoretic typing, infer untyped from use of typed around it”, you get the exact minimum bound of type compatible with your use without having to provide any additional types, and the type system itself remains largely unchanged: only stronger definitions, largely matching the existing ones, would be accepted, allowing such inference (if a type checker then does it).

The divmod one actually becomes correctly typable (it isn’t currently: there’s no way to fully pass a type’s implementation of this forward without presupposing the implementation, and the data model does not limit the return type of divmod to the same type).

Because “better untyped inference” isn’t the same as “100% inference”, and more rigour usually (in my experience) means needing to understand more advanced concepts. My (very limited) experience with Haskell suggests that when the type system can infer, things are great. But when it can’t, your brain explodes trying to help it. After all, we should assume that the computer is better at type theory than the average developer :slight_smile:

If more rigour can be achieved without making the underlying theory more complex (and I’m thinking of an audience for whom ideas like covariance and intersection are already hard), then maybe I’m wrong. I was after all only stating an opinion, I don’t have any specific evidence to back it up.


I doubt that the hypothetical ideal type checker would be fast enough if it needs to resolve all untyped variables - in all code paths - for all imports - all dependencies.

I wonder what the main underlying disagreements here are? Or what the values are that people want to achieve - ease for human devs, ease of code maintenance, safety/robustness of programs, provable correctness? I don’t believe those values – each of them valid concerns – can ever be totally reconciled.

I believe that the main value of type-hints (also the original pitch for having them at all - dating back to the BDFL’s post from 2004 (!) Adding Optional Static Typing to Python) was/is to make programming easier and make it easier to catch bugs for developers. Given the dichotomy in Python between static type-checking and runtime – where type-hints are ignored – I also don’t believe you can expect much more value from type-hints.

I think the underlying theory must be more complex, but not what Python developers need to understand. I believe that for anyone other than authors of type checkers, this can and should be done in a way that results in less required knowledge.

Those developing type checkers would need to be familiar with three additional concepts over what they currently need, all encompassed in a single field, and all having strong parallels to existing required knowledge. Namely:

  • The subtyping relation, as it applies to type narrowing, has a more specific definition that works well with sets of types.
  • Type specifications are sets of types.
  • The top and bottom types are sets of sets of types.

This being overall easier for most Python developers through added rigor would largely only be true because we already have structural subtyping available and do not have higher-kinded types (Haskell, your comparison of choice here, has higher-kinded types, which significantly expands the knowledge needed of developers when inference does not work). I would be extremely hesitant to accept higher-kinded types into the type system without first showing that user-facing error messages can explain exactly what information is missing for the type checker to make a decision, and current theory suggests (but has not proven either way) that this isn’t a decidable problem.

These can and should all be reconcilable, but there’s no point in even attempting it currently, because it requires changing fundamental things in the type system that weren’t rigorously defined to begin with. It creates more problems than it solves to just experiment here. To use an example that was already in the thread, under the set-theoretic model, here’s what would happen in terms of inference:

def f(x):
    x = x + 1
    # x is of an unknown type and it's the first time we're using it.
    # It needs to be compatible with addition,
    # while the returned type of the addition needs to be compatible
    # with itself. This makes the initial bound a protocol with this behavior.
    print(x)
    # x needs to support being printed.
    # This means having `__str__` defined, so add that in.
# x is out of scope, so we now know all the
# behaviors it needs in that scope and can apply that to f

You might notice this is identical to:

# covers x = x + 1
class AddableReturningSelf(Protocol):
    def __add__(self: Self, other: int) -> Self: ...

# could exist in typing given discussions
class SupportsStr(Protocol):
    def __str__(self) -> str: ...

def f(x: AddableReturningSelf & SupportsStr): ...

That matches what I described the type checker doing, and importantly, it already needs to be able to do all of those steps except collecting the behaviors and intersecting them to apply later. It already needs to understand what assignment does, what the + operator does, and what the types of called functions are.

For completeness, there’s another case here that changes things slightly and is worth comparing as well:

def f(x):
    print(x + 1)

In this case, we aren’t assigning the name again so we could infer one fewer behavioral bound on x. This needs to be differentiated so that the rules remain consistent in more complex cases, such as one where x was returned after addition and printing.

This changes the minimum bound to something closer to:

class SupportsStr(Protocol):
    def __str__(self) -> str: ...

class AddableReturningPrintable(Protocol):
    def __add__(self, other: int) -> SupportsStr: ...

def f(x: AddableReturningPrintable): ...

Note that this one is composed differently, and neither case covers the potential use of __radd__ with 1, even though this should probably be better expressible as well. I think this hole in what’s correctly expressible should be addressed too, but doing so before figuring out what the primary goals of typing should be going forward would not have productive outcomes.

There’s a huge difference between application developers and library developers here that we’re ignoring.

An application developer essentially wants to tightly constrain types:

  • This variable x will only ever be of type AppWidget.
  • This function is designed to take AppWidget objects and return their size, which is always an integer.

The application domain rules add a lot of certainty that allow clearly specified types, and “but what if someone passes something similar, but not quite the same” is easily answered - “that’s a bug”.

A library author, on the other hand, wants to define types as broadly as possible. They want to allow users to pass weird and wonderful types, because they don’t know the application context. Duck typing, and high levels of dynamism, are key tools for a library author. Even something as simple as defining a function as taking a str can prohibit a use case that might be important to someone - or at least require the user to add unnecessary boilerplate to implement a bunch of “don’t care” methods that neither the user, nor the library code, actually need.

I’m coming at this from a library author perspective. So for me, as soon as I say “this parameter is a Sequence[int]”, I’ll get someone who wants to pass a Collection (I actually don’t use reversal) of their user-defined integer class. And when I fix that to Collection[Integer], it turns out that they don’t implement attributes like numerator in their class, because their class isn’t a fraction, and they don’t care. And I don’t treat the integers I get passed as fractions, so nor do I. So what I actually want is a much more complex type that expresses “Collection of something that’s like an Integer, but I don’t care if it doesn’t implement the Rational API”. And that’s when we get those terrible, complicated, unreadable types that really just mean “as long as it works like what I expect, I’m fine with supporting it”…
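A hedged sketch of the library author's way out today: instead of Collection[Integer] (which drags in the whole Rational API), declare a minimal protocol covering only what the function actually touches. The `IntLike` name and `total` function are invented for illustration:

```python
from collections.abc import Iterable
from typing import Protocol

# Hypothetical minimal surface: the library only ever converts to int,
# so that's all it asks callers for. No numerator, no Rational API.
class IntLike(Protocol):
    def __int__(self) -> int: ...

def total(values: Iterable[IntLike]) -> int:
    return sum(int(v) for v in values)
```

This accepts plain ints, bools, and any user-defined integer class with `__int__`, without forcing anyone to implement “don’t care” methods. The cost, as noted above, is that the library author had to hand-write the protocol.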

So sure, application developers don’t need to understand the complexities of type theory. The trick is to ensure that library developers don’t, either - and that’s the bit I think is too hard to be practical.


I was considering both :slightly_frowning_face: .

The automatic inference I’ve described above only applies the minimum bound based on observed interaction with other things. It is as permissive as is observed.

Because of the way it is composed by observed behaviors, it would be entirely reasonable and not additional work (The tooling already needs to know all of this) for tooling to also show which behavior created which component if requested.

Considering this case again:

def f(x):
    x = x + 1
    print(x)

Not only is such diagnostic info possible, but it’s already how it would be internally composed.

# The type of x is:
(Protocol)__add__(T, int) -> T # from x = x+1
& (Protocol)__str__(T) -> str  # from print(x) 

(Note: I do think there should also exist a way to “prettify” this, but the absence of an annotation doesn’t insert this anywhere except in the innards of the tool. Maybe such tooling should also generate it, with a docstring, for your use and ability to name, but that’s even further into “make it more usable”, and depends on buying into the initial premise that most use-based typing is something we could do better at here.)

Meanwhile, if an application author wants to take this and make it more restrictive, the more restrictive option here is already obvious with entry-level knowledge, x: int. It’s then no longer being inferred, and a type checker can see that the more limited “just an int” does work for how it is used.

Edit: What I’m saying here is that if we take the more rigorous approach and say that things should be detected to the best ability under that additional rigor, the default case goes from “untyped code isn’t checked” to “untyped code is still checked for self-consistent use, and typed code may use it as long as the typed things passed in are consistent with what it is passed to”. In your complex case involving a Collection and a complex protocol, you would never have needed to write the type if the goal was allowing duck typing: it would have just worked, because the widest allowable interface would have been detected for you as the type you accept, and you wouldn’t have been led through list, then Sequence, then Collection, and then a protocol in that collection.
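A runnable sketch of the “self-consistency” point. No real type checker does this today, and the function is invented for illustration; the idea is that two uses of the same parameter can place demands that conflict for a given caller, which an inferring checker could report statically instead of leaving to runtime:

```python
# Hypothetical example: the two uses of x place separate demands.
def g(x):
    x.upper()        # this use demands a str-like .upper()
    return x + 1     # this use demands __add__ with an int

# str meets the first demand but not the second, so under use-based
# inference a checker could flag g("hi") statically. At runtime, we
# only find out when the second use executes:
caught = False
try:
    g("hi")          # "hi".upper() works, but "hi" + 1 raises TypeError
except TypeError:
    caught = True
```

A custom class with both `.upper()` and an int-accepting `__add__` would satisfy the inferred intersection, which is exactly the duck-typed flexibility being described.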


So. . . is there anything that Python needs to do here? What you said here sounds to me like “someone could write a different type checker and/or a different set of types (to be used in annotations) that would work better”. Is there some kind of support needed from the Python-language side that isn’t currently there?

This future sounds like a much better correspondence between Python-the-language and Python-the-type-system. If I am understanding you correctly, intersection inference like this reflects Python’s actual runtime duck typing by identifying what attributes are actually used (the “quack”). So with no annotations at all, a verifier’s default behavior could be to check for compatible usage of these inferred protocols. It would be possible to only add annotations for API documentation (like giving a protocol a more meaningful name), to resolve ambiguity, or to restrict a type to something narrower than what is inferred.

This sounds excellent and makes me optimistic.

A few things:

  • Formalizing the parts of the type system that aren’t formalized.
  • Explicitly choosing the set-theoretic model of subtyping. (Notably, this is entirely compatible with the current subtyping behavior that type checkers use, but it is not specified, and there are other definitions that could be chosen with different outcomes.)
  • Clarifying the definition of “Any” in Python typing, as the ability to subclass it that was added recently isn’t well defined.
  • Possibly adding either an explicit auto or Unknown, if we don’t want the behavior of untyped code to change for existing type checkers, or something accepting that type checkers may rely on inference for more than they currently do. If auto/Unknown are wanted, this would make the kind of inference I’m describing opt-in, but still relatively easy.

I’m optimistic that it’s possible and can describe a path that could get us to it, but I don’t think it’s an easy path, and it would require existing type-checker buy-in to not fragment the typing ecosystem. If more is “just detected” and some type-checkers aren’t buying into this, it’s going to lead to conflict between libraries that would prefer to use a type-checker that does this and the ones that would stick with the status quo. Maybe that’s okay long term and the ecosystem would self-resolve to what is more ergonomic and useful, but I don’t think it would be good for end users of the libraries on differing type-checkers in the short term without this actually being an intent to be supported by the type system.

Both of these points also feed into the desire for formalization that’s been expressed here, and for end-user stability in typing. That is, even if this is something that would be a large improvement for most users, we can’t just up and do it either.

Alternatively, you could say that a proof of concept implementation that assumed the changes to the typing system which are needed, would make an excellent demonstration of the power of this alternative approach. I imagine it would be much easier to get buy in from the typing community and the existing type checkers if there was working code that showed the benefits, as opposed to simply trying to sell the community on a potentially useful, but somewhat theoretical-sounding proposal.

Of course, I don’t have the first clue how much effort would be involved in writing such a proof of concept, so maybe this is an impractical suggestion.


I would struggle to give an accurate estimate here, but my attempt is that this would be sixteen 40-hour weeks (roughly 640 hours) for me specifically as a single developer, and that the work would not be easily divisible such that it decreases linearly with more developers. Anyone else working on this would need a prior understanding of type theory, or would add time studying what they are missing.

It’s not impossible for someone to do it more quickly if they had a good enough understanding of an existing type-checker (for python or otherwise) that they knew enough to know where to tweak it to get the desired behavior, but I don’t have that level of deep understanding of the code base of existing type-checkers.

Eric has mentioned elsewhere that pyright already constructs a virtual intersection internally in some cases, so if someone were to go this route, pyright might already have some of the mechanisms required, and it would just be about leveraging them in more places plus handling the creation of inferrable protocols, but I’m not sure.

All of that is to say, it wouldn’t be a small undertaking and I don’t know if I’d be able to put that much of my own free time into this.

I don’t think this would be the only potential benefit of formalization and intentionally choosing self-consistency in the type system, but it is an example of what self-consistency as a goal could enable longer term. I think we can appropriately move more of the complexity of Python’s typing into tooling and away from most library authors, application developers, and end users. I’ve seen many people argue that making typing easier and making it more formally correct are opposing goals, but I think many are predicting this on the assumption that more formality means users must be the ones being more formal, rather than that tooling could use the consistency in the system to know more at that level.

I’ll try to think about this more later, to see whether there’s either an easier path to show that these are being incorrectly assumed to be opposing goals, or a way I could work on this that would require less time.


But what form would that take? Are you envisioning changes to what is provided in the typing module? Or just documentation?

Perhaps I’m demonstrating my ignorance here, but my understanding is that typing as provided by Python (not mypy or other third-party tools) is essentially nothing but those two things: the typing stdlib module and documentation (in which I include standards such as those promulgated as PEPs). What I’m not understanding is where what you describe fits into that. Are you envisioning a PEP that says something like “we hereby adopt the set-theoretic model of subtyping”?

I’m not sure I see how that’s avoidable anyway. Even if some declaration were made, it can’t change what people are already doing with types. And it seems unlikely that existing tools would suddenly switch to a new way of doing things. So there will always be some period of coexistence of new and old ways.

This isn’t to say that what you’re describing is impossible, but just that I don’t see how the way to bring it about would be different from the way similar changes happen in other parts of the Python stdlib: someone makes a library called better_typing.py that defines some new set of types and/or type analysis tools, and someone writes a better typechecker that uses that system, and then eventually everyone realizes it’s so good it gets brought into the stdlib (or the existing typing tools are adapted to mimic it). To me, what you’re saying sounds more like you want some kind of official declaration that everyone should henceforth do typing a certain way. Am I misreading you?

So would the same amount of effort be needed if your desiderata above were realized? Would the 640 person-hours of work you describe be in effect a “down payment” on the work required to bring the system to full fruition (as opposed to being “wasted” on a proof of concept that can’t be fleshed out into a full-featured tool)? That sounds like a good deal of work! And if that amount of work is required just for a proof of concept, it seems like the benefits of the changes would need to be quite large in order to make it worth it to implement the full system.

Plus, personally, I think I’d be a little leery about a proposed change to Python itself at a stage where things are still 640 person-hours away from even being able to demonstrate the benefits. That sounds to me, again, like a case where the normal “put something on PyPI first” approach is more warranted — which is slow, but thereby allows things to proceed on a parallel track without impacting core Python functionality (or docs) until the gains are clearly established. Are there any existing Python typing tools that give us a glimpse of what you’re saying would be this better future?


+1 on that.

I’d specifically disagree with Michael’s assertion about “Some people …, and people who …”.
I clearly belong to a third or fourth class of Python users: I use type hints as hints. Python typing is whatever the Python object model allows, and folks, myself included, traditionally follow duck typing in their code. I’m happy to add type hints to my code, but that doesn’t mean I want things to be stricter. ¯\_(ツ)_/¯