A more useful and less divisive future for typing?

These can and should all be reconcilable, but there’s no point in even attempting it currently, because it requires changing fundamental things in the type system that weren’t rigorously defined to begin with. Experimenting here now would create more problems than it solves. But to use an example that was already in the thread, here’s what would happen in terms of inference under the set-theoretic model:

def f(x):
    x = x + 1
    # x is of an unknown type, and this is the first time we're using it.
    # It needs to be compatible with addition,
    # while the type returned by the addition needs to be compatible
    # with x itself. This makes the initial bound a protocol with this behavior.
    print(x)
    # x needs to support being printed.
    # This means having `__str__` defined, so add that in.
# x is out of scope, so we now know all the
# behaviors it needs in that scope and can apply that to f

You might notice this is identical to:

# covers x = x+1
class AddableReturningSelf(Protocol):
    def __add__(self: Self, other: int) -> Self: ...

# could exist in typing given discussions
class SupportsStr(Protocol):
    def __str__(self) -> str: ...

def f(x: AddableReturningSelf & SupportsStr): ...

Importantly, of the steps I described the type checker needing to perform, it already needs to be able to do all of them except collecting the behaviors and intersecting them to apply later. It already needs to understand what assignment does, what the + operator does, and what the types of called functions are.
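To make the “only new step” concrete, here’s a toy sketch (emphatically not a real type checker) of what collecting observed behaviors from a function body could look like. The class and variable names are mine, and it only recognizes the two patterns from the example above:

```python
import ast

source = """
def f(x):
    x = x + 1
    print(x)
"""

class BehaviorCollector(ast.NodeVisitor):
    """Records which dunder methods a parameter would need, based on use."""

    def __init__(self, param: str):
        self.param = param
        self.behaviors: set[str] = set()

    def visit_BinOp(self, node: ast.BinOp) -> None:
        # `x + something` means x needs __add__
        if (isinstance(node.left, ast.Name) and node.left.id == self.param
                and isinstance(node.op, ast.Add)):
            self.behaviors.add("__add__")
        self.generic_visit(node)

    def visit_Call(self, node: ast.Call) -> None:
        # `print(x)` means x needs __str__
        if isinstance(node.func, ast.Name) and node.func.id == "print":
            for arg in node.args:
                if isinstance(arg, ast.Name) and arg.id == self.param:
                    self.behaviors.add("__str__")
        self.generic_visit(node)

collector = BehaviorCollector("x")
collector.visit(ast.parse(source))
print(sorted(collector.behaviors))  # ['__add__', '__str__']
```

A real checker would of course work on its own typed IR rather than raw AST, and would record full signatures rather than method names, but the shape of the work is the same.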

For completeness, there’s another case here that changes things slightly and is worth comparing as well:

def f(x):
    print(x + 1)

In this case, we aren’t assigning the name again so we could infer one fewer behavioral bound on x. This needs to be differentiated so that the rules remain consistent in more complex cases, such as one where x was returned after addition and printing.

This changes the minimum bound to something closer to:

class SupportsStr(Protocol):
    def __str__(self) -> str: ...

class AddableReturningPrintable(Protocol):
    def __add__(self: Self, other: int) -> SupportsStr: ...

def f(x: AddableReturningPrintable): ...

Note that this one is composed differently. Neither case covers the potential for `__radd__` with 1, even though that should probably be expressible as well. I think this hole in what’s correctly expressible should be addressed too, but doing so before figuring out what the primary goals of typing should be going forward would not have productive outcomes.
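For anyone unfamiliar with why `__radd__` matters here, a quick runtime demonstration (this is Python’s operator dispatch, not type-checker behavior): Python tries the left operand’s `__add__` first, then the right operand’s `__radd__`. A class defining only `__radd__` (a made-up example class) works for `1 + x` but not `x + 1`, which is exactly the asymmetry the protocols above can’t express:

```python
class OnlyRAdd:
    """Supports being on the right of +, but not the left."""
    def __radd__(self, other):
        return ("radd", other)

x = OnlyRAdd()

# 1 + x: int.__add__ returns NotImplemented, so OnlyRAdd.__radd__ runs
print(1 + x)  # ('radd', 1)

# x + 1: OnlyRAdd has no __add__, and int.__radd__ rejects OnlyRAdd
try:
    x + 1
except TypeError:
    print("x + 1 raised TypeError")
```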

There’s a huge difference between application developers and library developers here that we’re ignoring.

An application developer essentially wants to tightly constrain types:

  • This variable x will only ever be of type AppWidget.
  • This function is designed to take AppWidget objects and return their size, which is always an integer.

The application domain rules add a lot of certainty that allow clearly specified types, and “but what if someone passes something similar, but not quite the same” is easily answered - “that’s a bug”.

A library author, on the other hand, wants to define types as broadly as possible. They want to allow users to pass weird and wonderful types, because they don’t know the application context. Duck typing, and high levels of dynamism, are key tools for a library author. Even something as simple as defining a function as taking a str can prohibit a use case that might be important to someone - or at least require the user to add unnecessary boilerplate to implement a bunch of “don’t care” methods that neither the user, nor the library code, actually need.
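As a small illustration of the `str` point (function and class names here are hypothetical, invented for the example): annotating a parameter as `str` rejects a duck-typed argument that works perfectly well at runtime, because it implements the one method actually used:

```python
def shout(s: str) -> str:
    return s.upper() + "!"

class LazyName:
    # Implements only the one method `shout` actually uses; all of
    # str's other methods are "don't care" for this call site.
    def upper(self) -> str:
        return "WORLD"

print(shout("world"))     # WORLD!
print(shout(LazyName()))  # WORLD! at runtime, but rejected by a type checker
```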

I’m coming at this from a library author perspective. So for me, as soon as I say “this parameter is a Sequence[int]” I’ll get someone who wants to pass a Collection (I actually don’t use reversal) of their user-defined integer class. And when I fix that to Collection[Integer], it turns out that they don’t implement attributes like numerator in their class, because their class isn’t a fraction and they don’t care. And I don’t treat the integers I get passed as fractions, so nor do I. So what I actually want is a much more complex type that expresses “Collection of something that’s like an Integer, but I don’t care if it doesn’t implement the Rational API”. And that’s when we get those terrible, complicated, unreadable types that really just mean “as long as it works like what I expect, I’m fine with supporting it”…
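A sketch of the kind of protocol this ends up requiring (all names hypothetical, and the choice of `__index__` as the “acts like an integer” hook is my assumption about what such a library might actually need):

```python
from collections.abc import Collection
from typing import Protocol

class IntLike(Protocol):
    """Integer-like, but with no Rational API (numerator etc.) required."""
    def __index__(self) -> int: ...

def total_low_bits(xs: Collection[IntLike]) -> int:
    return sum(int(x) % 2 for x in xs)

class MyInteger:
    """A user-defined integer class; no numerator, no denominator."""
    def __init__(self, value: int):
        self.value = value
    def __index__(self) -> int:
        return self.value

print(total_low_bits([MyInteger(3), MyInteger(4), MyInteger(7)]))  # 2
```

This works, but note how much machinery it takes just to say “something usable as an integer”, which is the complaint above.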

So sure, application developers don’t need to understand the complexities of type theory. The trick is to ensure that library developers don’t, either - and that’s the bit I think is too hard to be practical.


I was considering both :slightly_frowning_face: .

The automatic inference I’ve described above only applies the minimum bound based on observed interaction with other things. It is as permissive as is observed.

Because of the way it is composed from observed behaviors, it would be entirely reasonable, and no additional work (the tooling already needs to know all of this), for tooling to also show which behavior created which component, if requested.

Considering this case again:

def f(x):
    x = x + 1
    print(x)

Not only is such diagnostic info possible, but it’s already how it would be internally composed.

# The type of x is:
(Protocol)__add__(T, int) -> T # from x = x+1
& (Protocol)__str__(T) -> str  # from print(x) 
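A minimal sketch of the internal record a checker could keep to support this (the `Behavior` dataclass is hypothetical, just to show that provenance falls out of the composition for free):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Behavior:
    """One component of an inferred intersection, with its origin."""
    method: str
    signature: str
    source: str  # the expression that introduced this constraint

inferred_x = [
    Behavior("__add__", "(T, int) -> T", "x = x + 1"),
    Behavior("__str__", "(T) -> str", "print(x)"),
]

# Diagnostics can then trivially report which use created which component:
for b in inferred_x:
    print(f"{b.method}{b.signature}  # from {b.source}")
```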

(Note: I do think that there should also exist a way to “prettify” this, but in the absence of a declared type this representation only exists in the innards of the tool. Maybe such tooling should also be able to generate this as an annotation with a docstring, for you to use and name, but that’s even further into “make it more usable” territory, and depends on buying into the initial point that most use-based typing is in fact something we could do better at here.)

Meanwhile, if an application author wants to take this and make it more restrictive, the more restrictive option here is already obvious with entry-level knowledge, x: int. It’s then no longer being inferred, and a type checker can see that the more limited “just an int” does work for how it is used.

Edit: What I’m saying here is that if we take the more rigorous approach and say that things should be detected to the best ability under that additional rigor, the default case goes from “untyped code isn’t checked” to “untyped code is still checked that all of its use is self-consistent, and typed code may use it as long as the typed things passed in are consistent with what it is passed to”. As for your complex case involving a Collection and a complex protocol: you would never have needed to type it if the goal was allowing duck typing. It would have just worked, because the widest allowable interface would have been detected for you as the type you accept, and you wouldn’t have been led to list, then Sequence, then Collection, and then a protocol in that Collection.


So. . . is there anything that Python needs to do here? What you said here sounds to me like “someone could write a different type checker and/or a different set of types (to be used in annotations) that would work better”. Is there some kind of support needed from the Python-language side that isn’t currently there?

This future sounds like a much better correspondence between Python-the-language and Python-the-type-system. If I am understanding you correctly, intersection inference like this reflects Python’s actual runtime duck typing by identifying what attributes are actually used (the “quack”). So with no annotations at all, a verifier’s default behavior could be to check for compatible usage of these inferred protocols. It would be possible to only add annotations for API documentation (like giving a protocol a more meaningful name), to resolve ambiguity, or to restrict a type to something narrower than what is inferred.

This sounds excellent and makes me optimistic.

Formalizing the parts of the type system that aren’t formalized, specifically:

  1. Explicitly choosing the set-theoretic model of subtyping. (Notably, this is entirely compatible with the current subtyping behavior that type checkers use, but that behavior is not specified, and there are other definitions which could be chosen that would have different outcomes.)

  2. Clarifying the definition of “Any” in Python typing, as the ability to subclass it that was added recently isn’t well defined.

  3. Possibly either an explicit auto or Unknown, if we don’t want the behavior of untyped code to change for existing type checkers, or else something accepting that type checkers may rely on inference for more than they do. If auto/Unknown are wanted, this would make the kind of inference I’m describing opt-in, but still relatively easy.

I’m optimistic that it’s possible and can describe a path that could get us to it, but I don’t think it’s an easy path, and it would require existing type-checker buy-in to not fragment the typing ecosystem. If more is “just detected” and some type-checkers aren’t buying into this, it’s going to lead to conflict between libraries that would prefer to use a type-checker that does this and the ones that would stick with the status quo. Maybe that’s okay long term and the ecosystem would self-resolve to what is more ergonomic and useful, but I don’t think it would be good for end users of the libraries on differing type-checkers in the short term without this actually being an intent to be supported by the type system.

Both of these points also feed into the desire that’s been expressed here for formalization and for end-user stability in typing. That is, even if this is something that would be a large improvement for most users, we can’t just up and do it either.

Alternatively, you could say that a proof of concept implementation that assumed the changes to the typing system which are needed, would make an excellent demonstration of the power of this alternative approach. I imagine it would be much easier to get buy in from the typing community and the existing type checkers if there was working code that showed the benefits, as opposed to simply trying to sell the community on a potentially useful, but somewhat theoretical-sounding proposal.

Of course, I don’t have the first clue how much effort would be involved in writing such a proof of concept, so maybe this is an impractical suggestion.


I would struggle to give an accurate estimate here, but my attempt is that this would be 16 40-hour weeks (about 640 person-hours) for me specifically as a single developer, and that it would not be work easily divisible such that it decreases linearly with more developers. Anyone else working on this would need a prior understanding of type theory, or would add time to study what they are missing.

It’s not impossible for someone to do it more quickly if they had a good enough understanding of an existing type-checker (for python or otherwise) that they knew enough to know where to tweak it to get the desired behavior, but I don’t have that level of deep understanding of the code base of existing type-checkers.

Eric has mentioned elsewhere that pyright already constructs a virtual intersection internally in some cases, so if someone were to go this route, it would seem pyright might already have some of the mechanisms required for this, and that it would just be about leveraging it in more places + handling creating inferrable protocols, but I’m not sure.

All of that is to say, it wouldn’t be a small undertaking and I don’t know if I’d be able to put that much of my own free time into this.

I don’t think this would be the only potential benefit of formalization and of intentionally choosing self-consistency in the type system, but it is an example of what self-consistency as a goal could enable longer term. I think we can appropriately move more of the complexity of Python’s typing into tooling and away from most library authors, application developers, and end users. I’ve seen many people argue that making typing easier and making it more formally correct are opposing goals, but I think that prediction rests on the assumption that more formality means users must be the ones being more formal, rather than that tooling could use the consistency in the system to know more at that level.

I’ll try and think about this more later, to see if there’s either an easier way to show that these are incorrectly assumed to be opposing goals, or a way I could work on this that would require less time.


But what form would that take? Are you envisioning changes to what is provided in the typing module? Or just documentation?

Perhaps I’m demonstrating my ignorance here, but my understanding is that typing as provided by Python (not mypy or other third-party tools) is essentially nothing but those two things: the typing stdlib module and documentation (in which I include standards such as those promulgated as PEPs). What I’m not understanding is where what you describe fits into that. Are you envisioning a PEP that says something like “we hereby adopt the set-theoretic model of subtyping”?

I’m not sure I see how that’s avoidable anyway. Even if some declaration were made, it can’t change what people are already doing with types. And it seems unlikely that existing tools would suddenly switch to a new way of doing things. So there will always be some period of coexistence of new and old ways.

This isn’t to say that what you’re describing is impossible, but just that I don’t see how the way to bring it about would be different from the way similar changes happen in other parts of the Python stdlib: someone makes a library called better_typing.py that defines some new set of types and/or type analysis tools, and someone writes a better typechecker that uses that system, and then eventually everyone realizes it’s so good it gets brought into the stdlib (or the existing typing tools are adapted to mimic it). To me, what you’re saying sounds more like you want some kind of official declaration that everyone should henceforth do typing a certain way. Am I misreading you?

So would the same amount of effort be needed if your desiderata above were realized? Would the 640 person-hours of work you describe be in effect a “down payment” on the work required to bring the system to full fruition (as opposed to being “wasted” on a proof of concept that can’t be fleshed out into a full-featured tool)? That sounds like a good deal of work! And if that amount of work is required just for a proof of concept, it seems like the benefits of the changes would need to be quite large in order to make it worth it to implement the full system.

Plus, personally, I think I’d be a little leery about a proposed change to Python itself at a stage where things are still 640 person-hours away from even being able to demonstrate the benefits. That sounds to me, again, like a case where the normal “put something on PyPI first” approach is more warranted — which is slow, but thereby allows things to proceed on a parallel track without impacting core Python functionality (or docs) until the gains are clearly established. Are there any existing Python typing tools that give us a glimpse of what you’re saying would be this better future?


+1 on that.

I’d specifically disagree with Michael’s assertion about “Some people …, and people who …”.
I clearly belong to the 3rd or 4th class of Python users: I use type hints as hints. Python typing is whatever the Python object model allows, and folks, myself included, traditionally follow duck typing in their code. I’m happy to add type hints to my code, but that doesn’t mean that I want things to be stricter. ¯\_(ツ)_/¯


Specification (language level, currently just a collection of PEPs) describes the type system; type checkers (various) then choose what to do with it. The standards haven’t actually spelled out everything that’s important or relevant to a type system, and there are good reasons to not have diverging behavior here, at least not largely diverging.

No, anything I wrote here wouldn’t necessarily translate into the codebases of existing type-checkers.

Well, the changes I’d need are small and non-breaking, and only describe the type system, not runtime behavior. There’s no change to CPython needed, only to the specification of typing behavior. These changes are only to make one behavior specifically better defined, and specifically allow more from the type system.

  1. Define subtyping for Python using the set-theoretic model. This model is 1:1 compatible with how type checkers currently treat subtyping but also covers cases the type system does not yet consider, such as Intersections. (Non-breaking, but gives a stronger foundation for future work)

  2. State that type checkers are allowed (not required) to make complex type inferences based on consistency of usage even in untyped code and narrow Any based on observed use.

  3. Type checkers may (not must) detect incompatible use of Any due to inconsistent use.

For people not using a type checker, nothing changes. People using a type checker may get more errors detected, if the type checker they use can suddenly detect more issues. It would be on type checkers to continue finding the balance between useful errors and false positives spoken about above :smile:

Those are the only changes needed here, and I think that having these specified is a good thing even if this never materializes fully. It’s one possible benefit of having definitions that are stronger and enable more.

Having an agreed-upon definition for subtyping is pretty important for a type system (we don’t have one of these currently)

The second one opens the door for type checkers to do more with untyped code without violating, if not the letter of the specification, then its original spirit, which did not seem to envision a good case for doing so.

I think people overestimate what the required changes are for this, and it’s not even my primary goal in this discussion. It is to point out potential benefits we can gain over time by having better definitions and consistency in those definitions, and that stronger definitions do not inherently conflict with the other end of the spectrum on the usage of typing.


I’m sorry, I was on vacation last week and have had to skip some discussions. In this thread there’s a lot of reference to a “set-theoretic” type system for Python (IIUC it is Michael H’s proposal?) but from the one example given (about a function that uses x = x+1; print(x)) I cannot understand the proposal (other than that it might be going back to duck typing, and proposes a technique that AFAIK is used by pytype). Maybe someone can link me to a thread or blog post where this is discussed in more detail?

The example was mine, and it was meant to counter the idea that making the type system more “set theoretic” would mean people would have less need to deal with complex types. My intention was precisely to make the point that type inference isn’t going to be able to handle duck typing, so there will always be a need for people to explicitly write types, and therefore being able to express “reasonable”[1] type constraints concisely and understandably is important.

Sorry - my example wasn’t particularly good, and my follow-up didn’t help much as I rapidly got out of my depth with the talk of set-theoretic type inference…

  1. whatever that means! ↩︎

I would say: Go for it. Why not? Put up a description of the system or, better, an alternative, improved implementation of the typing module (fixing some of the inconsistencies you have pointed to on GitHub). Then everyone can determine whether this actually works in practice and improves the current system.

The current discussion seems too abstract (too futuristic?) to me. (It’s getting close to the level of abstraction in David Beazley’s PyCon talk - 2021 - “No, not typing. Types.” Subtitled “a tone poem” - You can find it on YouTube. Scroll to about 37:20 for the TL;DR. “What was that?!” indeed :slight_smile: It’s a very interesting and amusing talk, definitely relevant to “typing”, I think, though I cannot really say I know what it was about, even after watching the whole thing.)
(Beazley also makes some interesting comments about exceptions, asked about a comparison with exception handling in Rust: there is this strange thing that exceptions in Python seem to fall completely outside the typing stuff – at the very end of the talk: 46:40.)


I’d like to go back to the very title of this post and dig into something that was said in an earlier exchange. Paul called out the distinction between app developers and library developers – and he was under the impression that library development had been omitted as an important use case. It seems that was a misunderstanding, and I’m sure it can be cleared up further by diving deeper at a technical level. But. That exchange was born out of a very frequent situation, in which advocates of a feature or change in typing speak primarily from the perspective of application developers and ignore or do not prioritize the needs of library developers.

I recall this in particular happening with Unpack[TypedDict] for kwargs, and again happening in a small way with StrictTypeGuard/TypeNarrower.

In pursuit of a “less divisive” future, I would like to see PEPs more directly address library author use cases. That will also have the beneficial effect of socializing the norm that “library code is also relevant”.
That could be as simple as a statement that a feature does not have obvious utility for libraries, but we are hopeful for future usages, etc. Discussing the usages for typing behaviors in these two different contexts – application vs library – is, IMO, sure to be beneficial in almost all cases.


I was talking specifically about @mikeshardmind’s “set-theoretic” suggestions (as I understood them).

In terms of general typing, I think library development is covered fine. Sometimes I think people forget that library developers might want to define a permissive (in a typing sense) API, and doing so involves fairly “advanced” typing features. I say this because I’ve heard people state that “you shouldn’t need complex typing features in typical code” (or words to that effect). But the intention to cover library usage is clearly there, and I don’t want to complain just because people have a different perspective than I do.

In terms of @mikeshardmind’s proposal, he seemed to be suggesting that type systems should be able to infer most things without the developer’s intervention. In saying that, I feel that he had missed the library use case, and my “duck typing” example was an attempt to point that out. As a better example (I hope!) I don’t see how it’s even theoretically possible for a type checker to infer a useful type for

def low_bit(x):
    return x % 2

The library author simply has to state their intent if the function is to be assigned any sort of usable type. After all, def low_bit(x:str) -> str is technically valid, but clearly not the author’s intent, here… And that brings us back to the type system having to make it easy for developers to express their intended types.
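To make this concrete at runtime: both of the calls below “work”, which is precisely why no checker could infer the author’s intent from the code alone:

```python
def low_bit(x):
    return x % 2

print(low_bit(5))     # 1
print(low_bit("%d"))  # '2' -- str's % operator is formatting, so this is
                      # "valid" too, just not what the author intended
```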


In isolation that seems definitely possible to me - and @mikeshardmind also suggested a way to do so - kind of reminiscent of the way Traits work in Rust (though Traits are not inferred). You would not infer one given native type, but a set of conformant types or traits which itself would be part of the type algebra. But… If the type inferencer has to do this in the context of a complete program, where the input x also needs to be traced back, wouldn’t the general problem quickly lead to an explosion of exponentially many possible code paths? So, could we have a program that is both correct and fast enough? I doubt that very much. But that’s a challenge to Mike :slight_smile: (As far as I understood it the “intersection” problem would come into play here but I’ve a gut feeling that that maps to 3SAT or some other NP-complete problem…)

Despite this - having a type inferencer of some sort, even if not complete, or fast, might make tools like mypy more useful. But to see that, someone needs to write some prototype…


To be clear, I said “useful”. I would intend low_bit("%d") to fail to type check, not to infer a return type of str. Making that distinction is what I consider “not even theoretically possible” as there’s no way a type checker could know that I don’t want to include string-formatting meanings of the % operator.

Maybe I’m not being clear enough. My point is that the developer has to be the one to state distinctions like this, and to do so without over-constraining requires complex types. The type checker can’t infer stuff like this. Maybe intersection types would make expressing things like this easier, but I’m not yet convinced - I want to express “integer-like type with a % operator which will take an actual int on the RHS”, not “Any minus str”…

As to why do I want to express this? The fundamental use case I’m thinking of is low_bit(x: int) -> int, but I don’t want to prevent people from using my library in ways I hadn’t expected. So I don’t want to over-constrain, because I don’t know what my users might want to do.

We’re very close to duck typing here, and honestly I’d be fine just leaving an API like this untyped so that duck typing can do its job. But people then ask for type annotations, so we come full circle. And at this point I’m very much repeating myself, so if you’re still not clear what my point is, we’ll probably just have to agree to differ.


It would not need to do so. The type checker would see that the modulo operator is applied, so for this, any-type-that-supports-modulo is the inferred “type” (in Rust this would be a non-inferred Trait). The type checker then tries to resolve those “types” in the context of the current program (however “program” is defined). For instance, somewhere else the program calls low_bit(x) where x is a list; the checker knows that lists are not part of that “any-type-that-supports-modulo”, so it can decide there is a conflict.
If lists supported a modulo operator, they would not be excluded.
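The “any-type-that-supports-modulo” idea can be sketched today with a `runtime_checkable` protocol. Note this is only a structural presence check on the method, not a full static analysis, but it shows which types would and wouldn’t qualify:

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class SupportsMod(Protocol):
    def __mod__(self, other): ...

# ints qualify, and -- notably -- so do strings, since % is formatting:
print(isinstance(5, SupportsMod))     # True
print(isinstance("%d", SupportsMod))  # True
# lists don't, so passing a list could be flagged as a conflict:
print(isinstance([], SupportsMod))    # False
```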

The thing is also, if you don’t want to restrict usage, then you just don’t add type-hints. No one is arguing for no longer supporting duck-typing. But you cannot have it both ways here (you can of course in Python, I’m just being emphatic). It doesn’t really make sense to me to say: “I’ll restrict this to ints”, and also “I don’t want to exclude strings”. (Already if you do this now, you’ll run into trouble with mypy.)

So, I regard this as an unfortunate effect of typing as it stands: a Python programmer can indeed give type hints for a particular type while also wanting to give people the freedom to ignore those hints. I know this works, due to the dichotomy between static typing and runtime, and I know it’s baked into the current system. But when I see that a big external library (pandas, numpy) uses particular types, I do see them as a contract I have to follow and can only ignore at my own peril…

But I have never seen a concrete example of a widely-used library, like the ones I mentioned, where the addition of type hints was not intended to constrain the supported types of function arguments.

I’m not sure, but the phrasing of your response suggests you may be missing the fact that strings do support a modulo operator. "%d" % 2 is valid and evaluates to the string '2'.