A more useful and less divisive future for typing?

I look forward to it and hope it helps with at least some of the diverging behavior, or with the lack of a common set of terms that new things can be discussed with. The recent discussion about subtyping, Never, and so on has shown how the lack of agreed-upon definitions can hurt even discussion of the system as it exists now, let alone how one might want to change it.

Yeah, I don’t have good answers for this in particular. Tooling will naturally want to use things that are supported by the language, this included. But the issue is magnified by the fact that there isn’t a good canonical typing reference right now, so users end up feeling like they are fighting the type system because of choices external to what can be accomplished here. It’s a factor, but one we can’t directly address here; we can only be mindful of how other decisions may change how much it impacts users.

For me personally, yes, this would be the largest disconnect. I view the lack of soundness as a place for divergent behavior between type-checking implementations to emerge, as an unnecessary lack of safety that typing could provide, and as something that makes the type system harder to teach and harder to soundly build upon, because it doesn’t follow rules people may be familiar with from theory or from sound type systems in other languages. The languages with type systems I would consider worth comparing are Julia, Elixir, Haskell, Rust, OCaml, and Kotlin. Julia in particular is an interesting comparison, as its types are entirely optional and can be added gradually, while being as rigorous as any of the others when provided. Elixir is in a similar position, but the parallels are harder to draw.

The lack of soundness in particular has led to a discussion internally at my work about either implementing our own type checker which does enforce soundness, or not using Python in a specific context. I don’t consider either of these options great; they’re largely a product of somewhat unneeded red tape, but the decision on the level of acceptable risk is outside my direct influence. Creating our own type checker without the broader community buying into soundness wouldn’t help; it would only create further divergence between type-checking implementations by adding yet another one that does things differently.

Quite bluntly, I don’t think something should claim to be a type checker if its primary purpose is not to check for soundness based on type information with what is currently known to be sound. Anything short of that is just another linter, albeit one that can leverage type information.

That touches on a philosophical difference that isn’t necessarily useful to what comes next. But I would hope that if someone has bought into the benefits of typing, they would see that a system that is self-consistent because it has chosen soundness requires less work to understand: there’s no need for special cases where “pragmatic” decisions interact with each other, so it’s also easier to teach, and it’s harder for meaningful divergence between type checkers to emerge.

I should also emphasize that I don’t think a sound type system requires everything to be typed, or that it makes adding type information any harder than it is today. I expect it to be easier, because the consistency of a sound, self-consistent system should make the “right” way easier to teach and document. I do acknowledge that changing existing tools to enforce soundness would force some projects to fix things that are not currently enforced, and that’s definitely something to keep in mind.

I would be very surprised if people leaving Python picked Rust for type safety as their first choice, unless they have other very specific needs as well. I would expect that most people starting from Python who want stronger benefits from typing as their primary goal would be better served by either Kotlin or Julia, as both have more similarities with their current code.


Kevin Millikin wrote a draft specification for the basic building blocks of the type system: Python Static Types I: Subtyping - Google Docs. Should we adopt it or perhaps find a productive way to iterate on it as a community?

Since pytype enforcing soundness has come up fairly often, I collected a list of writing on the topic that I felt was interesting. Scroll to the end of the page here: Pytype’s Type System | pytype

Pytype currently opts not to enforce soundness, but the linked articles have arguments both for and against.

I guess I should be clearer about how much soundness and consistency in the type system I value, since soundness can be seen as somewhat of a spectrum; maybe that helps in finding common ground: things the type system has language for should be checked by anything claiming to type check, and the system and the tools using it should be internally consistent.

The recent example with variance of mutable fields of subclasses violated both of those conditions. (Not checking something described by the type system, inconsistent with handling of variance on list vs sequence)
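To make that inconsistency concrete, here is a minimal sketch (with invented class names, not taken from the original discussion) of the hole that opens when a type checker permits a subclass to covariantly narrow a mutable attribute:

```python
from typing import Sequence


class Base:
    items: Sequence[int]


class Child(Base):
    items: list[int]  # narrowed in the subclass; checkers have allowed this


def clobber(b: Base) -> None:
    # Perfectly valid against Base's declaration: a tuple is a Sequence[int]
    b.items = (1, 2, 3)


c = Child()
c.items = [1, 2]
clobber(c)  # type-checks against Base, but silently replaces the list
try:
    c.items.append(3)  # checkers see list[int].append; the runtime disagrees
except AttributeError as exc:
    print(exc)  # 'tuple' object has no attribute 'append'
```

The same assignment through a `Sequence[int]`-typed reference would be rejected, which is exactly the list-vs-Sequence inconsistency being described.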

In my view, the type system doesn’t have to support all possible code. That would be great as a long-term goal, but it would require the type system to understand things that are dynamic in nature at runtime, and while we could describe the limits of the dynamism we expect to the type system, that seems far from a realistic goal at this point in time. So that’s not my goal here. My “ideal” is not that everything must be described and checked, but that everything that is described is checked.


I’d like to add one more concern to your already excellent list, @mikeshardmind: I recently started looking into how to leverage type annotations for runtime validation and conversion. I realize that there is a fair number of libraries out there, but many of them have limitations in expressivity or soundness that I don’t particularly care for. Even though typing’s documentation is reasonably good at summarizing what we can annotate with its exports, I could find no good documentation on how to reason about type annotations in code. It would also seem that common tasks such as binding type variables would benefit from a prominent library; think packaging, but for type checking.

@mikeshardmind, thanks for starting this discussion. I appreciate the thought that went into your post.

Python is a “big tent” programming language. It’s used by a diverse set of developers with different interests, needs, and priorities. The same is true of the Python static type system. The concerns of a library maintainer will be somewhat different from the needs of someone implementing a production web server, and that’s going to differ from the needs of a data scientist writing ML code in a notebook, etc. I think the Python community does a pretty good job balancing the needs of its diverse users. Wherever possible, new capabilities are added in a way that provides value to all Python users. In cases where a capability is targeted at a particular subset of users, attempts are made to avoid negatively impacting others who don’t care about that new feature. In cases where there are conflicts between the needs and priorities of subsets of users, compromises are usually worked out. Those compromises require people to come together and understand each other’s viewpoints and make a concerted effort to work out a solution that provides the best possible outcome for everyone. In other words, it happens through discussions like this one. So once again, thanks for starting this thread.

You said that “no mainstream Python type checker has a goal of overall type safety”. First, I think that the notions of “type safety” and “type soundness” are more ambiguous than you imply (more on this later). Second, I think that type safety is a goal of all of the mainstream Python type checkers. However, it’s not the only goal or the overarching goal. Today’s mainstream type checkers attempt to balance the goal of type safety against other goals like usability and simplicity. I will go so far as to say that a Python type checker that fails to strike this balance will never become “mainstream”.

One way that type checkers accommodate different sets of users and needs is to support different modes. Let’s keep that in mind because it could serve as a potential solution as we continue this discussion.

You mentioned that there’s no good way to learn how to handle typing in Python. Can you elaborate on that thought? Are you primarily thinking of Python developers who are just getting started with static typing? Or are you thinking more about users who already know the basics about type annotations but want to use more complex features like generics, overloads, protocols, ParamSpecs, etc.?

You’ve suggested that an official reference document might be useful. When you say “reference document”, are you thinking of a formal spec? I see value in creating a formal spec and I’m supportive of that effort, but I don’t think that most Python developers would find a formal spec very useful in learning the type system. Good user documentation and tutorials would better serve that need. There have been attempts to improve the user documentation for Python static typing, but progress has been slow due to limited time from volunteers. And writing good documentation is not easy.

I answer dozens of questions every week about static typing, so I have a pretty good sense for what’s tripping up users with the Python type system. The most common area of confusion, by far, is the notion of immutability, especially when it comes to the ubiquitous list and dict classes. Many users who are new to static typing are stymied when the type checker tells them that they cannot assign a “list[str]” to a “list[str | None]”. Interestingly, TypeScript avoids this issue because they chose to make arrays and maps covariant. This opens up a small hole in the type system, but it makes TypeScript much easier to learn than Python typing. That’s a good example of where the goal of “easy to learn” and “type safety” were in conflict. The TypeScript team favored “easy to learn” over “type safety” in this case, whereas the early designers of the Python type system went the other way.
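The invariance point can be shown in a few lines; `append_none` and `first` below are illustrative helpers, not from any checker's documentation:

```python
from __future__ import annotations

from typing import Sequence


def append_none(xs: list[str | None]) -> None:
    xs.append(None)


names: list[str] = ["alice", "bob"]

# A type checker rejects append_none(names): list is invariant in its
# element type. If the call were allowed, this is the runtime consequence:
append_none(names)  # type: ignore[arg-type]
assert None in names  # names no longer honors its list[str] declaration

# A read-only Sequence is covariant, so widening the element type is safe:
def first(xs: Sequence[str | None]) -> str | None:
    return xs[0] if xs else None

assert first(["x"]) == "x"
```

The invariance rule exists precisely to prevent the `None in names` outcome above; the tradeoff is the learnability cost described here.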

The second-most-common area of confusion I see is related to type variable scoping and variance. This was the motivation behind PEP 695. Here we have an example of the typing community coming together to devise a solution that eliminates a barrier to using the type system.
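For readers who haven't followed PEP 695, the before-and-after can be sketched with a hypothetical `Box` class (the 3.12 form is shown in comments so this snippet runs on older versions too):

```python
from typing import Generic, TypeVar

# Pre-PEP 695: the type variable is declared separately, its scope is
# implicit in where it happens to be used, and variance must be chosen
# by hand via covariant=/contravariant= flags.
T_co = TypeVar("T_co", covariant=True)


class Box(Generic[T_co]):
    def __init__(self, item: T_co) -> None:
        self._item = item

    def get(self) -> T_co:
        return self._item


# PEP 695 (Python 3.12+): scoping is explicit in the class header and
# variance is inferred by the type checker:
#
# class Box[T]:
#     def __init__(self, item: T) -> None:
#         self._item = item
#     def get(self) -> T:
#         return self._item

assert Box("hi").get() == "hi"
```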

You mentioned that different behaviors between type checkers make it harder to learn the type system. Most users tend to run only one type checker on their code, so this perhaps isn’t as big an issue as you think. All Python type checkers tend to have the same behavior for basic type checks (e.g. determining whether a value with a particular type is assignable to a target variable or parameter with a declared type). Differences in behavior tend to be in areas that are more advanced and more subtle, for example in type inference behaviors or in the heuristics used to determine the best solution for a type variable when calling a generic function.

I’ve written a document that explains the main behavioral differences between pyright and mypy and the reasons for these differences. This will give you a sense for some of the many decisions that type checker authors need to make and the design goals that lead to certain decisions over others.

I mentioned above that “type safety” and “type soundness” are probably not as black and white as you think. There are some aspects of the type system that are more clear-cut, such as “is type A consistent with type B?”. But there are thousands of other cases where the answer is much less clear. Spec’ing all of these behaviors is probably not viable.

Here are some examples. Let’s not debate any of these here. I want to just give you a sense for what I mean.

  • Should LSP rules be enforced for __init__ methods?
  • If you have a tuple with indeterminate length (x: tuple[int, ...]), should an unguarded index access (x[0]) generate an error?
  • If a class defines a property, should a subclass be allowed to override it with a simple variable or vice versa?
  • Should a type checker report type violations within a try block?
  • Should it be considered an error if an __init__ method fails to call super().__init__() in a way that allows the class to be used with multiple inheritance?
  • If a base class doesn’t declare the type of a class-scoped variable but a subclass does, should override type checks apply?
  • If a variable is assigned only within a for loop, should that variable be flagged as potentially unassigned if the length of the iterable is not statically known?
  • If a class definition uses a dynamic value (a variable) for one of its base classes, should that be flagged as an error?
  • If a class implements a custom __instancecheck__, is it safe to perform isinstance type narrowing for that class?
  • If a value is assigned to an undeclared attribute of a function, should that be flagged as a type error?
  • Is it safe to assume, for type narrowing purposes, that a value assigned via a property’s setter will be returned by its getter? Under what circumstances is it not?
  • If a call expression uses an unpack operator and the length of the iterable is unknown but the callable has a fixed number of parameters, should that generate an error?
  • Should conflicts between __slots__ and class variables be reported by a type checker?

Answering many of these questions comes down to some combination of:

  • What principles have been established that cover this case?
  • What precedent have other type checkers established for this case?
  • Given what’s known about the target user for this type checker (or this mode), what behavior would be expected or desired?

Many of these decisions come down to the tradeoff between false positives and false negatives. If you have a condition that is theoretically exploitable in an extremely uncommon coding pattern but is provably safe in all other cases, should that be flagged as an error? Put numerically, if a condition is 99.9999% likely to be fine and only 0.0001% likely to represent an actual bug, would you want to see all of those cases flagged as an error in your code? Most developers would not, and they’d be annoyed if they needed to work around this error every time. What if the condition is 99% likely to be OK? 95%? 50%? Different developers have different thresholds. And most will modify their answer for production code versus non-production code.

You suggested that we should codify the purpose of the type system. I think that’s worth discussing, but coming up with a single answer may be difficult. As I said above, the Python ecosystem is diverse, and different subsets of users have different needs and priorities. It may be necessary to answer the question differently based on which group of users we’re talking about. Here’s a proposed taxonomy: “library authors and maintainers”, “developers who do not use type annotations in their code or use them sparingly”, and “developers who use type annotations throughout their code”. I’m open to other suggestions, but this is roughly the taxonomy I use when thinking about features and modes in pyright.

If we’re going to attempt to codify the purpose of the type system, I suggest that we do so in terms of user value. An answer like “the purpose of a type system is to allow a type checker to enforce type safety” is hard to evaluate because it raises the questions “what exactly does that mean?” and “why should a user care?”. If we frame the answer in terms of user value, I think it will help us find common ground.

You mentioned autocompletion and IDEs. Let me try to clear up some apparent misperceptions. Tools like pylance (the default language server for VS Code, which is built on pyright), MPLS (which predates pylance), Jedi (another language server which is used in VS Code and other editors), and PyCharm all use multiple sources of information to implement completion suggestions, signature help, semantic coloring, semantic rename, semantic search, refactoring tools, etc. This information typically includes some combination of library introspection, doc strings, static code analysis (inference), and type annotations. Of these, type annotations provide the best experience by far. Doc strings alone provide a very poor experience because they are often missing or incorrect, there is no standard format for doc strings, class and instance variables don’t have doc strings, doc strings can’t be accessed statically if the library is not shipped as Python source, and doc strings cannot express generic types or overloaded signatures. The developer experience gets demonstrably better if a library includes type information for its public interface. Note that this doesn’t require the library author to provide type annotations for their internal implementation, just the portion of the library that comprises the public interface.

If you’re a developer who is not interested in language server features but you are interested in static type checking, then I think you’ll agree that a library with type information provides significant value over a library with no type information. If your code imports from untyped libraries and your type checker treats all of these symbols as Any, you are going to miss a lot of potential bugs. It’s a huge boon to users of static type checking when library maintainers add type information.

I’ve interacted with many library maintainers who are in the process of (or are considering) adding type annotations. Attitudes toward static typing vary greatly across this group. Some are resistant to adding type information. Others embrace it readily and see advantages for both themselves and the consumers of their libraries. Most are somewhere in the middle. They see some benefits, but they’re waiting for the tooling to mature, and they’re taking cues from their consumers about the value of type annotations relative to other features they could add to the library.

You have mentioned that you would like Python type checking to be stricter than it is today. You’re not alone in this view, but I think it’s safe to say that you’re also not representative of most users of Python type checkers today.

I think it would be useful for this discussion if we could develop a better understanding of your motivations and needs. To the extent that you’re able to share details, how do you and your employer use Python? What requirements or needs are driving your desire for more strictness? If Python type checkers provided more strictness, what value would it bring to you and your employer?

You’ve alluded to some “red tape”, so I’m guessing that’s referring to compliance requirements of some sort? Are you able to provide any more details? Do you see value here beyond compliance, or is that the primary motivation?

Are you a library maintainer? You’ve mentioned “social pressure” to add type information to libraries. Is this something you’ve personally felt?

Developing a better understanding of your needs, priorities, and motivations could help us come together and find a solution.


@erictraut thanks for the detailed reply. I’ll do my best to address specific points as I can, but if I don’t respond to something you’d like more info on, please point it out; I’d rather discuss as much as possible toward common ground on this. This is going to be a very long response, and if anyone skims it, I’m not going to hold it against them.

To be clear on this point in particular, I don’t think the existing documentation is enough for people at multiple stages of the learning process. It’s worst at the very beginning (think of a new typing user not getting why Sequence[str] allows str until they are bitten by it, right after changing to it because list[str] wasn’t quite right, or of users first trying to figure out very simple “what goes in goes out” TypeVars), and also near the edges of what either isn’t supported or is barely supported (consider complex cases with overloads, typevars, and generics all in play, or users who find themselves wanting HKTs). Overloads in particular are more complex in Python than in some other languages, because they are allowed to overlap and their ordering matters.
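The Sequence[str] pitfall fits in a few lines; `greet_all` is a made-up example:

```python
from __future__ import annotations

from typing import Sequence


def greet_all(names: Sequence[str]) -> list[str]:
    return [f"hello {n}" for n in names]


# As intended:
assert greet_all(["ann", "bob"]) == ["hello ann", "hello bob"]

# The surprise: str is itself a Sequence[str], so a bare string
# type-checks fine and gets iterated character by character.
assert greet_all("ann") == ["hello a", "hello n", "hello n"]
```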

In the “middle”, where most people end up if they stick with typing, I think the documentation and tooling serve reasonably well. There are still some ugly cases around variance that even those who work with it make mistakes reasoning about at times (guilty myself here), as well as generics with multiple parameters: their names are not useful enough on their own, and simply remembering what each parameter actually corresponds to is hard (anyone who’s had to type a Coroutine and/or Generator where Awaitable and/or Iterable were not enough has probably run into this one).

I’ve been in the position of wanting to support multiple type checkers, with the differences between them making that a somewhat unrealistic goal in the past. I have not fully reevaluated that recently, but your document still matches one of the larger issues, inference, and why I settled on pyright when it became infeasible to support multiple checkers (not impossible, but more effort than was reasonable; at the time it would have required either restructuring code or adding assertions for mypy to get certain parts correct).

I agree that both are a spectrum. I place a higher value at the correctness end of it (for reasons I’ll elaborate on somewhat below), but I’m also not rushing to go use Typed Racket, and while Elm is nice, I wouldn’t pick it for most problems.

I think it’s better to make a correct system easier to use than to make compromises in correctness because it is easier. Getting formal correctness right on the type system side is hard. We make it easier on ourselves if we stick to what is correct and provable; conversely, it is made harder by deviating on a case-by-case basis, which removes the ability, or at least makes it significantly more difficult, to rigorously show that a system remains self-consistent as exceptions are layered on.

When it comes to the end-user side of the equation, there’s a catchy phrase here in a related, but different domain: “Security should be idiomatic”… I feel much the same about correctness. Rather than compromising at a system level for what is actually correct because we think we may make it hard on users, we should be finding ways to make it easy for users to do the correct thing.

I don’t think that getting there (process and implementation-wise) is going to necessarily be easy, but I think it’s the best long-term goal when considering all possible benefits and a user-first perspective.

I think this makes the case for a level above “strict” as current type checkers use the term, but not that it should go unchecked. However, I understand that maintaining such a set of rules costs time. Part of why I think a full formal specification rooted in theory would be of more value here is that we can build on what already exists for verifying types in a system for which a machine proof could be made. We would then have a type system that, at its ideal, could check everything it describes, while various kinds of violations could have configurable levels based on views on pragmatic errors, how code interacts with other code (public exports), and tolerance for false positives/negatives. As for the specific examples, while I have no desire to debate them at this time, I have provided my perspective on them below for the sake of comparing perspectives.

Agreed full stop. While I think that the primary purpose of a type system should be to provide the capability to analyze the type information and that one of the strongest reasons for this is to detect unsound code from a structural standpoint, answering it in terms of user value should be important here.

I want “a self-consistent type system in which everything it can describe is checked”.
From this, I want a level of guarantee about program structure, and especially about library stability for those who would extend and build on my code. If I write something that is rigorously well-typed, my users should be able to safely extend and build upon it, but this isn’t fully the case right now. For instance, the use of type[T] is somewhat unsound if one relies only on type analysis for reassurance, even though this is something that should absolutely be checkable at the type-system level. (__init__ and a few other issues, one of which has recently been resolved in some, but not all, of the typing ecosystem.)
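A minimal sketch of the type[T] hole being referenced (class names invented; the exact behavior here has varied across checkers and versions):

```python
class Base:
    def __init__(self) -> None:
        self.x = 0


class Child(Base):
    # __init__ has traditionally been exempt from LSP override checks,
    # so this incompatible signature goes unflagged.
    def __init__(self, required: int) -> None:
        super().__init__()
        self.x = required


def make(cls: type[Base]) -> Base:
    return cls()  # assumed safe for anything assignable to type[Base]


assert make(Base).x == 0
try:
    make(Child)  # accepted statically (historically), fails at runtime
except TypeError as exc:
    print(exc)  # missing required positional argument
```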

I can provide some additional clarity here, but if I’ve left out something or something seems to not quite add up it’s likely that avoiding a detail I cannot disclose has introduced some vagueness that I did not spot while editing this. It’s very easy to have blindspots of what “should be clear” when removing information, and it’s unlikely I will be able to further clarify.

The current state of affairs has me looking, on a 2-3 year window, to either move 1-2 critical projects from Python to another language (after evaluating our available dev resources and the languages we have enough developers to support), or to use some of those resources to bring our tooling into compliance while continuing to use Python.

This is in response to increasing demands on robust code that have led to those writing policy to want all projects with certain risk thresholds associated with them to conform to certain ideals. I think the level of “type system enforced safety” actually specified and that applies here (I can’t share the actual documents, but I can and have argued for things which would help here) is slightly more than we need, but still reasonably possible to provide, even in the “big tent” of Python development. I have personally argued for more than I would need under these policies as I do believe that a consistent underlying system could provide for multiple audiences.

While compliance is a motivation, it is not the sole or even primary motivator. I have options available to me when it comes to compliance, and while it appears that switching away from Python could become necessary, and I find that somewhat distasteful on a personal level, professionally I’m being paid to make the best decisions given a set of criteria, and I will meet those as best I can with or without Python being a viable option.

I do see a value in the related ideals when it comes to making more robust code that is less subject to critical issues. I find this especially true when it comes to code that is internet-facing, which is increasingly the case as Python has gotten quite good at handling network-bound code with asyncio.

I can say that in a network-bound system, I’m weighing both Go and Elixir as alternatives at this point in time, even with Elixir’s type specifications being just as optional as Python’s. I’m also considering the costs of continuing to use Python with a slight restructuring of code, plus some internal rules that further restrict the allowed type specifications to what would be enough.

There’s another system that’s part of a data processing pipeline, but we need to handle malformed data as well as potentially maliciously crafted data (each by detection and rejection). This one I personally think we already had adequate protection layered into at a logical level, but the powers that be said “no” with recent re-evaluations. Julia (again, completely optional type specifications) looks like the best candidate if replacing here as much of the code would be nearly the same, and gain some performance in the process, but I have not done enough exploration yet.

There are various other bits of code predating these policy updates that were already not written in Python due to prior policies, and at least one place where Python was specifically chosen because it could be made so resilient for the given risk profile. So I’m aware of the various tradeoffs, and of the varying ability to mitigate issues and layer on tooling to check for different kinds of problems.

I don’t maintain any widely supported public Python code anymore. Any non-archived code I have available publicly I do maintain, but I get very little in the way of requests on these, and they are fully typed code bases that I’ve already gone through the effort on.

I maintain several Python libraries internal to my company, and I used to be a maintainer on a public, widely used framework. I experienced the pressure to add type information from two separate camps (those who wanted to start leveraging it for analysis, and those who wanted better IDE completions) around the time of Python 3.6, but I personally would have wanted to add the type information anyway for those same reasons. I have witnessed a lot of negativity toward typing, less so recently, but there have been resurgences of it, as well as people being reminded of how intrusive it can feel in some cases, such as in the conversation about typing.Doc.

Now, I’m going back to catch a few things that I did not think flowed well in the same order of response…

This largely needs to be addressed with the tooling maintainers, not here, but...

Without quoting the whole section or getting too far into the weeds: there are some very clear negatives that have come from it as well. As these are not actionable here, I may later collect a list of things I’ve seen room for improvement on in pylance and take it to the pylance project. This comment was included in the original post to show how things that build on typing can create a negative feeling around typing: tools can feel dumber when something has been replaced by the new thing that should improve the situation, but the behavior has gotten worse for how people actually use it (a mismatch in priorities, or at least in outcomes, as it were).

The big one of note in pylance is that “jump to definition” at some point switched to preferring jumping to the typeshed or other available stub over the actual definition, and there’s no corresponding “jump to implementation” to match that behavioral change. pylance also does not appear to use docstrings or the AST at all, even when typing does not exist, so while you say it typically uses a combination of information, this does not match my experience as a user. Perhaps there’s a configuration I’m missing to make this happen, but the fact that it isn’t clear to me and I’m quite invested in typing doesn’t leave me much hope that the average user experience on this is better.

While not debating the merits or tradeoffs of these, or fully explaining them, I’m going to give my ideal answer for each of your examples, since you’re interested in knowing where my needs are and this is something I can do. I’ve left these in a collapsible section. If you’d like to discuss any of them further, I’m willing; there may or may not be a productive conversation possible where we differ at this point in time, and if there isn’t, I hope there can be once we get further into the process. Either way, I think that for some of these to be checked, it also needs to be easier for users to see an error and intuitively understand what it is actually asking them to change.

Answers to your examples
  • Should LSP rules be enforced for __init__ methods?

Yes. If you need alternative construction, use a classmethod. If you know alternative construction may be needed in a subclass, use *args and **kwargs appropriately to support this in parent classes. In the “ideal” system, this would always be detected, warned about at the strict level by default, and elevated to basic if the type is publicly exported, due to type[T].
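As a sketch of the classmethod pattern being recommended (Config and from_url are invented names):

```python
from __future__ import annotations


class Config:
    def __init__(self, host: str, port: int) -> None:
        self.host = host
        self.port = port

    @classmethod
    def from_url(cls, url: str) -> Config:
        # Alternative construction lives here, so __init__ keeps a single
        # signature that subclasses (and code using type[Config]) can
        # rely on.
        host, _, port = url.partition(":")
        return cls(host, int(port or 80))


c = Config.from_url("example.com:8080")
assert (c.host, c.port) == ("example.com", 8080)
assert Config.from_url("example.com").port == 80
```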

  • If you have a tuple with indeterminate length (x: tuple[int, ...]), should an unguarded index access (x[0]) generate an error?

This would not be a type error, but could reasonably be warned for by leveraging the type information. This may fall under a deficiency in the typability of iterables that a later answer says more about.
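The kind of guard such a warning would nudge people toward, as a minimal sketch (function name is made up):

```python
def first_or_default(xs: "tuple[int, ...]", default: int = 0) -> int:
    # A length guard narrows away the empty case before indexing, so
    # the access can no longer fail with IndexError at runtime.
    if len(xs) >= 1:
        return xs[0]
    return default
```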

  • If a class defines a property, should a subclass be allowed to override it with a simple variable or vice versa?

No, the semantics of this aren’t guaranteed to hold in all uses of replacement. (consider uses of type[T] rather than only T for a given type T, not necessarily a typevar T)
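A small runnable illustration of the divergence (names are illustrative): looked up on the class object itself, the attribute is a property object on the base but a plain value on the subclass, so code generic over type[T] observes different semantics.

```python
class Base:
    @property
    def value(self) -> int:
        return 42


class Sub(Base):
    # A type checker should reject this override: Base.value is a
    # property descriptor, Sub.value is a plain int, so anything
    # operating on type[Base] rather than Base sees two different things.
    value = 42


assert isinstance(Base.value, property)
assert isinstance(Sub.value, int)
```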

  • Should a type checker report type violations within a try block?

Yes. For tests, testing frameworks provide more appropriate methods, and the runtime case of receiving an unexpected type can be handled with typing.assert_never. That gives you a way to handle code the type checker should assume is unreachable, but which misuse could still reach and which a user may still need to handle: for example, keeping the branch for a deprecated and later-removed case still provides a meaningful error, while having no runtime impact in the “good” case.
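A minimal sketch of the assert_never pattern (the Mode enum is made up; typing.assert_never is available from Python 3.11, with typing_extensions providing it for earlier versions — the fallback below is only so the snippet runs anywhere):

```python
from enum import Enum

try:
    from typing import assert_never  # Python 3.11+
except ImportError:  # minimal runtime fallback for older interpreters
    from typing import NoReturn

    def assert_never(value: object) -> NoReturn:
        raise AssertionError(f"Expected code to be unreachable, got: {value!r}")


class Mode(Enum):
    READ = "read"
    WRITE = "write"


def describe(mode: Mode) -> str:
    if mode is Mode.READ:
        return "read-only"
    if mode is Mode.WRITE:
        return "read-write"
    # Statically unreachable: if a new Mode member is added later, the
    # type checker flags this call; at runtime it raises a clear error.
    assert_never(mode)
```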

  • Should it be considered an error if an __init__ method fails to call super().__init__() in a way that allows the class to be used with multiple inheritance?

Yes, but this isn’t a type error and is likely a code smell for complex inheritance that should be simplified or better structured.

  • If a base class doesn’t declare the type of a class-scoped variable but a subclass does, should override type checks apply?

Yes. Class vars should be treated with more scrutiny than most things. Their mutability and nature can very easily lead to bugs or inadvertent behavior.

  • If a variable is assigned only within a for loop, should that variable be flagged as potentially unassigned if the length of the iterable is not statically known?

Yes, and I’d say so even under “basic” type-checking rules. Unbound locals due to loops can go undetected for a while if 99.9% of the time you handle some amount of data, and then one day you get an empty job for whatever reason.
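The empty-job failure mode, sketched (function name is made up); binding the variable before the loop is the fix such a diagnostic would point you at:

```python
from typing import List, Optional


def last_processed(rows: List[dict]) -> Optional[dict]:
    processed: Optional[dict] = None  # bound before the loop
    for row in rows:
        processed = row
    # Without the pre-loop binding, an empty `rows` would leave
    # `processed` unbound and the return would raise UnboundLocalError.
    return processed
```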

  • If a class definition uses a dynamic value (a variable) for one of its base classes, should that be flagged as an error?

Under the current type system? Yes. I believe it would be possible to improve the type system to a point where some level of knowledge could still be properly retained in this case, but, at least given the decision to allow subclassing Any, this should currently require an explicit cast to Any or to a specific type/protocol to be valid.

  • If a class implements an __instancecheck__ hook, is it safe to perform isinstance type narrowing for that class?

No, but such a hook is an explicit escape hatch, so given that it’s being used, tooling should defer to it. Tooling might still provide a configuration option to disallow it anyway, to prevent users from being caught unaware by a library that uses it.

  • If a value is assigned to an undeclared attribute of a function, should that be flagged as a type error?

Currently, yes, unless the declared type is actually a protocol that implements both __call__ and that attribute. If intersections are accepted, it should become possible to type these expectations more cleanly, but it should still require the user to silence the dynamic behavior where it’s first used. (For the record, that specific protocol solution exists in real-world, public code.)
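A sketch of the protocol approach (all names here are illustrative): a Protocol that declares both __call__ and the extra attribute describes the function-with-attributes shape, and the dynamic assignment is silenced once with a cast.

```python
from typing import Protocol, cast


class CachedFunc(Protocol):
    """A callable that also carries a `cache` attribute."""

    cache: dict

    def __call__(self, x: int) -> int: ...


def make_squarer() -> CachedFunc:
    def square(x: int) -> int:
        if x not in square.cache:
            square.cache[x] = x * x
        return square.cache[x]

    # Setting an attribute on a function object: a checker flags this
    # unless the dynamic behavior is explicitly silenced, e.g. by
    # casting to the protocol type at the point of use.
    square.cache = {}
    return cast(CachedFunc, square)
```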

  • Is it safe to assume, for type narrowing purposes, that a value assigned via a property’s setter will be returned by its getter? Under what circumstances is it not?

Sometimes. It isn’t safe in the presence of a distinct getter, in which case the getter’s type is what should be used, but I think it could be reasonable to restrict the setter and getter of the same property to the same types, as a complexity tradeoff in favor of tooling.

  • If a call expression uses an unpack operator and the length of the iterable is unknown but the callable has a fixed number of parameters, should that generate an error?

This is the only one here where I am of two minds, and it’s largely due to a deficiency in the typing of iterables: tuple is the only iterable allowed to be heterogeneous and sized with per-element type information preserved. I think it would currently be a reasonable decision to warn for this, but at strict, or at some hypothetical “pedantic” level above strict.

  • Should conflicts between __slots__ and class variables be reported by a type checker?



Not wanting to derail this discussion, but can you explain where the “small hole” lies here?

If we consider TypeScript’s behavior translated into Python’s terms: because other references to that list could exist and lists are mutable, code expecting list[str | None] could take something known elsewhere as list[str] and append None to it. Now that other context holds what it believes is a list[str], but with a None at the end. Python type checkers avoid this one, and I’m glad for it.
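A runnable sketch of the hazard. It executes fine precisely because Python doesn’t enforce types at runtime; the point is that Python type checkers reject the `append_none` call, because list is invariant in its element type.

```python
from typing import List, Optional


def append_none(xs: "List[Optional[str]]") -> None:
    xs.append(None)


strings: list = ["a", "b"]  # think of this as a list[str]
# If list[str] were assignable to list[str | None], this call would
# type-check, and then code still holding `strings` as a list[str]
# would observe a None it was promised could not be there:
append_none(strings)
assert strings == ["a", "b", None]
```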


Except, for those of us who use Visual Studio Code with the Pylance language server and mypy as a separate tool (like me, for most projects), there really are two type checkers providing feedback. As you describe further down in your thoughtful answer, Pyright’s output is surfaced in a very different way than mypy’s. But there are two of them, and I have seen them diverge before (which prompted me to learn about Pylance and Pyright).


“Who is this for?”


  • Typing annotations provide the most value to long-lived Python programs that are expected to become large and/or be maintained by many individuals.
  • Type annotations prevent some of the most common classes of programmatic errors:
    • Misspelled/broken references to functions/methods and variables.
    • Failure to handle potential Nones.
    • Missing imports (especially of exceptions).
  • Type annotations provide reliable machine-verified documentation that cannot get out of date.

Long-term direction for typing


  • The type system should be extended over time to describe & characterize common patterns used in real-world Python code.
    • That is, I propose the same kind of direction that TypeScript appears to have in characterizing patterns seen in real-world JavaScript code, even if some of those patterns are messy.

Yes. There is a longstanding issue to write typechecker-agnostic HOWTO documentation.

As far as I can tell, there hasn’t been enough energy for anyone to step up and do this yet. I know I myself have not yet found the bandwidth to write the section on TypedDict that I promised a while back. (I have not forgotten.)

Indeed I expect disagreement here: I do not personally think that a lack of (perfect) soundness disqualifies a type checker as being useful, as I have argued before:

Also, as Eric mentioned, the definition of “sound” is not exactly black & white.


I’ve been watching this and similar threads scroll by, not really able to engage with the discussions in any meaningful way, but still finding my concern growing with each new thread. I will add my 2¢ from the peanut gallery though.

As someone who spent most of his career (now retired) writing Python code which was only dynamically—but strongly—typed, I view type annotations with some distaste. I’ve experimented with them in small ways just to get a feel for the basics, but in the end, have turned away from them. They definitely feel grafted on to me. When I read some questions people post about them I think to myself, “What in the heck are they trying to do? Is this complexity really necessary?” As a recent reply in this thread indicated, type annotations will be most beneficial on large, long-lived, multi-person projects. I suspect, though, that there are plenty of instances where they buy the programmer little or nothing that wasn’t already present in less conspicuous, concrete ways.

My understanding was that type annotations were always supposed to be optional. Technically, I suppose they still are. However, as tooling shifts to demand more and more metadata be specified through annotations or open source projects require type annotations in all submissions, I fear it has become optional in name only. I think at some point a fence will need to be built around type annotations to prevent them from infecting the entire ecosystem.

Fredrik Lundh (RIP) once commented that for him, version 1.5.2 was the perfect version of the language. Everything which came after that was just so much extra baggage. Maybe he was kidding, but I kind of doubt it. I’m sure I’m misremembering his statement to some degree, but I still think the general sentiment holds. Python used to be this almost tiny language a person could learn in a couple hours from the tutorial, then hold in their head without any further effort or reference, the perfect first programming language. The complexity was in the rich set of libraries which could be absorbed piecemeal, as one’s needs evolved. Such seems to be no longer the case.

Maybe, like JavaScript, someone will eventually feel obligated to write a book entitled Python: The Good Parts. If so, I hope they leave out static type annotations.


I just want to mention another use case here, important to me personally. I design programs starting with types. Whenever possible I use a language’s type system to model the problem domain first. This explores the entities involved, attributes, relationships, cardinality and so on. I’ve found this to be a very productive way of designing programs of any size, as long as the type system is sufficiently expressive. Although Python doesn’t have algebraic sum types, I find I can get a long way through a design with dataclasses, enums, unions, and the occasional shallow class hierarchy.

After I’ve created some data model, functions and methods define the interactions and transformations between types. So for me, in addition to all the other use cases mentioned, type annotations serve as a design system.

It’s just my personal preference, and perhaps not that common of a use case, but I feel I don’t see this mentioned much in discussions about the purpose of type annotations or specific typing features.


We don’t have a fence, but we do have a shed: GitHub - python/typeshed: Collection of library stubs for Python, with static types
If a library doesn’t wish to add type annotations, but is facing pressure from users to add them, a reasonable compromise is that anyone can contribute stubs containing annotations to typeshed where they can live separately.

Over the years, I’ve read several posts in this vein, and it’s always unclear to me how much they’re speaking from abstract fear vs concrete experience. If it’s the latter, I’d love to have a more constructive conversation about those experiences.


A decade+ old (but still very actively developed) million-line codebase I help maintain had some contributors eager to typify it a few years ago. I and other maintainers were skeptical, but were won over by claims that it would make things more maintainable and newcomer-friendly, so we agreed to the experiment.

For typing to be at all useful, large swaths of the codebase had to be refactored. The resulting changesets were massive, perpetually merge-conflicting with the branch state every time something new merged, and causing many other pending changes to merge-conflict as soon as a new chunk of typing changes landed. Before any significant progress was made, behavior changes in mypy meant that the work which had been merged needed to be redone.

The community around the software in question is only relatively tolerant of linters on a good day, with an overall view that having working code is more important than making sure every last thing about how it’s written is consistent. Needless to say, a constantly shifting target for what typing should mean resulted in maintainers feeling they were spending more time reviewing and dealing with no-op changes to the code than doing productive development.

The contributors who were attempting the Sisyphean task of typing the entire codebase finally got bored, as people chasing whatever the latest new-shiny tend to, and the experiment ended incomplete. The project declared it a failure, and is in the process of “cleaning up” any typing-related content whenever a file with some still in it gets new edits. We determined that if people come along in the future trying again to convince us the code should be typed, we’ll recommend they put any typing they want to apply in separate files and update it themselves whenever the software changes, so we won’t have to deal with it.

I’m sure typing is useful for new projects without a lot of established code, but trying to retroactively apply it to a large project that wasn’t written with the expectation of being typed seems to be a major pain-point. At least, that’s my experience.


I generally agree with these comments. There are lots of independent threads that seem to generally be talking about the same thing, and they are moving fairly quickly. It makes it hard to keep track of all the comments and what the actual focus of each thread is.

There is no doubt that when type annotations were added, they were optional. And from the language’s perspective, they are not required for code to run. However, since then, standard library features have been added that use and support them (e.g. dataclasses and singledispatch). Additionally, some libraries have started using them as a way to control the runtime behavior of the libraries. There has been growing momentum in the larger ecosystem that type annotations are beneficial and should be further improved [1].

This is where we diverge. Maybe this is because we have had different career paths. I started around when 3.1 was released, but was using 2.7 [2]. Since then, I’ve used a variety of languages, both dynamic (JavaScript and Lua) and static (Java and C++). But Python has been the majority of my career.

In that time, the vast majority of the code I’ve written in dynamic languages has been dynamic in name only. The times I’ve really taken advantage of the awesome amount of flexibility python gives you, it has either been:

  • hidden away underneath an abstraction layer
  • a less formal form of coding to an “interface”

Type annotations allows me to:

  • document the expectations of what my code expects to be passed and will return
  • more easily understand the libraries that my code uses.
  • have a tool that reminds me when I write code that doesn’t conform to the contracts.
    • With the extra benefit of there being one less thing to think about when reviewing a PR.

The idea that documenting the type of something is useful goes back before typing was introduced. For example, Sphinx supported rtype before the 1.0 release (over 13 years ago and 5 years before typing was added).

I do think we are at an important time for typing. Having multiple type checkers that don’t always agree is not great. Having mypy still not fully support LiteralString is not great. Decisions like the one about TypeGuard are complicated because there is no one clear answer to “is it better to break compatibility because time has shown the initial definition isn’t the best, or to say ‘I know you found TypeGuard first, but what you actually want to use is FooTypeGuard’?”.

My current feeling is typing does need to move towards being less fast and loose. I know first hand that there are certain things that python allows, and people wouldn’t consider bad code, which are hard/not possible to express with the current state of typing. But some of these things have complicated edge cases, so it is important to be deliberate about what will and won’t be supported.

  1. Though I haven’t seen anything to say if this is a majority or a vocal minority. ↩︎

  2. because that is what Ubuntu provided ↩︎


I believe Python is still the perfect first programming language, but feel very sympathetic to Skip’s unease. Initially when type hints were first supported I was pretty enthusiastic about them, since in larger projects, or when working with several projects, I found they can make code a lot more readable. So, I don’t dispute their usefulness. (More than that - in large projects, where you also want to rapidly iterate, they are probably already indispensable. If you agree with that, then I think it is not a far stretch to imagine that someone in the not-too-distant future will propose to make them non-optional.)

The actual, gentle (insidious? :slight_smile:) effect that using type-hints has had on me is that it has made me fall in love again with another language – a strongly typed language, where type info is not just something that has meaning during static type-checking but has meaning also at runtime: Rust.
I still love Python - but use it now mainly for fast prototyping.


Yeah, that sounds terrible. The #1 piece of advice in mypy’s documentation on how to type an existing codebase is to a) start small, b) get mypy running before adding annotations: Using mypy with an existing codebase - mypy 1.7.1 documentation. Ideally your would-be contributors would have read this.

The #2 piece of advice in that section is to pin mypy and other dependencies so that everyone runs the exact same analysis everywhere. That said, if you have examples of major behaviour change you think would be instructive that you can link me to, I’d be interested. In the last two years, I introduced GitHub - hauntsaninja/mypy_primer: Run mypy and pyright over millions of lines of code which has helped typeshed and mypy become acutely aware of their disruption budget and has helped eliminate major unintentional behaviour changes.

That said, if a project doesn’t have the bandwidth to review changes or isn’t tolerant to linting, adopting statically checked types is unlikely to be a good fit. I’m glad you ended up in a place that’s comfortable for you and your co-maintainers, thanks for sharing!


Sorry, I should have elaborated on my fence idea. I think we already have a suitable way to document parameters in a function’s docstring. (Something understood by Sphinx? I no longer recall.) My understanding is that Visual Studio no longer uses that info, but relies only on type annotations when prompting for completions.

Maybe I misunderstood what I read. I would like it if someone could explain precisely what’s changed w.r.t. VS. Given the source, it’s likely to be a very popular IDE, so if the VS authors have adopted one way of gathering parameter type information and now ignore the former scheme, users of that older scheme are stuck in two ways as I see it:

  1. The authors need to potentially document their parameter details twice.
  2. Despite perhaps not being interested in type annotations, they are more-or-less forced into it if their code base is to play nice with VS.

Happy to find out I’m way off-base. Even if I am, I worry that popular tools effectively make type annotations non-optional.


Personally I find all but the simplest type hints rather verbose, and as a result generally harmful to readability. And of course, you usually don’t want simple type hints, because simple type hints are often over-restrictive (think list[int] when what you actually want is something like Sequence[IntegralButItDoesntNeedToSupportRationalAttributesLikeDenominator]).

And in reality, I’ve never (as far as I’m aware) found an actual bug as a result of adding type hints - which maybe means I’m just lucky (or careful), but does call into question the value of those type hints.

I’d sort of like to be more enthusiastic about type hints, but honestly, the cost/benefit ratio is pretty poor, in my experience.

However, my experience is basically for projects that are either small, have very few maintainers, or both. The trade-offs will be different for larger projects. But that’s not something that gets taken into account - helpful contributors still want to submit PRs adding type annotations, or create issues requesting that I add them. And saying “no” is emotionally draining, because it feels like you have to justify not wanting type annotations, because “obviously” they are a good thing…

One thing that would be useful is if it were easier for projects which do use typing to integrate with libraries that don’t. Maybe that’s typeshed, but I imagine there’s a limit to which projects get added to typeshed. So maybe having a way for project A to define its own stubs for a library it uses would be useful (maybe such a thing already exists?). If that were a more accepted route, maybe the social pressure on everyone to add type annotations would be lower?
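For what it’s worth, this does already exist in a basic form: both mypy (via the `mypy_path` config option or the `MYPYPATH` environment variable) and Pyright (via `stubPath`) will pick up `.pyi` stub files from a project-local directory, so project A can annotate just the parts of an untyped dependency it actually uses. A minimal sketch, where `legacylib` is a made-up package name:

```
# stubs/legacylib/__init__.pyi -- a project-local stub for a
# hypothetical untyped dependency; declarations only, no bodies.
def parse(text: str) -> dict[str, str]: ...

class Client:
    def fetch(self, url: str) -> bytes: ...
```

With `mypy_path = stubs` in the mypy configuration (or the equivalent `stubPath` setting for Pyright), the checker uses these declarations while the library itself stays annotation-free.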