A more useful and less divisive future for typing?

How could I forget that :scream: I guess the heat of discussion… :slight_smile:
(Also the operators are actually different, of course, it’s just that they share the same representation.)
No matter, certain other examples could be given… I’ve updated my post to use lists.

I agree with this, and it’s kind of where I was heading with my earlier comments. @mikeshardmind said:

If that is the case, I don’t see why anything needs to be changed to support this new type theory. It could be implemented now. And if it is, and is more useful than the existing typecheckers, then great. But if it’s not, then the changes to the standard will just be retroactively perceived as misguided. Either way, I don’t see how adding it to the standard before seeing how it’s useful would help matters.

I think part of people asking for type-hinting here would also be solved in a future where this is inferred, because on the user end, their tools would no longer yell at them, they are using it in a way that works according to the detected duck typing. (Ideally, tools also wouldn’t show a generated protocol directly, they’d show “objects that support: %, +, /” (etc). I think this would be reasonably possible for tooling to do for protocols that only describe a builtin under or intersections of them in the future.)

I think this has been an interesting line of discussion and I need to type up some more thoughts into an ideas post, but it highlights how while typing has mechanisms to support duck typing, they aren’t ergonomic and require more.

In your case of not actually wanting to support the string formatting case, but overall wanting duck typing, there’s not even an answer to this currently, you’d need Intersections and type negation to do it properly:

SupportsMod & ~str, where SupportsMod is a protocol for % support

This is a solution that composes naturally from what you described wanting, but isn’t expressible currently, nor is it necessarily obvious to those who haven’t been in the depths of typing already.

Quite frankly, nobody should be doing that much work if there’s an opportunity that a different definition would be adopted while they are working on it that undermines that amount of work. There are benefits to having good definitions outside of such an endeavor, and it is to everyone’s benefit if we build things up in order. That means the foundational definitions should be addressed first.

I you set up a plan for that, or break down the task, I imagine there might be people here who’d volunteer to take on parts of that - even though it’s not coding :slight_smile: Trying to come up with just a formal description, and do this within an established larger theoretical framework, and consider both the internal consistency and how this might be used, seems already a pretty big undertaking to me.

It’s being discussed in various places already and seems to be something which won’t happen until this or as part of it The lack of formal definitions has been a pain point for other things already.

This was intended as a point to how better definitions can enable tooling to take on more for users and don’t necessarily conflict with making it easier for users under the general sentiment of the original post that we should be working towards a future for typing that enables and considers all audiences and figure out how we can best do that.

The tangential bit on smarter inference based on results from set-theoretic typing has a way of working already (it’s a solved problem in academia) that has a reasonable runtime for any non-infinitely recursive types or program structures. I imagine this would lead to an implementation that if it got to a certain recursion depth, it told the user why it gave up on detecting the type of that variable and which variable it needs more information about. Outside of the actual correct type of json/toml etc., I can’t think of much that’s actually infinitely recursive in that manner, and each of these is built up of a few base types that providing would be enough, and that it would be detectable what the type checker needs more info about (unless we also add Higher Kinded types, then we’re in unproven territory again).

I’ll break that tangent out into another post though, as even though it could help with typing being more accessible and less intrusive while getting more accurate (hence bringing it up here), it’s a “big idea” even if it’s conceptually simple in why it would be useful.

2 Likes

I’d be interested in reading up on that. Just throwing out some links to the main literature (in a separate thread) would already be welcome (to me at least).

1 Like

Well, okay, but now I have even less of an understanding of what “a set-theoretic type system” is supposed to be. Apart from you nobody has even responded to me (even though there have been 15 or so new messages in the thread) – is there really nobody who can catch me up? @mikeshardmind could you help me out?

1 Like

@guido Sorry, I missed that in things to reply to. Set-theoretic type theory is a view of typing where most types and specifications of compatible types are treated as sets of types. This leads to subtyping being as simple as if the type or set of types is contained in another set of types. (subset comparisons) There’s been a lot of academic research and published proofs in this domain that show it’s an exceptionally good model for gradual typing, supporting everything python’s type system currently does and more. Because of the “And more” part, it would also give us a good theory to view when considering new features of the type system and how they can or cannot be safely composed. To that end, the point was by being more formal, we gain the ability to do things with the type system that are grounded in what is already proven, and we could more easily shift more of the work to tooling. (pytype as you pointed out does already do more inference in this direction, but there’s much it cannot do safely)

Good writeups of how the set-theoretic view of types translates to python came up with some of the discussion on intersections as well as in a discussion of if Never is actually consistent with other types, or only the absence of a type.

The main question that would come up if we were adopting it is in the linked bit on consistency, but to summarize:

Never is equivalent to the empty set of types in this model, and many operations reduce to simple set operations. This leaves a small question on if it should be considered a subtype or if the broader definition of subtyping in set-theoretic typing is only useful for type narrowing in this case (narrowing to no possible value, and therefore indicating some sort of error) as the bottom type is uninhabited by types that can exist at runtime.

edit: Every other existing decision is clearly compatible with it, even if needing a clarifying detail in definitions.An example of such a clarification is subclassing Any. Any in that context cannot be the top type, but can have a clearly specified meaning which matches the current meaning.

1 Like

This actually sounds like exactly what we used as the foundation for subtyping in PEP 483. That PEP even considers intersection types (as an optional future extension), and it defines Any properly. It doesn’t define Never or Not, but I think those do fit in the framework laid out there. (Though there are practical issues with Not.)

At this point I am most interested in understanding which pressing questions about the current type system you think can be answered by using the set-theoretic framework that otherwise seem problematic (especially as surfaced through differences between type checkers). Second, if it can provide guidance towards resolution of the discussion around some proposed developments like Intersection.

Is the set-theoretic perspective affected by generics? Does it matter that we have some nominal types (regular subclassing) and some structural types (TypedDict, Protocol)? Does it help guide us through the minefield of higher-kinded-types?

It provides a better definition for unions than we currently have, unions currently have some type checker discrepancy that was brought up at the most recent typing summit.

It plays nicely with structural subtyping, nominal subtyping, generics, and typing by value (Literals) because the subtyping relation includes them.
It also works properly for refinement types, which Literals could be considered a subset of, are not fully supported by python (ie. u16 {x int | where 0 <= x <= 65535})

It provides clear definitions, including proofs with machine verification that can be translated to python types and which also include support for Intersections and type Negation (Not), and show us where it is provably safe and meaningful to apply/compose type negation in a gradually typed type system.

As all of this has been proven academically and there are published formal proofs, I have no qualms in suggesting set-theoretic typing as a strong basis for future work and it is my belief that it can help bridge the gap between various Python audiences in at least 2 ways:

  1. Being able to think about types as sets gives people another thing they might already know something about that gives them another way for it to “click”.
  2. Stronger definitions let tools be as correct more quickly and be able to do more of the heavy work for users, confident they are doing the right thing by leaning on what’s proven.

While higher kinded types can be handled by set-theoretic typing, I find myself agreeing with the developers of Roc lang (opinionated) that adding them both isn’t necessary and may lead to more conflict than useful development.

Additionally, adding higher kinded types comes with a downside mentioned in roc-lang FAQ that I’ll quote directly:

It’s been proven that any type system which supports either higher-kinded polymorphism or arbitrary-rank types cannot have decidable principal type inference. With either of those features in the language, there will be situations where the compiler would be unable to infer a type—and you’d have to write a type annotation. This also means there would be situations where the editor would not be able to reliably tell you the type of part of your program, unlike today where it can accurately tell you the type of anything, even if you have no type annotations in your entire code base.

While they enable things people may associate with type systems due to Haskell, (well-typed function currying, well-typed monads) other languages that embrace very detailed type systems like Ocaml are finding more use from algebraic effects, and I think it might be a good idea for python to wait to see how it settles in the languages that are pioneering here before dipping towards one or the other, and they need to be balanced against the loss of automatic decidable type inference being possible.

If you understood where I’m coming from you wouldn’t rely on the academic origins of these ideas to convince people to trust them. :slight_smile:

Skimming the discussion you linked about unions it seems that on the one hand, there’s clearly something wrong with how it’s formulated in PEP 483 (and we should try to come up with a PR that fixes it), on the other hand it feels like a bunch of people yelling at each other “you’re wrong” – “no, you are wrong, you misunderstand me” – "no, you misunderstand me. :frowning:

Going back to the subject line of this thread, it feels like whatever the benefits of a set-theoretic type system, “less divisive” does not appear particularly likely at this point. :frowning:

I’m happy to let it rest – we should probably focus on smaller problems where we can get agreement. Gradually improving the specs sounds like it will be useful, but it’s not the only thing. I’ll see you in the discussion about the Typing Council (eventually, I have other things to catch up on).

9 Likes

I feel like much of that devolved because we were coming at it from both differing perspectives and differing understandings without an agreed-upon definition. There were many points of agreement throughout that, but point taken there.

Sure thing. I somewhat disagree on the focus, but only to an extent based on my belief of clearing up definitions from the bottom up. I think we’ll reach a point of reconciling everything over in that process one way or another with time.

1 Like

Nearly all the type hints I use are relatively simple. FooClass, FooTypedDict, Optional[something], str, int. And it hasn’t been a problem. I’d argue that simple type hints are pretty useful.

Do you ever use Optional[something]? I’ve found lots of code that wasn’t considering a possible None case. “Null pointer exceptions” generally just don’t happen to me in well-typed code.

Oof. Such projects are generally not a good fit for typing IMHO.

Interesting. I know the TypeScript community often has foopackage libraries paired with a types-foopackage that is sometimes maintained by independent people. Looks like there are some similar patterns with libraries like types-boto3 (which has types for the popular boto3 library).

3 Likes

It probably depends on whether you’re writing libraries or applications, and on whether you want to support duck typing or not.

I don’t recall finding anything like that. But I can’t say that I’m sure, it’s equally likely that I thought of a missing None check as a “typo” or something equally shallow, and didn’t really consider it deeply.

You’re right, though. Simple cases like this are useful. But again, keeping things this simple, without abandoning duck typing, seems difficult if not impossible. So you can sort of have one or the other, but not both.

Fair. But therefore there’s a lot of code (even published code) out there that should be left untyped, and hence tools giving a degraded experience on untyped code[1] is not ideal.

But having said that, you are saying that typing “isn’t a good fit” for a personal project (small and only one maintainer). Is that really what you meant?


  1. which seems to be what people are suggesting is happening ↩︎

1 Like

I think there’s a minimum size where I’d start to consider it, but I still find it useful for personal projects for a couple reasons. The main one is that it’s a concise way of documenting functions and methods for future-me. This feeds into IDE integration, which will give me a nudge when I’m probably making a silly mistake. And finally, when I take the time to write type hints it can help me think about better design for my code. When the signature starts to get really complicated it’s a sign to me that I should be breaking things apart.

YMMV of course but for me, type hints are useful even just for myself. But a lot of this is dependent on a person’s personal workflow.

5 Likes

Strong typing makes a programming language viable for long-term work. Coming from a (staticall,strongl)y typed language to Python without types feels like driving on a six lane highway with no lane markers. So we very much appreciate the current typing effort. A type enforcer built into the interpreter activated via a command line switch or an envvar would be superb.

1 Like

Yes. If a particular person writing Python is not otherwise that interested/familar in typing, I think that it’s not especially useful for them to attempt to learn/use typing for a small personal project. However if the project later starts to become large or popular then I think it would be worth retrofitting typing in later to aid in maintainability.

I’m very comfortable with Python typing myself - as I use it at work regularly - so I tend to use typing in even my personal projects. However even then I intentionally do not use typing in small Python snippets I post to StackOverflow and similar sites.

2 Likes

Very interesting insight. Most of my work in using Python types is in application development work and indeed leaning toward overly-strict types tends to be the easier thing to do.

My main foray into a library that intentionally supports type checking (from many checkers) is with trycast. That library only defines 3 functions, but their signatures are quite general and rather intimidating.

Library functions with tricky signatures

Granted those functions are intended to be extremely general.

In order for the library to effectively support multiple typecheckers (specifically, 4 of them), I had to add an automated test for each typechecker and fix errors that any of the checkers choked on. In many cases I had to add checker-specific ignores in many places[1] which took a lot of extra time and effort.


  1. In the main trycast.py file, searching for # type: ignore gives 58 ignores in an 1086 line file. That’s ~5% of lines requiring an ignore! ↩︎

5 Likes
  • Different behaviors between type checkers making it harder to learn the type system.

I support the idea that different type checkers can accommodate different sets of users and needs. But some divergences can hurt. I decided to delve deeper into typing in order to contribute to a project and faced some problems trying to make the annotations work for both Mypy and Pyright. In all of these potentially fixed cases (if merged) by @hauntsaninja, Pyright respects and uses the explicit type hint. But Mypy ignores what seems to be valid typing information: Believe more __new__ return types by hauntsaninja · Pull Request #16020 · python/mypy · GitHub

Some undocumented idiosyncrasies that start to arise between type checkers can confuse people learning the type system.

1 Like