A more useful and less divisive future for typing?

mikeshardmind · September 27, 2023, 7:10pm

It provides a better definition for unions than we currently have, unions currently have some type checker discrepancy that was brought up at the most recent typing summit.

It plays nicely with structural subtyping, nominal subtyping, generics, and typing by value (Literals) because the subtyping relation includes them.
It also works properly for refinement types, which Literals could be considered a subset of, are not fully supported by python (ie. u16 {x int | where 0 <= x <= 65535})

It provides clear definitions, including proofs with machine verification that can be translated to python types and which also include support for Intersections and type Negation (Not), and show us where it is provably safe and meaningful to apply/compose type negation in a gradually typed type system.

As all of this has been proven academically and there are published formal proofs, I have no qualms in suggesting set-theoretic typing as a strong basis for future work and it is my belief that it can help bridge the gap between various Python audiences in at least 2 ways:

Being able to think about types as sets gives people another thing they might already know something about that gives them another way for it to “click”.
Stronger definitions let tools be as correct more quickly and be able to do more of the heavy work for users, confident they are doing the right thing by leaning on what’s proven.

While higher kinded types can be handled by set-theoretic typing, I find myself agreeing with the developers of Roc lang (opinionated) that adding them both isn’t necessary and may lead to more conflict than useful development.

Additionally, adding higher kinded types comes with a downside mentioned in roc-lang FAQ that I’ll quote directly:

It’s been proven that any type system which supports either higher-kinded polymorphism or arbitrary-rank types cannot have decidable principal type inference. With either of those features in the language, there will be situations where the compiler would be unable to infer a type—and you’d have to write a type annotation. This also means there would be situations where the editor would not be able to reliably tell you the type of part of your program, unlike today where it can accurately tell you the type of anything, even if you have no type annotations in your entire code base.

While they enable things people may associate with type systems due to Haskell, (well-typed function currying, well-typed monads) other languages that embrace very detailed type systems like Ocaml are finding more use from algebraic effects, and I think it might be a good idea for python to wait to see how it settles in the languages that are pioneering here before dipping towards one or the other, and they need to be balanced against the loss of automatic decidable type inference being possible.

guido · September 27, 2023, 8:33pm

If you understood where I’m coming from you wouldn’t rely on the academic origins of these ideas to convince people to trust them.

Skimming the discussion you linked about unions it seems that on the one hand, there’s clearly something wrong with how it’s formulated in PEP 483 (and we should try to come up with a PR that fixes it), on the other hand it feels like a bunch of people yelling at each other “you’re wrong” – “no, you are wrong, you misunderstand me” – "no, you misunderstand me.

Going back to the subject line of this thread, it feels like whatever the benefits of a set-theoretic type system, “less divisive” does not appear particularly likely at this point.

I’m happy to let it rest – we should probably focus on smaller problems where we can get agreement. Gradually improving the specs sounds like it will be useful, but it’s not the only thing. I’ll see you in the discussion about the Typing Council (eventually, I have other things to catch up on).

mikeshardmind · September 27, 2023, 9:52pm

I feel like much of that devolved because we were coming at it from both differing perspectives and differing understandings without an agreed-upon definition. There were many points of agreement throughout that, but point taken there.

Sure thing. I somewhat disagree on the focus, but only to an extent based on my belief of clearing up definitions from the bottom up. I think we’ll reach a point of reconciling everything over in that process one way or another with time.

davidfstr · September 30, 2023, 6:12pm

Nearly all the type hints I use are relatively simple. FooClass, FooTypedDict, Optional[something], str, int. And it hasn’t been a problem. I’d argue that simple type hints are pretty useful.

Do you ever use Optional[something]? I’ve found lots of code that wasn’t considering a possible None case. “Null pointer exceptions” generally just don’t happen to me in well-typed code.

Oof. Such projects are generally not a good fit for typing IMHO.

Interesting. I know the TypeScript community often has foopackage libraries paired with a types-foopackage that is sometimes maintained by independent people. Looks like there are some similar patterns with libraries like types-boto3 (which has types for the popular boto3 library).

pf_moore · October 2, 2023, 12:30pm

It probably depends on whether you’re writing libraries or applications, and on whether you want to support duck typing or not.

I don’t recall finding anything like that. But I can’t say that I’m sure, it’s equally likely that I thought of a missing None check as a “typo” or something equally shallow, and didn’t really consider it deeply.

You’re right, though. Simple cases like this are useful. But again, keeping things this simple, without abandoning duck typing, seems difficult if not impossible. So you can sort of have one or the other, but not both.

Fair. But therefore there’s a lot of code (even published code) out there that should be left untyped, and hence tools giving a degraded experience on untyped code^[1] is not ideal.

But having said that, you are saying that typing “isn’t a good fit” for a personal project (small and only one maintainer). Is that really what you meant?

which seems to be what people are suggesting is happening ↩︎

jamestwebber · October 2, 2023, 4:21pm

I think there’s a minimum size where I’d start to consider it, but I still find it useful for personal projects for a couple reasons. The main one is that it’s a concise way of documenting functions and methods for future-me. This feeds into IDE integration, which will give me a nudge when I’m probably making a silly mistake. And finally, when I take the time to write type hints it can help me think about better design for my code. When the signature starts to get really complicated it’s a sign to me that I should be breaking things apart.

YMMV of course but for me, type hints are useful even just for myself. But a lot of this is dependent on a person’s personal workflow.

timkofu · October 3, 2023, 6:31pm

Strong typing makes a programming language viable for long-term work. Coming from a (staticall,strongl)y typed language to Python without types feels like driving on a six lane highway with no lane markers. So we very much appreciate the current typing effort. A type enforcer built into the interpreter activated via a command line switch or an envvar would be superb.

davidfstr · October 8, 2023, 12:41am

Yes. If a particular person writing Python is not otherwise that interested/familar in typing, I think that it’s not especially useful for them to attempt to learn/use typing for a small personal project. However if the project later starts to become large or popular then I think it would be worth retrofitting typing in later to aid in maintainability.

I’m very comfortable with Python typing myself - as I use it at work regularly - so I tend to use typing in even my personal projects. However even then I intentionally do not use typing in small Python snippets I post to StackOverflow and similar sites.

davidfstr · October 8, 2023, 1:44am

Very interesting insight. Most of my work in using Python types is in application development work and indeed leaning toward overly-strict types tends to be the easier thing to do.

My main foray into a library that intentionally supports type checking (from many checkers) is with trycast. That library only defines 3 functions, but their signatures are quite general and rather intimidating.

Library functions with tricky signatures

trycast() - 52 line signature
isassignable() - 20 line signature
eval_type_str() - 1 line signature

Granted those functions are intended to be extremely general.

In order for the library to effectively support multiple typecheckers (specifically, 4 of them), I had to add an automated test for each typechecker and fix errors that any of the checkers choked on. In many cases I had to add checker-specific ignores in many places^[1] which took a lot of extra time and effort.

In the main trycast.py file, searching for # type: ignore gives 58 ignores in an 1086 line file. That’s ~5% of lines requiring an ignore! ↩︎

insilications · October 11, 2023, 10:20am

Different behaviors between type checkers making it harder to learn the type system.

I support the idea that different type checkers can accommodate different sets of users and needs. But some divergences can hurt. I decided to delve deeper into typing in order to contribute to a project and faced some problems trying to make the annotations work for both Mypy and Pyright. In all of these potentially fixed cases (if merged) by @hauntsaninja, Pyright respects and uses the explicit type hint. But Mypy ignores what seems to be valid typing information: Believe more __new__ return types by hauntsaninja · Pull Request #16020 · python/mypy · GitHub

Some undocumented idiosyncrasies that start to arise between type checkers can confuse people learning the type system.

gramster · October 24, 2023, 9:27pm

Just a couple of points here:

there is both a ‘go to definition’ and a ‘go to declaration’. ‘Go to declaration’ will go to a stub definition if one exists, and ‘go to definition’ will go to the concrete code, if it exists statically.
we try to be forward-looking, and we don’t see annotations in docstrings as being something that is common, consistent, or likely to be used much (if at all) in new code. Such docstring annotations are often unparseable, miss imports, are less likely to be maintained, etc etc. I did spend an enormous amount of time trying to generate stubs for Python scientific packages that at least had a consistent (numpydoc) format (see GitHub - gramster/docs2stubs: A tool to generate stubs for Python packages using numpydoc-format docstrings), and if I could go back in time I would never have done that; I don’t think it bought us much at all in the end.
on the other hand, pyright does do type inference, and parses Python code to an AST (and then a code flow graph), so we certainly do use an AST.

mikeshardmind · January 9, 2024, 1:56pm

Just going to add a recent extra example of tools being placed above users, the draft update to PEP 646 which was done because some type checkers made a mess of special casing that made something hard for them. Specification is literally being written to the detriment of users due to decisions that don’t put the user before the tool.

see PEP 646: Update draft (#1856) · python/peps@b95239f · GitHub which claims removal for that reason was fine because:

Apparently the implementation would be tricky in Pyre because of special-casing around Union. That could be evidence that it would be tricky in Mypy and pytype. Since we don’t have a specific use-case in mind, I think this is fine.

and Remove support for `Union[*Ts]` · Issue #6892 · microsoft/pyright · GitHub , the primer run for it showing that at least 2 projects were already willing to start using it at the experimental stage (and had a use for it)

Removing this from the draft was done purely because one type checker had an issue with it, to the detriment of many things for users including consistency.

erictraut · January 9, 2024, 6:12pm

Removing this from the draft was done purely because one type checker had an issue with it

You’re ignoring an important part of that statement. The authors admitted that they didn’t have a specific use case in mind. Adding functionality to the type system when there is no known use case is not a good idea, and I applaud the PEP authors for recognizing this.

This was not a case that involved consistency. In fact, the functionality that they removed from the spec was inconsistent with existing uses of Union and ran counter to the efforts to deprecate Union in favor of |. By removing this functionality, they made the spec more consistent with the existing type system.

Also, implementation complexity is a consideration that should be taken into account by PEP authors. From the time that the first draft PEP 646 was presented for general review, it took almost three years for mypy to add support for it. Pyre still hasn’t added full support, and pytype hasn’t even started to add support for it. New type system features that are too complex to implement in type checkers do not help the community. If you are involved in drafting a typing PEP, please take that into consideration.

mikeshardmind · January 9, 2024, 8:20pm

I believe I addressed that part directly. Your primer run shows there are real use cases for it. That the authors did not themselves have a case doesn’t mean there aren’t cases.

How so? Generic[*Ts] is fine, but then how do you type some return types based on this? In my mind the mistake is thinking Union can and should be deprecated, rather than thinking of | as shorthand that is more ergonomic in most cases.

Toy example below, but there are plenty of more interesting cases than this with data workflows, as your primer run reveals.

from typing import Self

class ExampleIterator[*Ts]:
    def __init__(self, given: tuple[*Ts]) -> None:
        self.given = given
        self.it = iter(given)
    
    def __iter__(self) -> Self:
        return self

    def __next__(self):
        return next(self.it)

I understand that you as a type checker maintainer personally bear a lot of the effort involved on the implementation side of this equation, but as more and more decisions favor the ease of implementation over giving users tools to express types, typing becomes increasingly more restrictive on what can be written socially. Things go from being typed as Any or untyped because the type system can’t handle it yet, to people rewriting perfectly good code when it turns into “and it never will be supported”. Is it reasonable to say the implementation agnostic specification itself should cater to how implementations have boxed themselves into a corner by special-casing things in their code? Is it not better to specify more and just let those who want the feature in their type checker of choice go help contribute to it?

I don’t make that argument lightly here, there are plenty of cases where “you aren’t going to need this” or “This should be supported, but maybe we should spell it a different way for this particular case because of [reasons]” are quite reasonable responses, but I can’t see how such an argument applies here.

Liz · January 9, 2024, 9:19pm

I initially avoided chiming in here given that this started with me writing something inflammatory about the type system on github. These sorts of problems are exactly why I felt so incensed by the situation. I shouldn’t have phrased what I did how I did, but the frustration I feel by all of these people claiming to make tools for users, and then intentionally make it so that the system can’t check certain things for users is very real.

next(iter(tuple[*Ts])) returning the result of this is impossible to type. Allowing Union[*Ts] is both the obvious answer here, and not allowed because people picked easy for tooling over possible for users.

In the interactions I’ve had with discussing python’s type system, people seem to think this is good and virtuous. You incrementally support more and that’s supposed to be a good thing right? In reality, that doesn’t work for specifications. You get ossification as implementations block eachother on compatability. How many things are in the typeshed with a comment about a specific type checker not handling something correctly? Special casing that turns into a bigger problem down the road. This becomes self-fulfilling if even one implementation does this if implementations are allowed to be a blocker for the specification of something that makes sense.

At some point if typing is going to keep up with demands from users, it needs to stop being implementations that dictate what the type system supports, and only a matter of if a type checker is compliant with a specification made to support real use.

srittau · January 9, 2024, 11:21pm

Not many nowadays. This used to be a lot worse, since mypy and typeshed were initially tightly coupled. But while typeshed tries to stay compatible with the major type checkers and tries to avoid regressing, we don’t go out of our way to work around type checker bugs anymore, except in major cases, on trivial workarounds, or because our tooling (which depends on mypy) demands it. And we do this, exactly because we do care about our users, despite what some in this thread have claimed.

This is why we have the typing council now, which works on turning the existing collection PEPs into a cohesive - and hopefully comprehensive - spec. I’m not sure what else you are asking for?

pf_moore · January 9, 2024, 11:51pm

For me, “prioritise user experience over implementation convenience” is the missing aspect. I don’t know if it’s an explicitly stated goal of the council to do this, but if it isn’t, then it should be (the point of this thread being that not everyone feels that it’s self-evident, so let’s state it explicitly).

Liz · January 10, 2024, 12:42am

Not only is it not self evident, but recent conversations with those heavily involved in typing have indicated prioritizing implementation convenience over user experience.

Jelle · January 10, 2024, 2:15pm

I think PEP 729 – Typing governance process | peps.python.org is what you’re looking for: the Typing Council is mandated to make the type system usable and useful.

However, I would caution against blanket statements that one concern should always trump another. “Implementation convenience” is important for making the type system useful, because a spec that is never implemented cannot help users. And that’s a real concern, as PEPs like PEP 646 are taking a long time to get implemented by all type checkers. I think it’s reasonable therefore to try to cut the scope of type system changes to ensure that they are implementable more easily and users can reap the benefits of the new feature sooner.

I don’t see a problem with the change @mikeshardmind cites above. The feature was found to be hard to implement and, importantly, there were no known use cases at the time, so it was cut. Now there are known use cases, so it would be good to revisit that decision. If anyone wants to re-add support for Union[*Ts], I’d encourage them to go through the process for updating the typing spec.

mikeshardmind · January 10, 2024, 6:09pm

I want to be clear here that this is actually the biggest part of the issue with this. I don’t think this is something that was well considered when it was cut, because it becomes quite obvious that the lack of use cases was because it wasn’t supported, so nobody was typing those things. The authors not seeing the use case is the problem here. It doesn’t take much exploration to realize this would be needed in some cases, see the very trivial examples above.

The type system is being bolted onto the language, and the gradual nature means people need to look at not only the things they are making now, but how this may interact with other things and things that haven’t been typable so far. I have no reason to think the authors did a good job of that here.

There are a lot of other cases like this historically (such as paramspec only supporting uses that use both args and kwargs), there are also people favoring extremes of consistency for type checkers over users even in current discussions trying to clarify behavior (see: What are the subtyping rules for tuple[T, ...]?) or the fact that we now have a whole thing with StrictTypeGuards. It’s also come up with people preemptively wanting worse behavior for users in intersections to increase the chance type checker maintainers won’t push back against it I’ve been pushing back on this where I see it and I know there’s a use case for more, but it’s happening so much with typing that I’m having trouble even staying motivated for my own use. It’s at the point where it would be less effort spent to rewrite in other languages, than to work for a better experience in python, and the effort would be entirely technical, rather than having people realize technical decisions have social impacts beyond what they personally are effected by.