Should some of the typeshed internal protocols become part of the stdlib?

sirosen · August 14, 2023, 4:02pm

Typeshed has a number of protocols which might be useful.
For example, sometimes an object is mapping-like but not quite a mapping. SupportsKeysAndGetItem is used in some cases, as are some others.
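To make that concrete, here is a minimal sketch of a protocol modelled on typeshed's SupportsKeysAndGetItem. The `EnvLike` class and the `to_dict` helper are hypothetical names invented for this illustration, not part of typeshed.

```python
from __future__ import annotations

from typing import Iterable, Protocol, TypeVar

_KT = TypeVar("_KT")
_VT_co = TypeVar("_VT_co", covariant=True)


class SupportsKeysAndGetItem(Protocol[_KT, _VT_co]):
    """Anything with keys() and __getitem__, which is what dict(obj) needs."""

    def keys(self) -> Iterable[_KT]: ...
    def __getitem__(self, key: _KT, /) -> _VT_co: ...


class EnvLike:
    """Hypothetical mapping-like object: no __len__, __iter__, or get()."""

    def __init__(self) -> None:
        self._data = {"HOME": "/root", "SHELL": "/bin/sh"}

    def keys(self) -> Iterable[str]:
        return self._data.keys()

    def __getitem__(self, key: str, /) -> str:
        return self._data[key]


def to_dict(source: SupportsKeysAndGetItem[str, str]) -> dict[str, str]:
    # dict() falls back to keys() + __getitem__ when the argument has keys().
    return dict(source)
```

`EnvLike` is not a `collections.abc.Mapping`, yet `to_dict(EnvLike())` should both type-check and work at runtime, since the protocol names only the two methods that are actually used.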

These are sufficiently interesting that I would wonder if collections is the right place to expand, perhaps with collections.protocols.

I’m not suggesting bringing in all of them. Just some of the ones which are describing existing protocols as Protocols.

You can look at the current typeshed stub, and I’m sure everyone would have different taste in terms of which ones are most useful. But perhaps a selection could be added in 3.13?

Jelle · August 14, 2023, 4:08pm

@hauntsaninja recently created a package exposing some of them: useful_types ("Useful types for Python", hauntsaninja/useful_types on GitHub). We can definitely think about bringing some into the stdlib, but a third-party library can have a faster release cycle and reach people sooner.

sirosen · August 14, 2023, 5:06pm

I think it’s cool and valuable to provide those types in a package!
(And, as an aside, I love the self-descriptive name.)

Because they are so small, I would hesitate to pull these in at runtime (vs defining the protocol myself).
Maybe I’d use them as an if typing.TYPE_CHECKING: ... import – I think some projects are doing this with typing_extensions to avoid a runtime dependency.
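Roughly, that pattern would look like the sketch below. It assumes useful_types exports SupportsKeysAndGetItem under that name; `merge_into` is a hypothetical helper made up for the example.

```python
from __future__ import annotations

import typing

if typing.TYPE_CHECKING:
    # Only type checkers follow this import; nothing is imported at runtime,
    # so useful_types does not need to be installed where the code runs.
    from useful_types import SupportsKeysAndGetItem


def merge_into(
    target: dict[str, int], extra: SupportsKeysAndGetItem[str, int]
) -> None:
    # With the __future__ import, annotations are stored as strings and
    # never evaluated, so the missing runtime name is not a problem.
    for key in extra.keys():
        target[key] = extra[key]
```

The trade-off is that the protocol name is unavailable at runtime, so it cannot be used in `isinstance` checks or anywhere annotations are evaluated.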

It feels… complicated to have more typing-specific packages. I don’t want to be overly negative, but I have a little trouble imagining myself justifying to the rest of my team that we should add another typing dependency.

If there were a one-stop shop – e.g. if these were included in typing_extensions – it would be easier on users. In part because the arithmetic of “is it worth adding another package to project maintenance?” is simplified.
Is it totally wild to consider features for typing_extensions which aren’t targeting the typing stdlib package?

jamestwebber · August 14, 2023, 5:17pm

It’s sort of a misdirection, but perhaps typing_extensions could have optional extras that pull useful_types etc into the same namespace. Still adds a dependency under the hood but might feel simpler for maintenance purposes.

sirosen · August 14, 2023, 6:35pm

I think it helps? It depends on what you mean by that extra, and what kind of promises / compatibility guarantees it can offer.

If I were to list typing_extensions[useful_types]>=X to my typing requirements, that does me no good if I also need to pin useful_types>=Y explicitly.
And it’s a bit of trouble (but not too hard) for typing_extensions to re-export names from some other package, so the installation and version management story would need to be strong enough to justify that work.

It very well could work – it’s hard for me to say without trying to reason more rigorously through some scenarios – but it feels like it’s skirting the question of whether or not typing_extensions can contain things other than future contents of typing.
If typing_extensions is adding extras for things, why not just add those things to typing_extensions?

hauntsaninja · August 14, 2023, 7:09pm

I’m sorry that it’s hard for you to add a new dependency, but the good news is that all of these are easy to vendor. If it helps you convince your team, note that all maintainers of useful_types are maintainers of typing_extensions and are CPython core devs.

The issue is that typing_extensions is a) heavily special-cased by type checkers and effectively versioned by type checker version rather than install version, and b) so widely used that breaking changes are really costly. See "Adding protocols and aliases from _typeshed to typing-extensions" (Issue #6 in python/typing_extensions on GitHub) for more of my thoughts.

Obviously, if something is upstreamed to CPython in 3.13, typing_extensions will backport it.

sirosen · August 14, 2023, 9:38pm

Thanks for linking that thread and sharing some context! It seems like my timing here is fortuitous in that an answer to a very similar question was reached recently.

I don’t mean to suggest that I’m in an environment in which adding packages is extremely difficult (although some people are). Rather, it’s easy to overlook the fact that package dependencies have costs, even outside of tightly controlled environments.

Every time I want a type from useful_types, I’ll need to decide a bunch of things, like whether or not I want to have a runtime dependency.
Generally these questions are easy to answer, and vendoring or “write it yourself” are always options. But it makes a package a less attractive solution for small typing utilities.

Even if it’s not the direction we’re moving for now, these protocols still feel analogous to collections.abc. As a user, it feels vaguely dissonant that Mapping and MutableMapping are defined, but SupportsItemAccess is not.

AlexWaygood · August 15, 2023, 1:41am

I would personally be open to (and would actually quite like to) add some of these to the stdlib at some point. But I think for many (most?) of these protocols, it’s too soon for that right now. We’ve had plenty of situations at typeshed where we thought a protocol or alias we were using could basically be considered “stable”… Only for us to realise a month or two later that the whole approach needed to be rethought. By adding them to a third-party runtime package, we can let users use them at runtime much more easily than if they’re just in typeshed, but we’ll be able to fix things and adapt more flexibly than if we added them straight to CPython.

srittau · August 15, 2023, 10:33am

In addition to what others have written: An intersection types PEP is being worked on (https://github.com/python/typing/issues/213), which would alleviate the need for some of these protocols as it would most likely make protocol composition easier: SupportsX & SupportsY.
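For comparison, this is a sketch of how composition has to be spelled today, without intersection types: each combination gets its own named protocol class. The protocol names and the `read_and_close` helper are illustrative, not taken from typeshed.

```python
import io
from typing import Protocol


class SupportsRead(Protocol):
    def read(self, size: int = -1, /) -> str: ...


class SupportsClose(Protocol):
    def close(self) -> None: ...


# Without intersection types, every combination needs its own named class,
# and Protocol must be listed again as an explicit base.
class SupportsReadAndClose(SupportsRead, SupportsClose, Protocol):
    pass


def read_and_close(stream: SupportsReadAndClose) -> str:
    try:
        return stream.read()
    finally:
        stream.close()


stream = io.StringIO("hello")
result = read_and_close(stream)
```

With an intersection operator, the parameter could instead be annotated along the lines of `SupportsRead & SupportsClose`, and the third class would be unnecessary. That syntax is only a proposal at this point.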

RonnyPfannschmidt · September 28, 2023, 5:55am

I’m wondering: should some of those be taken as a hint to turn partial ducks into fully fledged ducks?

I haven’t investigated the usages yet, but I’m under the impression that it’s a good idea to investigate whether some of those ought to make use of the ABCs instead of just minimally implementing a subset.

ajoino · September 28, 2023, 10:16am

What do you mean by partial and fully-fledged duck?

RonnyPfannschmidt · September 28, 2023, 10:22am

Given a protocol, I consider a fully fledged duck to be any object that fully implements it.

And a partial duck is any object that implements only a lookalike subset.

kkirsche · September 28, 2023, 11:10am

I’m a huge fan of leveraging protocols for duck-type heavy applications and wonder if some of the names or naming conventions from other languages may make sense to look at for inspiration. Go (golang) for example makes use of this approach through its interfaces, and common protocols are things like io.Reader which is anything that can be read from, io.Writer for things that can be written to, fmt.Stringer for things that can be converted to string via a .String method, etc.

Python is unique, so I don’t mean to suggest we should mirror another language’s choices, but they may provide good boundaries for interfaces that are composable.

sirosen · September 28, 2023, 1:28pm

I think Python has actually started to move in a direction similar to Go, as described. Protocols and types which were in typing have been moved to io and collections to namespace and organize them better.

I would like to see intersection types and a protocol per dunder in a dedicated namespace. That will make it, as others have said in other threads, very easy to write aliases for useful intersections.

However, I also think that we need to find a good way to distribute and socialize the idea of writing protocols and canonically naming them in the SupportsX style. I have taken to doing this in my own code, in imitation of typeshed and useful_types, since opening this thread and it has made a dramatic improvement to the legibility of my typed code. It’s been a huge revelation. Consider:

def build_result(errors: SupportsErrorIter, ctx: Context) -> Result:
    r = Result()
    for filename, err in errors.iter_errors(): ...

Can you guess what errors is? Can you imagine how to write unit tests for this helper?
I may have a real object which looks very unlike SupportsErrorIter. Maybe it’s my FilesContainEmailDataValidator. But because the interface is declared, I can reason better about the code than if the type were not there at all.
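As a sketch of what I mean, the protocol and a test double might look like this. The `(filename, message)` tuple shape, `FakeErrors`, and `collect_filenames` are assumptions made up for the example; my real code differs.

```python
from __future__ import annotations

from typing import Iterator, Protocol


class SupportsErrorIter(Protocol):
    # Assumed shape: yields (filename, error message) pairs.
    def iter_errors(self) -> Iterator[tuple[str, str]]: ...


class FakeErrors:
    """Hypothetical test double; it does not inherit from the protocol."""

    def iter_errors(self) -> Iterator[tuple[str, str]]:
        yield ("a.txt", "bad header")
        yield ("b.txt", "missing address")


def collect_filenames(errors: SupportsErrorIter) -> list[str]:
    # Depends only on iter_errors(), so any conforming object will do.
    return [filename for filename, _err in errors.iter_errors()]
```

A unit test can pass `FakeErrors()` straight in, without building the real validator or anything it depends on.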

Typeshed initially established the legibility of the SupportsX naming scheme, and I am there for it. I’m sold. It allows two inferences for the human reader: (1) this object supports this interface and (2) this other thing is an interface!

Some will likely complain that this is “enterprise-y” or “Java-like” – both meant pejoratively – but the joy of Python still remains. Nobody will require you to write protocols, or to name them per the convention, but the tools will be there for you when and if you need them.

At the end of all that, a question: are folks receptive to and comfortable with the SupportsX naming scheme being something documented as a stylistic convention or recommendation? In the typing RTD site or stdlib Protocol docs?

EDIT / PS: I now realize I’ve gone way OT. Oops / sorry. Just wrote down what I thought. I’ll let this post stand but can move to a new thread if it seems appropriate.