Proposed new typing governance process

I propose a new way to govern the Python type system: a committee that maintains a
specification and conformance test suite.

Motivation

There are a few interrelated problems with the way we currently manage the Python type system:

PEPs are the only specification

The Python language reference covers the symbols in the typing module, but does
not (and should not) go into detail on how the full type system should work.
But PEPs aren’t meant to be living documents; they are change proposals.

It’s hard to clarify the specification

Because the PEPs are the only specification we have, anything that could be seen
as a change to the specification would theoretically require a new PEP. But that
is often too heavy a process for a small change.

The type system is underspecified

While the PEPs provide a specification, they are often not sufficiently precise
(sometimes intentionally so). This is especially true as the combinatorial
complexity of the type system has grown.

It ends up falling to individual type checkers to decide how to navigate
underspecified areas. In cases where type checkers informally coordinate, this
results in de facto standards that aren’t clearly recorded anywhere, making
the type system less accessible to newcomers.

The Steering Council is not well-placed to solve the above problems

The SC has the entire language in its remit, and is not well-placed to make
decisions that are purely about the type system, if only because they don’t have
the time to deal with type system arcana alongside their other responsibilities.
This is similar in spirit to the reasons why the Steering Council sometimes uses
PEP delegation.

Typing Council

We propose the creation of a new group; let’s call it the Typing Council. This group will
be responsible for developing and maintaining the Python type system, and
responsible for solving the above problems.

The “operations and process” section describes how this group would operate and
be governed.

The more exciting “projects” section describes solutions to the above problems
that the Typing Council could shepherd.

Operations and process

We’re open to changing the details here, but this is a possible description of how
the council would operate.

The council has three members, who are appointed by the Steering Council from among
prominent community members, such as Python core developers and maintainers of
major type checkers. Normally council members serve one-year terms, renewable
indefinitely, but they may resign or be removed by the Steering Council at any
time. (Removal by the Steering Council should be an exceptional situation in
case of major misconduct, inactivity, or similar situations.)

The council would operate primarily through reviews of GitHub PRs. Regular
meetings are likely not necessary, but the council may set up video calls, a
private chat, or whatever other mechanism they decide upon internally.

The council should aim for transparency, posting all decisions publicly, with a
rationale if possible.

Relationship with the Steering Council

Just like today, the Python Steering Council would remain responsible for the
overall direction of the Python language and would continue to decide on
typing-related PEPs. However, smaller changes to the type system could be made
by the Typing Council directly, and the Steering Council may delegate the
decision on some PEPs to the Typing Council.

Some examples of how past and recent issues could have been handled under this model:

  • A PEP like PEP 695 (type parameter syntax),
    which changes the language syntax,
    would be decided upon by the Steering Council, but they could consult the
    Typing Council for opinions or endorsements. Similarly, PEPs like
    PEP 702 would be decided upon by the Steering
    Council, because it expands the type system beyond pure typing. Other examples we
    expect would be decided by the SC include PEPs like
    PEP 718 and PEP 727.
  • A PEP like PEP 698 (@override),
    which affects only users of type checkers
    and does not change the overall language, would likely be delegated by the SC
    to the Typing Council for a decision, exactly as any other PEP delegation.
    Other examples we expect would be delegated include PEPs like
    PEP 647, PEP 655,
    PEP 673, PEP 675.
  • Adding a smaller feature, such as Never as an alias for NoReturn, would be
    done by means of a PR to the spec and conformance test suite. The Typing
    Council would then decide whether or not to merge the PR. They may ask for the
    feature to be specified and discussed in a PEP if they feel that is warranted.
  • If there is confusion about the interpretation of some part of the spec, as
    happened recently with partial stubs in PEP 561,
    somebody would make a PR to the typing specifications to clarify the
    spec, and then the Typing Council would decide on the spec change.

Projects

Here are some efforts a Typing Council would be responsible for.

Conformance test suite

A conformance test suite would provide machine checkable documentation for how
type checkers should check Python code, accompanied by the results of major type
checker implementations on the test suite. A rough sketch for what this could
look like is the hauntsaninja/type_checker_consistency repository on GitHub.

This would contain prescriptive tests from behavior prescribed by previous PEPs
and descriptive tests that let us document behavior of existing implementations
in areas that are not prescribed by any standard. These descriptions would be
useful to inform efforts below and to identify areas of focus for
standardization.
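
As a rough illustration of the prescriptive/descriptive split, one way a test case and its per-checker results might be recorded is sketched below. Everything here is an assumption for illustration: the class and field names, the checker names checker_a and checker_b, and the sample results are invented and do not reflect the layout of any existing repository.

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    PRESCRIPTIVE = "prescriptive"  # behavior prescribed by an accepted PEP
    DESCRIPTIVE = "descriptive"    # observed behavior with no standard yet


@dataclass
class ConformanceCase:
    name: str
    kind: Kind
    source: str  # the Python snippet handed to each type checker
    # Checker name -> True if the checker reported no errors on the snippet.
    results: dict[str, bool] = field(default_factory=dict)

    def checkers_disagree(self) -> bool:
        return len(set(self.results.values())) > 1


def standardization_candidates(cases: list[ConformanceCase]) -> list[str]:
    """Return names of descriptive cases where the recorded results differ."""
    return [
        case.name
        for case in cases
        if case.kind is Kind.DESCRIPTIVE and case.checkers_disagree()
    ]


# Placeholder data: the checker names and results are invented.
cases = [
    ConformanceCase(
        name="union_syntax",
        kind=Kind.PRESCRIPTIVE,
        source="x: int | str = 1",
        results={"checker_a": True, "checker_b": True},
    ),
    ConformanceCase(
        name="protocol_param_names",
        kind=Kind.DESCRIPTIVE,
        source="...",
        results={"checker_a": True, "checker_b": False},
    ),
]
print(standardization_candidates(cases))
```

Keeping the kind on each case would let a report separate conformance failures (prescriptive cases) from open questions (descriptive cases where implementations differ).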

Specification for the type system

A specification could initially be created by stitching together the
specification sections from the existing PEPs, and then gradually improved to
clarify points of confusion and cover more areas.
Kevin Millikin’s document on “Python Static Types”
could provide a basis for formalizing much of the spec.

The specification has a few audiences:

  • For type checkers, it provides a description of how an idealized type checker
    should behave. Individual type checkers have different goals and technical
    constraints and they are free to deviate from the spec if they do not have the
    resources to fully implement it or if they believe a different behavior better
    serves their users. However, they should document such deviations from the
    spec.
  • For projects such as typeshed, or libraries that want to be compatible with
    multiple type checkers, it provides a set of rules that they can follow to
    make their code understood by type checkers.
  • For people who want to propose changes to the type system, it provides a
    foundation for any new proposals.

There are different opinions within the community about how formal such a
specification should be. This document does not aim to resolve those
disagreements, but it provides a process that would enable the creation of a
spec at the community’s desired level of formality.

User-facing reference for the type system

Documentation is important for the success of the Python type system, so
the Typing Council should ensure that there is good documentation for the
type system.

As mentioned previously, PEPs are point-in-time change proposals that are aimed at
multiple audiences and are hard to clarify. This makes them ill-suited as user
documentation. The specification discussed in the previous section would
be a living document, but it would likely be too technical to serve as
documentation for normal usage.

Therefore, a separate user-facing reference for the type system would be
useful. Such an effort could expand the documentation on
Static Typing with Python — typing documentation.

Path of this proposal

This is an informal writeup. I previously shared it privately with a few people (including @hauntsaninja, who helped to greatly improve the text; thanks!), to a generally positive reception. If it is received well, I will convert it into
a PEP to be approved by the Steering Council. Then the SC will appoint the initial
Typing Council, which should then start putting together the spec.

30 Likes

Thank you for the thorough write-up!

A couple of process questions:

Would you see the specification for typing live within the Python documentation (and hence have an implicit versioning by Python version), or e.g. live within the python/typing repo? This might play into some of the backwards-compatibility points @pf_moore raised on the type guard thread.

Is the tooling remit of the new committee limited to the major type checkers, or is talking to e.g. IDEs, and considering impacts on typing-adjacent tools/libraries, also included?

A

I think it makes more sense in a separate repo, probably python/typing. The type system specification itself should not depend on the Python version; I would not want a situation where TypeGuard means one thing in Python 3.11 and another in Python 3.12. We have the typing-extensions library that has the explicit goal to make typing work mostly the same across Python versions.

Of course there are aspects of typing that differ across Python versions, due to new syntax. For example, the spec would likely say that A | B and Union[A, B] mean the same thing, but for completeness it may mention that the former will only work on Python 3.10+ (or within type annotations with from __future__ import annotations).

My first instinct is to say that any users of the type system should be included. For example, IDEs should refer to the spec on how to interpret typing information from libraries, and runtime type checkers like Pydantic should look at the spec for how to interpret types at runtime. However, I’m curious to hear other perspectives here.

3 Likes

I think stitching together the PEPs will not quite work. We will end up with a document which is unclear, ambiguous and hard to iterate on.

I would recommend instead starting small and gradually expanding, so that at every point the specification only covers a subset of the type system, but is self-consistent and accurate with respect to that subset. Kevin specified most of PEP 484, with function subtyping being the only notable omission, so his draft does feel like a natural starting point.

1 Like

I see the appeal of that approach, but I don’t think it is realistic. Formalizing the whole spec is a lot of work that a lot of the community may not have an appetite for. Kevin Millikin’s existing doc is a good foundation, but it’s already 25 pages long, does not cover Callable or the many features added since PEP 484 (e.g., Protocol, TypedDict, Literal, ParamSpec, TypeVarTuple), and is sure to attract a lot of disagreement.

Starting with the concatenated PEPs will make the specification as good or as bad as the current state, but it quickly gives the Council the opportunity to start amending specific unclear areas, such as the PEP 561 confusion around partial\n, that I cited above. That feels, to me, more valuable to users than a fully formalized spec. If and when we are ready, we can start replacing sections of the spec with more formalized text.

3 Likes

I started creating the draft “stitched-together” spec: GitHub - JelleZijlstra/typing-spec: Draft specification for the Python type system. I’m going over the PEPs chronologically, up to PEP 570 now. Contributions to add more PEPs or improve the flow of the document are welcome, but no substantive changes to the spec should be made until the Typing Council actually exists.

I understand your rationale, but I don’t think there are any shortcuts on the path to a good spec. It will take at least the same amount of work (probably more!) to get a stitched-together version to the level Kevin’s draft is at.

As for disagreement, I think this is inevitable and in fact “a good thing”. How can we continue evolving the type system if we cannot agree on how the fundamentals of the type system work?

1 Like

I also think it’s better to start with a formalization. There are known deficiencies with the PEPs that have come up quite a bit recently. By stitching together the existing PEPs as a starting point, we’re only starting with definitions we already know are deficient. If we start with a formalization effort, nothing has to change or be accepted prior to an appropriate council approving it; the status quo doesn’t change until such a time that a formalization is approved, but it places the necessary effort in the right place from the start.

1 Like

It’s also worth being explicit about what audience these specifications are for. Implementers of type checkers? Python users who want to annotate their code? While accurate and complete documentation is good for everyone, there are trade-offs in how things are presented that depend on the audience.

If we’re not clear on who the intended audience is, interested parties won’t be able to meaningfully offer feedback, or know if their feedback would be useful.

3 Likes

I think that’s actually kind of covered in the original post, but not exactly resolved, in fact…

I personally think this defers this for too long. I’d rather ask the steering council to approve an intentional pair of resources to be under the purview of this new typing council, knowing we have multiple audiences, as this sets a clear path and intention to support multiple audiences:

  1. The formal specification. This is the source of truth for how the type system currently works and should remain up to date with all changes as a living specification. This is for implementors, be they type checkers or people adding to the type system and needing to remain consistent with the existing system. It may be very technical and not intended as “light reading”.

  2. End user documentation. This can include summaries, explanations of common patterns in typing, frequent misconceptions, and even a section on things that the type system does not currently cover. Any prose documentation on the type system would go here. Implementors may benefit from this, but the primary audience for this would be end users and ensuring they have the tools and knowledge needed to work with the type system, rather than feel as if they are fighting it.

It might seem obvious that that would already be the case, but I think being explicit in goals here may help ensure no audience is left out.

7 Likes

As a general comment, I really like this proposal, so Jelle, thank you for putting it together!

My 2c on stitching the PEPs together to create an initial specification: I’m in favor of it. I think we need community buy-in for any standardization / formalization effort to be successful, which means that it would be a good idea to demonstrate some value before asking for a lot of up-front effort. So I like the idea of capturing the current state of the world in a doc that can be improved (or entirely rewritten!) in parallel to developing a conformance test suite, which I see as the practically useful piece that will show the value of having a precise spec.

9 Likes

There’s a lot to like in this, but I especially love the idea of the conformance testsuite.

On that subject, a question:
The type_checker_consistency repo has some notes on requiring typing PEPs to include tests. I think those suggestions are great, but I don’t see them reflected in this thread. Is that idea under consideration right now? Or perhaps do we need the testsuite to be established first?

I’m also curious if the testsuite would be an appropriate place to record known-inconsistent behaviors, or if they should be put somewhere else. As a concrete example (of no particular importance, there are many options like this):

from typing import Protocol

# mypy considers A to implement P
# pyright does not: Parameter name mismatch: "x" versus "y"
class P(Protocol):
    def foo(self, x: int) -> int: ...

class A:
    def foo(self, y: int) -> int:
        return y

It might be nice to collect short snippets like this to be able to record differences in a very concrete and digestible format and decide which ones are worth addressing. As a user, it would be awesome to be able to help build out the knowledge base regarding these behaviors.
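
One possible reason a checker might care about the parameter name can be shown at runtime. This is only an illustrative sketch: the helper functions call_positionally and call_by_keyword are invented here, and it says nothing about how either checker actually reaches its verdict.

```python
from typing import Protocol


class P(Protocol):
    def foo(self, x: int) -> int: ...


class A:
    def foo(self, y: int) -> int:
        return y


def call_positionally(p: P) -> int:
    # Works for A, because the argument is matched by position.
    return p.foo(1)


def call_by_keyword(p: P) -> int:
    # Valid according to P's signature, but A has no parameter named "x".
    return p.foo(x=1)


print(call_positionally(A()))
try:
    call_by_keyword(A())
except TypeError as exc:
    print(f"keyword call failed: {exc}")
```

Whether that keyword-call hazard should make A incompatible with P is the sort of question a descriptive test case could record until the spec settles it.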

1 Like

Glad you like the idea of building out a test suite!

curious if the testsuite would be an appropriate place to record known-inconsistent behaviors

Yes, I very much intend this! I need to put in a little more basic work, but please feel welcome to make PRs. While I intend for the initial focus to be descriptive, not prescriptive, I’m hopeful this will be a good way to identify behaviour that could benefit from standardisation.

notes on requiring typing PEPs to include tests … suggestion is not reflected in this thread

Well, I’m the author of that suggestion, so you can assume I think it’s a good idea :wink: And certainly, any Typing Council would be happy to encourage additions that test new specifications.

But a “requirement” is the introduction of a specific process, and in many ways, I view this Typing Council proposal as a means to figure out and iterate on the processes that work best for the community.

1 Like

My leaning would be that stitching PEPs together into a document with sections/subpages is valuable, but more as something that evolves into user-facing documentation than as a type system specification. I view the current PEPs as sitting in between an ideal type system specification and a user document in detail/formality. But treating it as a proto-user document and then simplifying it feels better than trying to add to/adjust it into a more precise specification.

For type system specification I’d also prefer building test suite first not after. We already have a reasonable model/picture of type system in our heads from current PEPs + type checker implementations. To work out a more complete/precise specification instead of directly jumping in and writing it, feels easier to start by building up test cases of behaviors we agree on. These could be initially copy of mypy/pyright/pytype/pyre test suites and then removing duplicate cases/unclear cases.

Mypy test cases that pyright fails and vice versa would be very interesting and good starting area for discussion. As conformance test suite grows it will then be easier to write down rules consistent with those tests and determining specification from there.

1 Like

Agreed. And while we’re on the subject of goals, maybe it’s worth being explicit about the context, as well. Up to now, typing has very much been in an “experimenting, innovating and developing” phase. And that’s helped it get as far as it has. But now that typing is mainstream (it’s used as a key component in the workflow of a significant number of projects, and essential to the UX of many developer tools) maybe it’s time to move into a “stabilisation and consolidation” phase. The focus on consolidating specifications and conformance feeds into this, and I think that by explicitly making this a goal, we can direct energy towards it, and maybe also rein in the continual development for a while, taking time to give the documentation effort a stable base to work on.

3 Likes

Let me see if I can summarize what is being proposed:

Comments:

  • I think it would be useful to have a standing (set of) individuals that the Steering Council could consult on typing-related questions & decisions, such as a Typing Council :+1: (+1)
  • I agree it’s important that a user-facing reference for the type system be created.
    • Newcomers to Python typing have no clear single place to understand how to use typing effectively, except per-typechecker documentation such as the mypy documentation.
    • For the initial version, I don’t know whether such a Typing Council would seek to put together that reference themselves individually, or delegate to others.
    • Presumably future Accepted typing PEPs would be responsible for updating this documentation as part of implementation work.
  • A conformance test suite would certainly help stabilize typing behavior across checkers, where such stability is being increasingly asked for. :+1: (+1)
    • For the initial version, I don’t know whether such a Typing Council would seek to put together the test suite themselves individually, or delegate to others.
      • I expect creating the initial version would be a massive undertaking, requiring effectively replicating the entire test suite of one or more existing type checkers.
    • Presumably future Accepted typing PEPs would be responsible for updating this test suite as part of implementation work.
  • I’m a bit less convinced that a grand (prose) specification of the typing system would be useful. It’s not clear to me how this would be different than just concatenating all of the existing PEPs together. And who would be the audience for such a specification exactly, that couldn’t already get the same information from a PEP? :thinking: (-0)
    • It seems to me you could probably just have an index containing each typing feature and link to the associated PEP.
      • In fact we already have an index of typing-related PEPs, although I think an index that was reorganized to group PEPs by common feature, rather than just being a simple table, would be more useful.
    • The one benefit I can think of to having a distinct specification of the typing system is that such a specification could be updated more easily than a PEP, which is generally immutable after acceptance.
3 Likes

I think that might be what we call the “1.0” testsuite.

The initial version should be very much smaller for a number of reasons, most importantly “inertia and startup cost”. But also we may find that the structure of the tests needs to change to accommodate everyone’s needs, and such adjustments will be much easier while the suite is small.

I have some past exposure to this kind of cross-implementation testsuite via JSON Schema, and it works extremely well for them. Their spec is explicitly versioned, and there are other differences, but there may be useful takeaways from their prior art. Notably, new spec versions are written with test cases as part of the process, and it’s a great way both for implementations to validate and for the spec authors to get some confirmation back from the implementers.

1 Like

Overall concept sounds like a good idea to me, but I see a potential middle ground between “only initially specify a subset” and “bulk copy & paste from the PEP specification sections”: adopt the approach of building out a self-consistent specification from an initial core, but include a stub section with relevant PEP references for every topic not yet covered in clarified prose (this approach could be used for both the implementor facing formal spec and the user facing docs).

Expanding on each such section might start with a PR based on the text in the relevant PEPs, but could be tidied up and better integrated with the rest of the text in the initial PR review.

3 Likes

As someone looking to learn all the caveats of Python’s type system, and as someone who has been burned enough times by a piece of code being an error in one type checker but passing in another type checker, I am really excited about the conformance suite. It’ll make it really easy for people to find outlier examples and contribute to the type checkers.

If possible, can this test suite have a surplus of comments explaining test cases and their rationales? That would be an amazing resource.

4 Likes

Nothing more than a big :heavy_plus_sign: :one:, but -

As someone who leans on the type system (and a maintainer of libraries with a heavy reliance on it) and with what feels like a great increase in the usage of typing throughout the community, the idea of having streamlined specifications and consistent, intuitive usability is exciting.

Also (hopefully) with the added benefit of fostering collaboration and broader adoption of types among the Python dev community.