Pre-PEP discussion: typing.tool + inspect.tool_schema(), close the Zod gap for Python agent dev

mvanhorn · May 18, 2026, 6:07pm

Hi, I’m Matt Van Horn. Quick intro since this is a substantial proposal.

I co-founded what became Lyft, founded the smart oven company June (acquired by Weber), and currently ship a few open-source projects. last30days is an agent-powered social-signal search engine I use across my agent workflows (~25k stars). PrintingPress is a Go CLI generator I use for agent-native tools (~3k stars).

Smaller Python merges this year:

Cross-language method suggestions in AttributeError (PR #146407) mapping Ruby/JS/Rust/Swift names to Python equivalents
defaultdict.__repr__ infinite recursion fix (PR #145659)
Man page -X option text wrapping fix (PR #145656)

Disclosure that’s table stakes in 2026: I use AI heavily in my workflow. I am also a person, and I drove the cross-language AttributeError thread (linked above) by hand, revised the design based on @pf_moore, @storchaka, and Terry Jan Reedy feedback, and shipped. Same approach here. The proposal is mine, the citations are sourced, and AI helped me read the room before writing.

Python and the agent era

Python won deep learning. NumPy, PyTorch, Jupyter, Hugging Face, every research lab is on Python; every production inference stack is on Python. That position holds.

What’s genuinely shifting in 2025-2026 is the agent layer: the code that wraps an LLM with tools, runs multi-step loops, ships as a CLI to end users, and gets embedded in IDEs. This is where the most exciting Python work should be happening. It’s not.

The bleed is real and quantifiable:

Vercel’s AI SDK has 2 million weekly downloads. It’s TypeScript. The killer property is Zod as the universal tool-contract format: one schema feeds the LLM, validates responses, types the executor, and types the calling code.
Peter Steinberger, former PSPDFKit founder, now at OpenAI working on agent tooling, is on the record: “TypeScript for web stuff, Go for CLIs, Swift for macOS/UI.” His reason for Go: “agents are really great at writing it, and its simple type system makes linting fast.” He runs 3-8 agents in parallel in a 3x3 terminal grid and writes “pretty much 100% of his code” through agents. Zero Python in his pinned repos.
Every major Python agent framework (OpenAI Agents SDK, LangChain, LlamaIndex, Pydantic AI, Microsoft Semantic Kernel, FastMCP, autogen, instructor) ships its own version of “take a function with type hints, return JSON Schema describing its parameters.” Eleven competing answers. None is canonical. New agent developers pick a framework partly based on which @tool decorator they like.

That last bullet is the crack TypeScript walks through. Zod isn’t stdlib in TypeScript, but it’s so dominant that everyone agrees on it. Python lacks both the dominant third-party answer (Pydantic, msgspec, attrs each have their share) AND the stdlib answer that would standardize the next generation.

This is fixable. Most of the building blocks already shipped.

What we already have (and what’s missing)

Python 3.14 just shipped annotationlib (PEP 749). That module is documented as “tooling for annotations”, runtime introspection at last. PEP 593’s whole purpose was to make Annotated metadata available at runtime; the design quote: “to provide a run-time API to consume metadata, which integrates with the type checker syntactically.” PEP 729 stood up the Typing Council to govern this surface. PEP 746 (authored by Adrian Garcia Badaracco of the Pydantic team) makes Annotated metadata type-safe.

The mechanism is there. The home is there. The governance is there.

What’s missing is the runtime bridge: (function with type hints) → (JSON Schema dict). Every agent framework reinvents it. None is canonical. The stdlib answer hasn’t been written.

That’s the proposal.

Proposal

Add inspect.tool_schema(fn) and typing.tool to stdlib.

Shape A (minimum proposal), inspect.tool_schema(fn) -> dict:

from inspect import tool_schema

def search_web(query: str, limit: int = 10) -> list[dict]:
    """Search the web for recent results.

    Args:
        query: The search query
        limit: Max results to return
    """
    ...

tool_schema(search_web)
# {
#   "name": "search_web",
#   "description": "Search the web for recent results.",
#   "parameters": {
#     "type": "object",
#     "properties": {
#       "query": {"type": "string", "description": "The search query"},
#       "limit": {"type": "integer", "description": "Max results to return", "default": 10},
#     },
#     "required": ["query"],
#   },
# }

Reads annotations via annotationlib.get_annotations() (PEP 749). Reads Annotated metadata per PEP 593. Reads docstrings via inspect.getdoc(). Emits JSON Schema 2020-12 as the canonical dialect. This is pure introspection: it does not validate and does not enforce.

Shape B (recommended extension), @typing.tool decorator:

from typing import tool, Annotated

@tool(description="Search the web for recent results")
def search_web(
    query: Annotated[str, "The search query"],
    limit: Annotated[int, "Max results to return"] = 10,
) -> list[dict]:
    ...

search_web.schema          # The schema dict
search_web.validate(args)  # Raises typing.ToolValidationError on bad LLM args
search_web(**args)         # Normal Python call (decorator is non-invasive)

Uses plain strings inside Annotated as descriptions, does NOT depend on PEP 727 (typing.Doc), which is stalled. If PEP 727 lands later, typing.Doc("...") is also honored.
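To make Shape B less abstract, the following is a rough sketch of a decorator with the same surface, written against today's stdlib. Everything named here is hypothetical: sketch_tool stands in for the proposed @typing.tool, ToolValidationError stands in for typing.ToolValidationError, and the validation rules (required keys, no unknown keys, primitive type match) are one guess at what a thin check could mean.

```python
import inspect
from typing import Annotated, get_args, get_origin, get_type_hints

_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


class ToolValidationError(ValueError):
    """Hypothetical stand-in for the proposed typing.ToolValidationError."""


def sketch_tool(description):
    """Hypothetical decorator: stamps .schema and .validate onto the function."""

    def decorate(fn):
        hints = get_type_hints(fn, include_extras=True)
        properties, required, py_types = {}, [], {}
        for name, param in inspect.signature(fn).parameters.items():
            tp, prop = hints.get(name), {}
            if get_origin(tp) is Annotated:
                tp, *metadata = get_args(tp)
                # The first plain string in the metadata is read as the description.
                text = next((m for m in metadata if isinstance(m, str)), None)
                if text:
                    prop["description"] = text
            if tp in _JSON_TYPES:
                prop["type"] = _JSON_TYPES[tp]
                py_types[name] = tp
            if param.default is inspect.Parameter.empty:
                required.append(name)
            else:
                prop["default"] = param.default
            properties[name] = prop

        def validate(args):
            """Thin check only: key presence and primitive types, no coercion."""
            missing = [n for n in required if n not in args]
            unknown = [n for n in args if n not in properties]
            if missing or unknown:
                raise ToolValidationError(f"missing={missing}, unknown={unknown}")
            for name, value in args.items():
                expected = py_types.get(name)
                if expected is None:
                    continue  # no primitive hint recorded, nothing to check
                accepted = (int, float) if expected is float else expected
                # bool subclasses int, so only accept it where bool was asked for.
                wrong_bool = isinstance(value, bool) and expected is not bool
                if wrong_bool or not isinstance(value, accepted):
                    raise ToolValidationError(f"{name}: expected {expected.__name__}")

        # Attributes are attached to the original function, so calls are unchanged.
        fn.schema = {
            "name": fn.__name__,
            "description": description,
            "parameters": {"type": "object", "properties": properties, "required": required},
        }
        fn.validate = validate
        return fn

    return decorate


@sketch_tool(description="Search the web for recent results")
def search_web(
    query: Annotated[str, "The search query"],
    limit: Annotated[int, "Max results to return"] = 10,
) -> list[dict]:
    return [{"query": query, "limit": limit}]
```

Returning the original function with two extra attributes is one way to keep the decorator non-invasive; a wrapper object would be the other obvious design and would change what type checkers see.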

Why this can land:

The Pydantic team is actively building this surface. Adrian Garcia Badaracco (Pydantic maintainer) authored PEP 746 specifically to make Annotated metadata type-safe, and is the author of the annotated-types package designed at the PyCon 2022 sprints “for use by runtime libraries.” The Pydantic team is not gatekeeping this surface, they are PEP-authoring the infrastructure that typing.tool would consume.

Jelle Zijlstra (Typing Council) is on record in Discourse t/42424: “We’re generally open to adding new utilities to typing to make introspection easier and less error-prone.” He also noted that _strip_annotations() already exists internally in CPython as a private helper: the building block is implemented; only the public API surface is missing.

Carl Meyer (Typing Council, Anaconda) unblocked PEP 649 with the “stringizer” and “fake globals” technique that made forward-reference handling viable. PEP 649 + 749 ship the new annotationlib module in Python 3.14 with the explicit purpose of “tooling for annotations.” This proposal is the natural next floor of the same building.

At the PyCon US 2026 Typing Summit, Conner Nilsen (Meta) presented research showing “type checker feedback moves agents from 79.6% success to 83.9% with 21% fewer steps” on well-typed code. The agent angle is now on the Council’s radar.

The convergence is real: eleven distinct Python libraries in the space all reinvent the same pattern. A canonical stdlib helper standardizes the layer that gets re-implemented in every framework. The cross-language case is sharper still. TypeScript developers don’t pay this tax because Zod is the answer everyone uses; Python developers pick a framework and inherit its idiosyncratic schema generation.

Why this is NOT a Pydantic replacement:

The dataclasses-vs-attrs precedent is instructive. In 2017 Guido emailed Hynek Schlawack at PyCon US, met with Hynek and Eric V. Smith, and proposed adding a subset of attrs’ functionality to stdlib. PEP 557 became dataclasses in 3.7. Both libraries coexist: most users reach for dataclasses first; for advanced needs (slots, validators, complex inheritance) they reach for attrs.

Same play here:

Pydantic does: full domain modeling, value validation, custom validators, union resolution, complex serialization, JSON output, settings management, model coercion, Field metadata, computed fields, discriminated unions. None of this changes.
inspect.tool_schema(fn) does: take a function signature with type hints, return a JSON Schema dict. Period.

Users who need Pydantic’s full feature set keep using Pydantic. Users who just want to expose a function to an LLM get a stdlib answer that doesn’t require pip install pydantic.

Hynek’s import attrs post (December 2021) flagged that the post-dataclasses story has involved “erasure and revisionism, bordering on abuse” of his work. I want to credit Pydantic, msgspec, attrs, FastAPI, annotated-types, OpenAI Agents SDK, Pydantic AI, LangChain, and LlamaIndex by name at the top of the PEP and stdlib docs. The pattern this proposal codifies is theirs; the stdlib helper just collects it.

Open questions worth discussion:

tool decorator name: @typing.tool vs @inspect.tool vs @function_tool vs no decorator at all (function-only via inspect.tool_schema(fn)). I lean toward @typing.tool because the decorator is metadata-stamping at the typing level, but the function-only path is also viable.
JSON Schema dialect: proposing JSON Schema 2020-12 as canonical output. Vendors diverge: OpenAI strict mode mandates additionalProperties: false and every property in required; Anthropic’s SDK silently strips constraints into the description field. The stdlib output is the maximally expressive form; vendor-specific adapters can live in third-party packages or future stdlib additions.
Description source: Annotated metadata (Annotated[str, "description"]) vs docstring parsing (Google / NumPy / Sphinx-style) vs both. I propose both, with Annotated taking precedence. Docstring parser style is bikeshed-worthy.
Type coverage: primitives + Optional + Union + Literal + list / dict / dataclass / TypedDict / NamedTuple feel like the right initial surface. PEP 695 generics, ParamSpec, variadic generics, complex recursive types are deferred. Pydantic BaseModel / attrs class / msgspec Struct via duck-typed delegation.
Validation depth: tool.validate(args) should be a thin schema check, not a Pydantic-level validator. Real validation (custom rules, coercion, default-filling) stays in Pydantic / msgspec / etc.
Provisional or stable: typing API changes have a high bar. I think a provisional release in 3.16 with stable promotion in 3.17 is the right cadence. Typing Council guidance welcome.
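On the dialect question, a vendor adapter can be a small post-processing step over the generic output. The sketch below applies only the two strict-mode rules mentioned above (additionalProperties: false and every property listed in required). The function name to_strict_parameters is hypothetical, and a real vendor may impose further constraints that this sketch does not model.

```python
import copy


def _apply_strict(node):
    """Mutate a schema node in place, recursing into properties and items."""
    if not isinstance(node, dict):
        return
    if node.get("type") == "object" and "properties" in node:
        node["additionalProperties"] = False
        node["required"] = list(node["properties"])  # every property becomes required
        for child in node["properties"].values():
            _apply_strict(child)
    if "items" in node:
        _apply_strict(node["items"])


def to_strict_parameters(parameters):
    """Return a strict-mode copy of a JSON Schema object; the input is not modified."""
    strict = copy.deepcopy(parameters)
    _apply_strict(strict)
    return strict


generic = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "limit": {"type": "integer", "default": 10},
    },
    "required": ["query"],
}
strict = to_strict_parameters(generic)
```

Keeping the adapter separate from schema generation is what lets the stdlib output stay maximally expressive while each vendor's quirks live elsewhere.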

What I’m asking for:

Direction. Is this something the Typing Council wants? Is inspect.tool_schema(fn) the right home, or should the canonical version live in annotationlib (which already ships in 3.14)? Is @typing.tool worth the API surface, or is the function-only path enough? What’s the right path through the Typing Council vs Steering Council split?

I’m willing to PEP-author and drive implementation. The Pydantic-team buy-in via PEP 746 + annotated-types is the strongest political signal I’ve seen for any agent-adjacent stdlib proposal. The gap is genuine. The convergence is documented across eleven frameworks. If the direction is welcome, I’ll draft the PEP and seek a Typing Council sponsor.

If the direction is NOT welcome, happy to hear that too. Better to learn now than to spend the PEP cycle on something the Council doesn’t want.

Why I think this matters enough to take to PEP:

Python won the last era because the primitives were already in place when researchers needed them. numpy.array predated the deep-learning explosion. json was already in stdlib when REST took over. Runtime introspection of types is the primitive the agent era will be built on; every agent framework already proves it. The question is whether Python ships the primitive or watches frameworks ship eleven incompatible versions of it for the next decade.

The agent dev choosing their language right now is choosing between TypeScript (Zod is the answer), Go (single binary + simple types let agents write good code), and Python (pick one of eleven frameworks, hope you guess right). I’d like that third option to feel as obvious as the first two.

References:

PEP 593 (Annotated)
PEP 649 (Deferred annotations)
PEP 729 (Typing governance)
PEP 746 (Type-checking Annotated metadata)
PEP 749 (Implementing PEP 649)
Vercel AI SDK tool calling docs
OpenAI Agents SDK function tools docs
Pydantic AI tools docs
msgspec JSON Schema generation docs
annotated-types on PyPI
PyCon US 2026 Typing Summit recap by Bernat Gabor
Hynek Schlawack, “import attrs” blog post
Discourse thread on extracting PEP 593 annotations (t/42424)
Discourse t/106632 (the prior thread linked above)

jorenham · May 18, 2026, 6:52pm

Why not implement and publish it as a library yourself?

mvanhorn · May 18, 2026, 8:00pm

It does exist as a library, eleven times: Pydantic, msgspec, FastAPI, OpenAI Agents SDK, LangChain, LlamaIndex, Pydantic AI, Semantic Kernel, FastMCP, instructor, attrs/cattrs - each shipping a subtly-different “typed function → JSON schema.” A twelfth doesn’t fix the coordination problem, it adds to it.

The dataclasses-vs-attrs precedent is what I’m leaning on: framework libraries keep the heavy validation work, stdlib provides the canonical primitive every framework currently reimplements. PEP 749’s annotationlib already landed the runtime infrastructure in 3.14.

The part I’d most value your read on: are the framework variants close enough that a single primitive could canonicalize them, or different enough that they can’t be unified? That’s where I’m least sure.

Rosuav · May 18, 2026, 8:51pm

How do you think being in the standard library will avoid this?

mvanhorn · May 18, 2026, 9:28pm

lol, great xkcd use

Counter-examples: dataclasses, zoneinfo, and tomllib all faced the same risk and landed cleanly. attrs and pytz coexist happily; users naturally split between the simple-case stdlib and the advanced-case library.

What made those work was prior convergence on the shape. The 11 schema-from-types libraries already converge: decorator, .schema attribute, introspect type hints. Validation philosophies diverge; the primitive doesn’t.

Curious if you see a reason this ends like 927 rather than like dataclasses.

storchaka · May 18, 2026, 10:25pm

attrs and pytz, and many other libraries whose clones landed in the stdlib have several properties:

They were mature and stable for several years.
They were highly demanded by users and/or were needed for the stdlib itself.

Let’s wait 10 years and see which standard wins.

mvanhorn · May 19, 2026, 1:59am

Fair test, thanks @storchaka, @jorenham, and @Rosuav - the maturity argument lands. Going to park this one and watch how the agent-framework story matures. Have a few more feature ideas brewing, more soon. Thanks for the fast replies.

beauxq · May 20, 2026, 2:40pm

10 years seems like a lot to ask.

attrs wasn’t 10 years old when stdlib got dataclasses
pydantic is older now than attrs was when stdlib got dataclasses

I’ve already seen a lot of demand for a long time for what it seems Matt is asking for - mostly in the form of frustration that isinstance(["a"], list[int]) doesn’t work. So people have to pull out something heavyweight like pydantic: pydantic.TypeAdapter(list[int]).validate_python(["a"])
(or maintain their own implementation).
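To illustrate the frustration: isinstance() rejects parameterized generics outright, so the lightest stdlib-only alternative is a hand-written check. The helper below, is_list_of, is a hypothetical name and handles only hints shaped like list[X] where X is a plain class; it is a sketch of the "maintain their own implementation" route, not a general validator.

```python
from typing import get_args, get_origin


def is_list_of(value, hint):
    """Thin runtime check for hints shaped like list[X]; nothing else is supported."""
    if get_origin(hint) is not list:
        raise TypeError(f"unsupported hint: {hint!r}")
    (item_type,) = get_args(hint)  # X must itself be a plain class
    return isinstance(value, list) and all(isinstance(item, item_type) for item in value)


builtin_error = None
try:
    isinstance(["a"], list[int])
except TypeError as exc:
    builtin_error = exc  # isinstance() refuses parameterized generics

ok = is_list_of([1, 2, 3], list[int])
bad = is_list_of(["a"], list[int])
```

Extending this to nested generics, unions, and TypedDict is where the hand-rolled versions start to diverge from one another.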