Regex type hint

Abstract

We already have typing.Literal for specifying that a value must be one of a fixed set of constants. Sometimes, however, a value is expected to follow a certain pattern. This topic proposes a new type that indicates a string matching a given pattern is expected.

Motivation

The currently available types do not provide a way to specify that a string value must follow a certain pattern. str specifies that any string is accepted. typing.Literal specifies that the value must be one of a fixed set of constants. Sometimes, though, a string following a specific pattern is expected, and there is no way to indicate that in the function signature.

Evidence of this can be seen in pydantic.Field and the libraries built on top of it. Since there is no way to specify the string format in the type itself, pydantic.Field has to provide a regex parameter to validate the passed values.

Rationale

Some of the current types accept optional type parameters. For example, the list type hint represents a list holding values of any type, whereas list[int] expects a list of int objects.

In the same way, the str type hint would still accept any string, but str[r"^[A-Z]+$"] would expect a non-empty string containing only uppercase ASCII letters.
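To make the comparison concrete, here is a small sketch of how the two spellings behave at runtime today, assuming Python 3.9 or later. It only illustrates the current state and is not part of the proposal; the variable names are made up for the example.

```python
import types

# list implements __class_getitem__, so subscripting it yields a generic alias.
alias = list[int]
print(type(alias) is types.GenericAlias)

# str does not, so the proposed spelling currently fails at runtime.
try:
    str[r"^[A-Z]+$"]
    subscript_failed = False
except TypeError as exc:
    subscript_failed = True
    print(f"TypeError: {exc}")
```

Because annotations are normally evaluated when the function is defined, a signature using str[...] would hit the same TypeError unless annotation evaluation is deferred.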

Specification

The str class would accept a subscript containing the regular expression for the expected pattern:

def func(value: str[r"^[0-9]{11}$"]) -> bool: ...

In the example above, the value parameter would expect any string that matches the regular expression ^[0-9]{11}$.
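For comparison, something close to this can be approximated today with typing.Annotated, which attaches arbitrary metadata to a type. The sketch below is only an illustration under assumptions: Pattern, Digits11 and check_call are hypothetical names invented for this example, not part of the typing module or of the proposal, and static type checkers would still treat the parameter as a plain str.

```python
import re
from dataclasses import dataclass
from typing import Annotated, get_type_hints


@dataclass(frozen=True)
class Pattern:
    """Hypothetical metadata marker; not part of the typing module."""
    regex: str


# An 11-digit string, expressed with syntax that already exists.
Digits11 = Annotated[str, Pattern(r"^[0-9]{11}$")]


def func(value: Digits11) -> bool:
    return True


def check_call(function, **kwargs):
    """Hypothetical helper: match each keyword argument against its Pattern."""
    hints = get_type_hints(function, include_extras=True)
    for name, argument in kwargs.items():
        # Annotated aliases expose their metadata via __metadata__.
        for meta in getattr(hints.get(name), "__metadata__", ()):
            if isinstance(meta, Pattern) and re.fullmatch(meta.regex, argument) is None:
                raise ValueError(f"{name}={argument!r} does not match {meta.regex}")
    return function(**kwargs)


print(check_call(func, value="12345678901"))
```

The check here happens only at runtime, when check_call is used; re.fullmatch is chosen so that a trailing newline is not silently accepted.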

PS: I still intend to implement it to evaluate possible downsides.

This idea has already been proposed at Generic strings/regex patterns in strings · Issue #1202 · python/typing · GitHub. I’d recommend upvoting that proposal and adding your thoughts to that thread, as it’s probably best to keep the discussion there rather than starting a new thread here.

Writing str[...] in the type hint, as opposed to Literal etc., implies that the function should accept values that weren’t pre-determined and written literally into the code. However, the point of typing is to do checks ahead of time, which isn’t possible with data that’s only read at runtime. It’s hard for me to understand what the use case is here.
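One existing pattern that separates the runtime check from the static check is a typing.NewType wrapper produced only by a validating function. This is a sketch of that general technique, not something proposed in this thread; Digits11 and parse_digits11 are hypothetical names.

```python
import re
from typing import NewType, Optional

# Hypothetical distinct type for strings already known to match the pattern.
Digits11 = NewType("Digits11", str)

_DIGITS11 = re.compile(r"[0-9]{11}")


def parse_digits11(raw: str) -> Optional[Digits11]:
    """Validate runtime data once; return None if it does not match."""
    return Digits11(raw) if _DIGITS11.fullmatch(raw) else None


def func(value: Digits11) -> bool:
    # A static checker rejects plain str here; only parsed values pass.
    return True


parsed = parse_digits11("12345678901")
if parsed is not None:
    print(func(parsed))
```

With this approach the pattern is checked once at the boundary where the data is read, and the type checker only tracks whether a value has passed through the parser.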