Clarification for PEP 604: is `foo: int | None` to replace all use of `foo: Optional[int]`

pf_moore · May 21, 2023, 10:44pm

So in that case, I’d use int | None. I wasn’t trying to say that it never has a use.

Although I think you messed up your example, it doesn’t make sense to me. The type of _usage_stats is clearly collections.Counter, not the same as end. So I don’t follow what your example is trying to show, but I think I get what you wanted to say, hence my response above.

Rosuav · May 21, 2023, 10:59pm

My bad; I wasn’t too clear here. In this case, _usage_stats could be more narrowly typed as Counter[int | None, int] to reflect the fact that its keys will all be integers or None, and (as with counters in general) its values will all be integers. So the question is, should this be written as:

_usage_stats: Counter[Optional[int], int]
def g(stuff: Sequence, end: Optional[int] = None):

or as

_usage_stats: Counter[int | None, int]
def g(stuff: Sequence, end: Optional[int] = None):

or as

_usage_stats: Counter[int | None, int]
def g(stuff: Sequence, end: int | None = None):

? The first one uses Optional in a place where it doesn’t really make sense; the second uses a value unchanged, but now redefines its type; the third just avoids Optional altogether. I’m inclined to the third, since it’s consistent.

mwichmann · May 21, 2023, 11:01pm

Interesting. There seem to be cases when None isn’t great as a sentinel, which I thought the PEP did a somewhat reasonable job of describing… perhaps this is just me being naive because I’ve had to wrestle with some of those issues.

sirosen · May 22, 2023, 3:37am

Isn’t this a separate (and, IMHO, super interesting) topic?
If type annotations are published, then they become a way in which your package documents its APIs.

Documenting that f(int) is valid, that f() is valid, and that no other usages of f are valid is pretty tricky to do today by way of type annotations. None’s special status as everyone’s favorite sentinel value is only related in that it’s common – typing-wise it’s not very special. PEP 661 sentinels, enums, or any other style of sentinel have the same issue.

If I write

class _Missing(): pass
M = _Missing()
def f(x: int | _Missing = M): ...

then most reasonable readers of f’s signature and docs will understand that | _Missing is an artifact of my implementation.

But does it need to be this way? Perhaps typing could someday add a way to express “true” optionality. (And the hair raising naming debate for that construct can begin… )
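A runnable version of the sentinel pattern above, as a sketch only. The __repr__ and the string return values are additions for illustration and are not part of the original snippet.

```python
from __future__ import annotations


class _Missing:
    """Private sentinel type; callers are not expected to pass it explicitly."""

    def __repr__(self) -> str:
        return "<missing>"


M = _Missing()


def f(x: int | _Missing = M) -> str:
    # The annotation has to mention _Missing so that the default type-checks,
    # even though the intended public interface is only f() and f(int).
    if isinstance(x, _Missing):
        return "called as f()"
    return f"called as f({x})"
```

The signature a reader sees still contains | _Missing, which is exactly the implementation artifact being discussed.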

layday · May 22, 2023, 7:10am

Aside: AFAICT, Pyright accepts that because it specialises g - when assigning f to g, g takes on the type of f and the annotation is ignored. Might be worth opening an issue on their tracker about this, even if it’s just to verify that it’s intended behaviour?

hauntsaninja · May 22, 2023, 7:35am

Some quick notes:

Philosophy of whether default values should be part of documented interface reminds me of:
Should `None` defaults for optional arguments be discouraged? and Signatures, a call to action
It is true that there isn’t a particularly ergonomic way to type a function in two different ways, for its documented interface and for its internal use. Some options here include: using @overload to describe your interface, using two separate functions with two type signatures, using # type: ignore. Like Jelle says, I’d welcome more discussion on whether it’s worth finding better typing ergonomics here.
On Oscar’s comment about making it easier to type Callable with defaulted arguments: the PEP 677 extended syntax is what we had in mind here (PEP 677 – Callable Type Syntax, peps.python.org). However, PEP 677 was rejected.
On Alex’s comment about a potential deprecation of Union or Optional. I’d go further than “near future” — it would be very disruptive to deprecate and remove these symbols, there would be little benefit, and so I’d vote to not pursue this. However, I strongly recommend the new syntax, since I find it’s less likely to confuse new users and I find it more readable.
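As a rough sketch of the @overload option mentioned in the notes above: the overloads describe the documented interface (f() and f(int)), while the implementation signature is free to use None internally. The function name and the default of 42 are hypothetical.

```python
from __future__ import annotations

from typing import overload


@overload
def f() -> int: ...
@overload
def f(x: int) -> int: ...


def f(x: int | None = None) -> int:
    # Implementation signature: type checkers match calls against the
    # overloads above, so a call like f(None) should be reported as an error
    # even though it works at runtime.
    if x is None:
        x = 42
    return x + 1
```

This keeps None out of the published signature at the cost of writing the signature three times, which is the ergonomics problem being described.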

EpicWink · May 22, 2023, 11:19pm

There’s a disconnect between the parameter’s accepted typing and the function’s local variable type. Perhaps the following makes more sense:

import typing as t

def foo(bar: int = None):
    # bar is declared as int, so cast to int | None for the None check.
    if t.cast(int | None, bar) is None:
        bar = 42
    print(bar + 1)

NeilGirdhar · May 22, 2023, 11:21pm

Just want to add in case anyone’s interested that the Ruff tool (through pyupgrade) will convert Optional[X] to X | None for you.

Rosuav · May 22, 2023, 11:25pm

I thought that special-casing was being removed?

gpshead · May 22, 2023, 11:31pm

That brings the thread full circle to the original point that I read more as a question of “why do type checkers feel the need to remove that?”.

I have to agree that, although it is shorter than the : Optional[int] = None syntax (whose meaning is potentially ambiguous to humans), : int | None = None is annoying to humans from a repetition point of view, for = None specifically. In the case of other default values I wouldn’t suggest type checkers honor an idiomatic shorthand.

CAM-Gerlach · May 22, 2023, 11:34pm

Right, though unless the type checker is inferring the actual type of bar as int | None (in which case the cast would be redundant), it would either warn on the default value not matching the declared type of bar, and/or treat int as the local type and not warn if you don’t check for None before using the value as an int, which is a very common error that type checkers otherwise protect you from.
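For comparison, a sketch of the same function with the default spelled out in the annotation. With this form no cast is needed, and a type checker can narrow bar to int after the None check.

```python
from __future__ import annotations


def foo(bar: int | None = None) -> int:
    if bar is None:
        # From here on, bar is known to be an int.
        bar = 42
    return bar + 1
```

Dropping the if block here should make a type checker flag bar + 1, which is the protection referred to above.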

pf_moore · May 23, 2023, 8:22am

That’s disappointing. Hopefully it’s configurable to turn this off.

CAM-Gerlach · May 23, 2023, 9:39am

Technically, pyupgrade is “configurable” in the sense that you can pass the minimum Python version you want to support, and it won’t upgrade things beyond what is supported in that version. In this particular case, in addition to being enabled when --py310-plus is passed, like other typing rewrites it is also enabled when from __future__ import annotations is imported, unless --keep-runtime-typing is passed (which disables the other typing rewrites too in that case, unless the minimum version for them is passed).

However, there’s no way to disable individual fixers, and the maintainer (in his typical fashion) curtly dismissed any consideration of such and immediately locked the issue [1].

[1] And stated that anyone who disagrees should fork the project, though I’m not sure that’s a viable course of action given the legal threats, smear campaign and abuse from his fans that he directed against a group of FOSS maintainers, including at least one current Python core dev, who attempted to fork another one of his projects for similar reasons.

pf_moore · May 23, 2023, 10:02am

Luckily, it seems like ruff re-implemented the functionality rather than calling pyupgrade (at least as far as I can tell from the tracker item covering the functionality). So I guess the question is then whether ruff allows disabling of that rule. In general, ruff seems reasonably configurable, but I haven’t looked into it that much yet. I can simply not use pyupgrade, but ruff’s “do all the things” approach, while attractive (particularly in conjunction with its “so fast you didn’t know it ran” appeal), does mean that there’s a risk of having to accept stuff you don’t want, just to get stuff you do…

And while this seems off-topic, it’s a direct example of the fear I mentioned above:

Although I’m getting the impression that “int | None = None is preferred” could easily drift into “is recommended” and from there to “is the correct way”, and then to being enforced by linters…

CAM-Gerlach · May 23, 2023, 10:29am

Ah, yup, thanks—my comment was focused on actual Pyupgrade (which I’m familiar with), rather than Ruff’s re-implementation of it (which I’m not, beyond the fact that it is indeed a re-implementation following Ruff’s core design goals). I meant to include a disclaimer of that, but it seems I forgot to mention it, oops—my mistake; thanks again for the catch.

Yup, it does: every check is individually disablable via a standard interface. It’s basically like Flake8 in that regard, just with a lot more tools by default and re-written in Rust for speed.

I’ve been rather skeptical too, as I wasn’t sold on its monolithic approach, nor on the barrier of learning Rust to contribute checks for the sake of saving a handful of seconds running pre-commit/lint checks. However, the broad and growing check coverage, the fine-grained and standardized enable/disable, and its being run by a team of maintainers who aren’t among the most toxic people in the Python community are starting to warm me up to it. Though I’d really like to see at least baseline wemake support first. But I digress, sorry.

I guess I can see the argument (even if I personally find the reasoning of the PEP and others as to X | None being preferable to Optional[X] quite compelling), though I would view this as less the fault of the suggestion in the PEP and more the product of the arbitrary capriciousness of a certain maintainer widely known for such.

pf_moore · May 23, 2023, 10:38am

I’m not skeptical of ruff - quite the opposite, the idea of just having to deal with one linter is very attractive to me. It’s just that it does so much that it’s daunting to get started, and it’s quite hard to work out how to integrate it with tools like VS Code which has a bunch of lint-y stuff enabled by default, that I’d want to disable in favour of ruff. The usual case of not enough time to switch from “it’ll do” to “what I want”…

To an extent, yes. But there’s a community move towards “opinionated tools” (hello, black) of which this is really just an extreme example.

Anyway, we are rather off-topic now…

CAM-Gerlach · May 23, 2023, 11:17am

I see; thanks for elaborating on your perspective. Sorry if I got us more off-topic, but just to mention one thing:

Perhaps, but just to note in Black’s case being opinionated and minimally configurable is intrinsically tied to its core purpose and directly motivated by its particular use case (a formatter designed to avoid bikeshedding over code style), whereas in this case it is simply down to the personality of the particular maintainer involved (which, at least in my personal experience, is as extreme as it gets for a well-known maintainer in the Python community—at least, I really hope not too many others are taking a similarly out-there stance without good reason).

cben · May 23, 2023, 11:31am

I’m not sure deceptive vs. truth is the relevant point. f as written does accept None, but could be implemented otherwise. Definition of “public API” and “breaking change” vary by project, but Paul’s stance that “public == explicitly documented” is a valid one.

But you gave good examples for 2 major reasons APIs with a publicly known sentinel are friendlier than APIs requiring real omission:

Writing wrapper functions that merely pass on arguments. (It’s always possible to use *args, **kw to accurately pass through the distinction between any specific value vs. not passing any value at all, but it gets clumsy fast…)
Unions of value | sentinel are easier to put in collections than real omission.
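A small sketch of both points, with hypothetical function names (inner, wrapper, passthrough) invented for illustration:

```python
from __future__ import annotations


def inner(end: int | None = None) -> str:
    return "default" if end is None else f"end={end}"


def wrapper(end: int | None = None) -> str:
    # With a publicly known sentinel, a wrapper simply forwards the value.
    return inner(end)


def passthrough(*args, **kw) -> str:
    # Without one, "not passed" can only be forwarded by not passing it,
    # which means giving up an explicit signature.
    return inner(*args, **kw)


# value | sentinel fits in a collection; "argument omitted" has no value to store.
requests: list[int | None] = [3, None, 7]
results = [wrapper(end) for end in requests]
```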

h-vetinari · May 23, 2023, 10:49pm

I was hesitating to mention the following because I know the appetite for short-hand syntax in Python is extremely low. Though the discussion here kind of shows an impasse with both Optional[int] = None and int | None = None being suboptimal in some ways, and no perspective for how to get from the status quo to something like “There should be one – and preferably only one – obvious way to do it.”

I think missing values in particular are such a fundamental concept that they might deserve more explicit language support. This is seen in how painful missing-data support is in various low-level libraries [1], but also in how many other languages are striving to solve the missing data problem (e.g. Rust’s Option, C++ with std::optional, Scala, etc.)

Although this is but a facet of what comprehensive treatment of missing values on the language level might look like, my thought in this discussion was to give X | None its own syntax, e.g. X?. That way, the signature would look like:

def foo(bar: int? = None):

# instead of
def foo(bar: Optional[int] = None):
def foo(bar: int | None = None):

Obviously there would be other syntactic choices that could be made, but I like X? because it expresses the “maybe an X, maybe not (=missing)”, and so becomes less “magical” and verges more towards pseudo-code again (though that’s obviously subjective and I expect opinions to be sharply divided on this).

[1] e.g. numpy only supports a float np.nan, leading pandas to re-implement its own masking mechanism to deal with missing integers after years of user complaints that a single missing value somewhere will turn a column of integers to floats.

carljm · May 23, 2023, 11:02pm

This is PEP 645. Last it was discussed on typing-sig, the consensus was that the value it delivers was insufficient to justify new syntax (and occupying a currently unused sigil). So it was never submitted for SC consideration.