Allow zero arguments for subscript syntax

Nineteendo · May 11, 2024, 7:56am

Could we allow calling X.__getitem__() or X.__class_getitem__() for X[]?
This would allow a class to do something special in this case.

At least the current SyntaxError for X[] isn’t very clear:

>>> tuple[]
  File "<python-input-1>", line 1
    tuple[]
          ^
SyntaxError: invalid syntax

Previous discussion: Default argument for `__getitem__()` · Issue #118937 · python/cpython · GitHub

Nineteendo · May 11, 2024, 7:59am

Copying the response from Jelle here:

That would be a change to the grammar, so definitely not a minor feature that we can accept without prior discussion. In fact, it would probably need a PEP.

Your link shows 30k uses of [()]. That’s a lot, but in Implicit tuple return type - #61 by Nineteendo you found more than 30 million uses of x[a, b]—indicating that the x[()] syntax is not especially common. I glanced through the first few results; some are uses of the empty tuple type tuple[()], others are in various array or tensor libraries. I can’t speak for the latter, but from a typing perspective I think tuple[()] is a clearer spelling for the empty tuple type than tuple[].

Rosuav · May 11, 2024, 8:48am

Why should this one protocol get a default argument? And if it gets a default argument, why this default? Why not zero, or None, or any other plausible value?

And as to improving the error message: Remember that tuple is not a keyword. Whatever change you propose has to make just as much sense for all of these, which are equivalent as far as the parser’s concerned:

tuple[]
int[]
[][]
print("Hello, world")[]
42[]
stuff[]

I understand that a basic “invalid syntax” isn’t very helpful, but it’s also not wrong, and any proposed change has to be careful to avoid being incorrect.
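As a minimal sketch of the point above, the snippets can be handed to the built-in compile() without running them. The helper name is_syntax_error is made up for this illustration. Every one of them is rejected before any name lookup or subscripting could happen, which is why the parser cannot tailor its message to tuple specifically:

```python
# Each snippet from the list above, compiled but never executed.
SNIPPETS = [
    "tuple[]",
    "int[]",
    "[][]",
    'print("Hello, world")[]',
    "42[]",
    "stuff[]",
]


def is_syntax_error(source: str) -> bool:
    """Return True if `source` is rejected at compile time (hypothetical helper)."""
    try:
        compile(source, "<snippet>", "eval")
    except SyntaxError:
        return True
    return False


results = {source: is_syntax_error(source) for source in SNIPPETS}
for source, rejected in results.items():
    print(f"{source!r}: rejected by the parser = {rejected}")
```

Note that stuff[] fails here even though stuff is never defined: the NameError would only show up at runtime, and the code never gets that far.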

pf_moore · May 11, 2024, 10:28am

You didn’t include a “do nothing” option, which is what I would have voted for. You’re starting from an assumption that there is a problem worth solving here, but I don’t think that has been established yet.

Yes, improved error messages are always good, but there’s nothing about this situation that is any more deserving of an improved message than many other places - and, as @Rosuav pointed out, improving error messages is often a lot harder than it seems at first glance.

My vote would be “do nothing, and assume that any improvement in the error message will happen in due course as part of the wider ongoing work to improve error reporting”.

Nineteendo · May 11, 2024, 10:57am

It’s not documented that tuple[int, int] is actually tuple[(int, int)], only that it accepts any number of type arguments. So it’s confusing why tuple[] isn’t allowed and you need to use tuple[()].
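A small sketch of what the subscript actually passes along may help here. KeyProbe is a hypothetical class written only for this illustration; it returns whatever key it receives, so the tuple packing becomes visible:

```python
class KeyProbe:
    """Hypothetical helper: returns whatever key the subscript passes in."""

    def __getitem__(self, key):
        return key


probe = KeyProbe()

one_arg = probe[int]                # a single item is passed as-is
two_args = probe[int, int]          # comma-separated items arrive as one tuple
explicit_tuple = probe[(int, int)]  # same key as the line above
empty_tuple = probe[()]             # the only current spelling for "no items"

print(one_arg, two_args, explicit_tuple, empty_tuple)
```

The same packing applies to __class_getitem__, so tuple[int, int] and tuple[(int, int)] hand the same key to tuple.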

tuple[()] is 30x as common as tuple[None] and is the more natural thing to replace.

It could be SyntaxError: missing argument, but you can get better error messages at runtime:

tuple[]  # OK
int[]  # TypeError: type 'int' is not subscriptable
[][]  # TypeError: missing argument
print("Hello, world")[]  # TypeError: 'NoneType' object is not subscriptable
42[]  # TypeError: 'int' object is not subscriptable
stuff[]  # NameError: name 'stuff' is not defined

I replaced the poll (editing it is not possible).

Rosuav · May 11, 2024, 11:08am

But what if it isn’t tuple that’s being subscripted? What if it’s a list or dictionary - how often do you subscript those with empty tuples? Lists are frequently subscripted with zero. Dictionaries might be subscripted with pretty much any hashable value, but strings and numbers are certainly fairly common. The empty tuple is an extremely uncommon subscript, in the scheme of things. It would make a very odd choice for the default.

I agree with Paul that there is no proof yet that there is a problem to be solved.

mikeshardmind · May 11, 2024, 11:13am

There’s another notion of default that this conflicts with as well. Unsubscripted types are generally interpreted in their most permissive form (though with TypeVar defaults having been recently accepted, user-defined generics can have behavior like this beyond what is special-cased for the builtin collections). For tuple, that’s tuple[Any, ...]; for list, that’s list[Any]. I’d rather leave tuple[] invalid. There are enough varying ways to spell things as it is, and when something that could be ambiguous isn’t currently allowed, there’s no good reason to introduce new places for people to get things mixed up.

Nineteendo · May 11, 2024, 11:36am

I haven’t said anything about adding a default there, and I don’t know if anything other than this makes sense:

User defined classes can define their own behaviour if desired.

Let’s wait for the result of the poll. 2 votes tells me nothing.

It’s not possible to figure out how to type hint the empty tuple without reading the docs if you have only seen tuples with 1 or more arguments:

typing module: empty tuple syntax is undocumented · Issue #81995 · python/cpython · GitHub
Is there a way to represent an empty tuple type? · Issue #4211 · python/mypy · GitHub

pf_moore · May 11, 2024, 12:14pm

You replaced it with something that still doesn’t allow me to express my view as stated above. Don’t bother trying to change it again, though - I’ve said all I wanted, I simply won’t vote (but please remember when interpreting the vote results to allow for people like me who didn’t vote to express a “none of the above” sentiment^[1]…)

Any number of votes in a poll that doesn’t ask “is there a problem to be solved” won’t answer that question in any case.

[1] and no, if you change the vote again, to add that option, I still won’t vote for it - you’re ignoring my main point which is that there isn’t a problem to solve here, so no vote is needed

layday · May 11, 2024, 12:27pm

There’s little benefit to gauging uninformed opinion. A poll should come after a healthy discussion, once the pros and cons of your proposal are analysed and understood.

plannigan · May 11, 2024, 12:32pm

Being a bit pedantic, but “No preference” is different from “Do nothing”. “No preference” implies “I’m fine with either”, which is not what Paul’s message was suggesting. ^[1]

As for my stance, I’m not sold yet on tuple[]. It might make it fewer characters to say “I want an empty tuple”. But I think some people who see it for the first time might confuse it for “I want a tuple, but I’m not specifying the types of the values it contains”.

I’m also hesitant to make a specific type from the stdlib have a meaning for nothing supplied for subscript, when everything else still requires it. Is there a plan for all of the other collections? What about all of the other generic types?

Also, I looked at the first page of results for the code search. 9 of the 20 results on that page were not about type annotations ^[2].

[1] Apparently Paul types faster than I do. Leaving it here anyways.
[2] Apparently indexing into some kind of collection with an empty tuple is a somewhat common pattern for scientific libraries.

Nineteendo · May 11, 2024, 12:35pm

Sorry for being nitpicky (I have autism), but:

“Do nothing” is NOT an answer to “What notation do you prefer for the empty tuple?”.
And “None of the above” would mean that you want yet another notation for the empty tuple.

Should I have asked a different question? Maybe, but you probably won’t vote for that either.

you’re ignoring my main point which is that there isn’t a problem to solve here, so no vote is needed

I would like to disagree here: tuple[()] is confusing if you have only seen tuple[int], tuple[int, int], …

That’s a valid argument, I have closed the poll for the time being.

mikeshardmind · May 11, 2024, 12:44pm

All suggestions to change something have to first overcome two questions:

Is the change reasonable to implement?
Does the change have any downsides, and if so, do they outweigh the benefits?

You’re asking for a grammar change so that a single builtin needs 2 fewer characters when typing one specific instance of it (the type representing the empty tuple). That’s a pretty large change, and it has the side effect that every place where [] wasn’t valid syntax before might now have to handle it.

Starting a poll with “which would you prefer” without first addressing the bare minimum for a change, and leaving this out of the poll, presumes that a change would be possible or reasonable without exploring it. There are many reasons why this one really is not a reasonable change. If you’re a stickler for the answer directly matching the question you ask, then you’re likely going to find all manner of discussions frustrating, because sometimes the best answer people can give is “you skipped a few steps, and we can rule out this change for larger impact”.

Rosuav · May 11, 2024, 12:52pm

Perhaps this is an indication that a poll of this nature isn’t really asking the right question, nor is a poll the right way to figure out what matters.

Nineteendo · May 11, 2024, 1:29pm

This is not about typing fewer characters; that’s just a side effect.

In my humble opinion, tuple[] makes more sense if tuple is documented to accept ANY number of arguments, but it currently doesn’t accept zero.

On top of that we needed to clarify in the docs that you need to use an empty tuple in that case, because it wasn’t intuitive.

It could make sense for Union too, but that’s not very useful as Never already exists. I don’t know about other generics.

I have narrowed the regex to tuple[()] as that’s what this discussion is mostly about.

I would say so, it’s a natural extension of tuple[int] & tuple[int, int], …

This might be confusing for people who are extremely familiar with the current syntax
This doesn’t detect errors at compile time

I think you would quickly get used to it, and you can ignore the fact that you’re not actually passing multiple arguments. In return you get far more helpful error messages at runtime. After all it’s not the second bracket that’s the problem, but BOTH brackets for non subscriptable objects:

Nice Zombies:

tuple[]  # OK
int[]  # TypeError: type 'int' is not subscriptable
[][]  # TypeError: missing argument
print("Hello, world")[]  # TypeError: 'NoneType' object is not subscriptable
42[]  # TypeError: 'int' object is not subscriptable
stuff[]  # NameError: name 'stuff' is not defined

Unless the function already had a default defined, which seems pretty weird to me, you would just get a TypeError like calling a function without providing all required arguments.

Why? Could you elaborate?

I actually find most discussions on this forum frustrating, because the opinion of people seems to be set in stone and there’s nothing I can do to change it.

Note: I have no problem with my idea being rejected, but I would like a compelling reason.
I have given up on suggesting things that could just be put on PyPI.

Is there anything you would approve if I proposed it? Given the maturity of Python, the obvious things have already been implemented or were rejected.

chepner · May 11, 2024, 1:38pm

I’d start with identifying how many of those 5200 files are using tuple[()] where they could just use None instead. The only distinction I can think of between the two is that the empty tuple is iterable, while None is not, but just a quick glance at the results shows at least a few that simply treat the empty tuple as a falsy value without trying to iterate.
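For reference, the runtime distinction mentioned above can be sketched as follows. This is only about the values () and None, not about the type expression tuple[()], and the variable names are made up for the illustration:

```python
empty = ()
nothing = None

# Both values are falsy, so a plain truthiness check cannot tell them apart.
both_falsy = (not empty) and (not nothing)

# Only the empty tuple can be iterated; it simply yields no items.
items_from_empty = list(empty)

try:
    iter(nothing)
except TypeError:
    none_is_iterable = False
else:
    none_is_iterable = True

print(both_falsy, items_from_empty, none_is_iterable)
```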

mikeshardmind · May 11, 2024, 1:57pm

Already done so above, and while I don’t really care to repeat myself, on this occasion I’ll elaborate.

I don’t care who proposes an idea. I care that the idea solves a problem with a reasonable scope for that problem, has a good path to being supported that doesn’t conflict with other important things, and that changes are generally positive for the language. Some of this is more subjective than the rest. If you had proposed an idea that fit those subjective criteria, you’d have my support on the idea and no further.

On those specific metrics, this idea fails for me.

I don’t think the idea solves a real problem. Of your links above showing people running into this, one was from a user who, once they had an answer, said thanks and was personally done. They didn’t know, but they were fine once they did. This resulted in follow-up, including directly documenting this. The other link was to an issue that was resolved by adding docs for this, so if the problem is that you don’t think people can discover this, your own links undermine that.
It’s a grammar change, which makes it very disruptive and large in scope.
It comes with conflicting design goals: it creates further places where the type system is less obvious and less internally consistent, by creating an uglier situation where a single generic type is special-cased such that tuple[] is valid, and where it is also not equivalent to just tuple as a type.

plannigan · May 11, 2024, 2:28pm

This sounds to me like better documentation would be an alternate solution to help alleviate the situation you are describing. I’d prefer that solution as it is something that could be contributed today instead of a syntax change that wouldn’t become available to anyone until October 2025 ^[1].

I think it is important to note that the narrowed regex decreased the number of matches from ~30k to ~5k, which is a significant drop. There are people who follow these discussions via a mailing list, so editing previous posts makes it harder for those users to follow along. Though I do acknowledge that ~5k is significantly higher than I would have expected, given that I had never seen it before. I’m still not sure why it is being used, but clearly it is being used.

[1] and even longer for code that has to support older Python versions

pf_moore · May 11, 2024, 2:39pm

Yes, you are asking the wrong question.

But in addition, as @layday pointed out (and I was trying to say, sorry if I wasn’t clear enough) any poll is premature at this stage, as it’s not clear what the right question is - or even if there is a question to answer at all.

I don’t find tuple[()] confusing in the context of Python’s runtime semantics - it’s passing the empty tuple to the __class_getitem__ method of the tuple class. Note that the defined signature for that method is __class_getitem__(cls, key). How a subscription expression is translated into the “key” argument is also defined in the language reference.

In particular, note that subscription is defined as

subscription ::= primary "[" expression_list "]"

where “expression_list” is a non-empty list of expressions, separated by commas.

Thus, tuple[] is, according to the language grammar, invalid syntax, and tuple[()] is an expression which returns the value of tuple.__class_getitem__(()). Exactly as you have discovered.
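The equivalence described above can be checked directly. This is a minimal sketch, assuming Python 3.9 or later (where tuple supports subscription at runtime); the variable names are made up for the illustration:

```python
import types
import typing

# The subscription expression and the explicit method call build the same alias.
via_subscript = tuple[()]
via_method = tuple.__class_getitem__(())

same_alias = via_subscript == via_method
origin = typing.get_origin(via_subscript)
is_alias = isinstance(via_subscript, types.GenericAlias)

print(via_subscript, same_alias, origin, is_alias)
```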

You cannot just propose to change the meaning of tuple[], as that’s inextricably linked with all of the above language features. You would need to extend the definition of subscription to allow an empty subscript, define what the key argument to __class_getitem__ (and __getitem__) would be when an empty subscript is given, and define (and implement!!!) the effect of an empty subscript for every built in and stdlib type that is subscriptable^[1].

Realistically, without massive disruption, the only possible meaning for an empty subscript would be as an alias for an existing value, in effect changing the signature of __class_getitem__ and __getitem__ to (self, key=default_value). And then we need to decide what value should be the default. Remember, it has to be sensible for all subscriptable types, not just tuple. Why () - why not None, which is the usual “missing value” default? Why not have sequence[] mean the same as sequence[:]? There’s no clear answer that’s right for all use cases, and as the Zen says, “in the face of ambiguity, refuse the temptation to guess”.
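To make the ambiguity concrete, here is a sketch of one of the alternatives mentioned above, where a missing key means the same as [:]. Grid is a hypothetical class, and the choice of default is an assumption made purely for illustration. Since grid[] is not valid syntax, the default can only be reached today by calling the method directly:

```python
class Grid:
    """Hypothetical container; not a real library type."""

    def __init__(self, cells):
        self._cells = list(cells)

    def __getitem__(self, key=None):
        # Assumption for illustration: a missing key means "everything",
        # i.e. the same as grid[:]. Another class could just as plausibly
        # pick (), 0, or something else entirely.
        if key is None:
            key = slice(None)
        return self._cells[key]


grid = Grid([1, 2, 3])

whole_via_slice = grid[:]
whole_via_default = grid.__getitem__()  # roughly what a hypothetical grid[] might call
first = grid[0]

print(whole_via_slice, whole_via_default, first)
```

A tuple-like type would want () as the default and a sequence might want a full slice, which is the "no clear answer for all use cases" problem in miniature.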

The amount of work needed to make this change is far more than you seem to think, and would have far-reaching consequences across the Python language ecosystem^[2]. IMO it simply isn’t worth it, given the extremely limited benefits that would be gained.

[1] And given that your issue with the current behaviour is that the error message is bad, I assume you wouldn’t be happy with a generic ValueError exception at runtime replacing the SyntaxError you currently get…
[2] libraries like numpy make extensive use of complex indexing expressions - this sort of change could hit them hard

Nineteendo · May 11, 2024, 3:23pm

That’s not my problem, my problem is that they NEED to look this up because it isn’t obvious:

Complete this sequence: tuple[int, int], tuple[int].
- Eh, tuple[]?
That would be a syntax error, you obviously need tuple[()].
- Why do I suddenly need to provide a tuple?
You were already providing a tuple: tuple[(int, int)]. And tuple[int] is treated specially.
- Now I’m just confused.

As described above, tuple[()] is already not obvious.

I think the documentation explains this as well as possible without confusing the reader. After all, subscripts only have a single argument. It just so happens that a tuple can be written without parentheses, making it look like a function call.

Even if we explained it, it doesn’t solve the problem of getting a syntax error when you try to type an empty tuple for the first time. I only solved this by checking the type with my type checker; otherwise I would’ve needed to look this up.

and even longer for code that has to support older Python versions

I already use from __future__ import annotations …

There are some people (including myself), who type everything: GitHub - nineteendo/pyvz2 at alpha

I agree, but we make it look like a function call by omitting parentheses and putting a single value in a tuple. But this breaks for zero arguments.

Wait, what? I thought we were just passing a tuple? Why does it duplicate the code for tuple construction?

And given that your issue with the current behaviour is that the error message is bad, I assume you wouldn’t be happy with a generic ValueError exception at runtime replacing the SyntaxError you currently get…

Can’t we just call the function with no arguments which gives a TypeError? Or am I missing something obvious?

class Foo:
    def __class_getitem__(cls):
        pass

# TypeError: Foo.__class_getitem__() takes 1 positional argument but 2 were given
Foo[int]
# TypeError: tuple.__class_getitem__() takes exactly one argument (0 given)
tuple.__class_getitem__()