Removing chained comparisons

dg-pb · November 17, 2024, 6:14pm

Introduction

The way things are (and have been for ages):

“Also unlike C, expressions like a < b < c have the interpretation that is conventional in mathematics”

Chained comparisons also include is, is not, in, not in.
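
As a minimal sketch of what that interpretation means in practice: a chain such as `a < b < c` behaves like `a < b and b < c`, except that `b` is evaluated only once and evaluation stops at the first false link. The `middle` helper and the `calls` list are hypothetical names used only to make the evaluation order visible.

```python
calls = []

def middle():
    # Record every evaluation of the middle operand.
    calls.append("middle")
    return 5

# Chained form: the middle operand is evaluated once.
chained = 1 < middle() < 10
chained_calls = list(calls)

# Explicit `and` form: the middle operand is evaluated twice.
calls.clear()
explicit = 1 < middle() and middle() < 10
explicit_calls = list(calls)

# The chain short-circuits: `10 < 1` is false, so middle() is never called.
calls.clear()
short_circuit = 10 < 1 < middle()
short_calls = list(calls)

# Identity and membership operators take part in chains as well.
item, row, grid = 1, [1, 2], [[1, 2], [3]]
membership_chain = item in row in grid   # (item in row) and (row in grid)
x = y = object()
identity_chain = x is y is not None      # (x is y) and (y is not None)
```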

There is an issue and a PR for dealing with in, not in:

Issue 32055: Reconsider comparison chaining for containment tests - Python tracker
bpo-32055: Raise SyntaxWarning for chained `in` and `not in`. by serhiy-storchaka · Pull Request #4501 · python/cpython · GitHub
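
A short sketch of the kind of surprise that mixing containment tests into a chain can produce. The variable names are hypothetical; the point is only that `in` and `==` are chained like any other pair of comparison operators.

```python
haystack = "abc"

# Read as a chain: ("a" in haystack) and (haystack == True)
surprising = "a" in haystack == True

# Parentheses turn it back into two separate operations.
intended = ("a" in haystack) == True
```

The first expression is false because `"abc" == True` is false, even though `"a"` is in the string.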

However, I think it might be worthwhile considering going further than that in making all operators uniform in design.

I don’t think the convenience it brings justifies the complexity it introduces. Python is a general-purpose programming language, and although decisions can be based on mathematical concepts where the cost is low, this case, IMO, could be a bit too costly.

2 main reasons:

1. Difficulties overloading comparison operators for other uses

E.g.:

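class pipe:
    def __init__(self, funcs=()):
        self.funcs = list(funcs)
    def __rshift__(self, func):
        # `pipe >> func` returns a new pipe with func appended
        return pipe(self.funcs + [func])
    def __call__(self, val):
        # Apply each function in order
        for f in self.funcs:
            val = f(val)
        return val

# One can implement (func and func2 are any callables)
pipeline = pipe() >> func >> func2
# But not, because this is read as (pipe() > func) and (func > func2)
pipeline = pipe() > func > func2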
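
To make the difference concrete, here is a hedged sketch. `Pipe`, `add_one` and `double` are hypothetical names, and `Pipe` is a variant of the `pipe` class above that overloads both `>>` and `>`. The `>>` form composes as expected, while the unparenthesised `>` form is treated as a comparison chain and ends up comparing the two plain functions with each other.

```python
class Pipe:
    """Hypothetical pipeline class that overloads both `>>` and `>`."""

    def __init__(self, funcs=()):
        self.funcs = list(funcs)

    def __rshift__(self, func):
        # Return a new pipeline with `func` appended.
        return Pipe(self.funcs + [func])

    # Same behaviour, spelled with `>`.
    __gt__ = __rshift__

    def __call__(self, val):
        for f in self.funcs:
            val = f(val)
        return val


def add_one(x):
    return x + 1


def double(x):
    return x * 2


# `>>` is an ordinary left-associative binary operator.
shifted_result = (Pipe() >> add_one >> double)(3)

# `Pipe() > add_one > double` means
# `(Pipe() > add_one) and (add_one > double)`,
# so the second link compares two functions and raises TypeError.
try:
    Pipe() > add_one > double
except TypeError as exc:
    chained_error = str(exc)
else:
    chained_error = None

# Explicit parentheses restore the binary-operator reading.
parenthesised_result = ((Pipe() > add_one) > double)(3)
```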

2. Difficulties in working with ASTs

From my experience, this becomes the main mental bottleneck when writing frameworks that aim to make use of the complete set of features.

Everything else is simple, straightforward and essential, and this is the only part that I ended up scratching my head about.
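
For readers who have not looked at this part of the AST, a small sketch using the standard `ast` module shows the asymmetry: arithmetic operators nest as binary `BinOp` nodes, whereas a comparison chain is a single `Compare` node carrying parallel lists of operators and operands. The variable names are illustrative only.

```python
import ast

chain = ast.parse("a < b < c", mode="eval").body
arith = ast.parse("a + b + c", mode="eval").body

# One flat Compare node: left operand plus parallel `ops` / `comparators` lists.
chain_kind = type(chain).__name__                      # "Compare"
chain_left = chain.left.id                             # "a"
chain_ops = [type(op).__name__ for op in chain.ops]    # ["Lt", "Lt"]
chain_rest = [node.id for node in chain.comparators]   # ["b", "c"]

# Nested binary nodes: ((a + b) + c).
arith_kind = type(arith).__name__                      # "BinOp"
arith_left_kind = type(arith.left).__name__            # "BinOp"
arith_right = arith.right.id                           # "c"
```

Any tool that rewrites or evaluates expressions therefore needs a separate code path for `Compare`, instead of treating it as one more binary operator.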

Final thoughts

I appreciate that this would be a big breaking change and that there is little to no chance of it happening in a minor version increment.

However, in case this is deemed to be “it would be better if it was done in the beginning”, would it be worth noting this for Python 4?

Is there a tag on GitHub, or a similar list, for such things?

Even if this is deemed to never happen, it would be useful information to know.

Rosuav · November 17, 2024, 6:22pm

This is a very good feature, not something to be removed. Being able to check if 1 < x < 10 is expressive and clear.

So, it’s harder to abuse comparison operators for things that aren’t comparisons. Can’t really see that as a major use-case.

You have to think about the entire set of comparisons together. They’re not separate operations, they are a single set. Yes, it’s something to get your head around, but it’s not THAT hard.

No.

Nineteendo · November 17, 2024, 6:27pm

There’s no chance of this happening, period. It’s such a pain when other languages don’t support it.

Adding a new magic method for the desired behaviour would at least be open to discussion.

hugovk · November 17, 2024, 7:46pm

It would be a huge breaking change: searching for `if .* <=? .* <=?.*:` found 14,050 matching lines in 1,808 projects.

As @Rosuav mentioned, it’s an expressive and clear feature. Some of those real-life uses:

if any((0xD800 <= ord(ch) <= 0xDFFF) for ch in name):
if 0 < days <= 60 and epoch == WINDOWS_EPOCH:
if (high > low) and not (low <= pos <= high):
if 1 <= new_days <= 27:
elif INT64_MIN <= value < INT64_MAX:
if key == 'yaw' and -180 <= min_val <= max_val <= 180:
if not 0 <= part < 256:
if self.input.LA(1) == 36 or (65 <= self.input.LA(1) <= 90) or self.input.LA(1) == 95 or (97 <= self.input.LA(1) <= 122):
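
A line-based pattern like the one above can also match lines such as `if a < b and c < d:`, which contain two separate comparisons. As a rough, syntax-aware alternative, chains can be found by looking for `ast.Compare` nodes with more than one operator. This is only a sketch: `find_chained_comparisons` and `sample` are hypothetical names, and it is not the search that produced the numbers quoted above.

```python
import ast


def find_chained_comparisons(source):
    """Return sorted (line number, source text) pairs for comparisons with 2+ operators."""
    tree = ast.parse(source)
    return sorted(
        (node.lineno, ast.get_source_segment(source, node))
        for node in ast.walk(tree)
        if isinstance(node, ast.Compare) and len(node.ops) > 1
    )


sample = '''\
if 1 <= new_days <= 27:
    pass
if a < b and c < d:
    pass
if key == 'yaw' and -180 <= min_val <= max_val <= 180:
    pass
'''

found = find_chained_comparisons(sample)
found_lines = [lineno for lineno, _ in found]   # line 3 is not a chain
```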

No to both: Python 4 is unlikely to happen, and if it does happen, will be more like 1-to-2 than 2-to-3. So no big breaking changes.

tstefan · November 17, 2024, 7:54pm

The C++ standard committee discussed a proposal to introduce this as well; however, it’s likely that it will not be implemented because of backwards compatibility. The current behavior in C and C++ is considered to be broken, so clang-tidy has a linter to find such cases. Python does the right thing out of the box and you want to change that?

JamesParrott · November 17, 2024, 8:20pm

If you don’t like the feature, that’s fine. Don’t use it.

Why the heck do you want to make a huge breaking change to prevent the entire Python community from also using a well-established feature, whether they like it or not? And have the core devs do the work to enact the changes, maintain them, and take all the flak that comes from making breaking changes?

pf_moore · November 17, 2024, 8:30pm

Precisely. No, just no.

Kwpolska · November 17, 2024, 8:40pm

Difficulties overloading comparison operators for other uses

This is a feature. Overloading operators with custom semantics tends to produce hard-to-understand code or surprising cases. The % operator abused for string formatting will break in "%s" % a if a is a 2-element tuple, but the same case would work just fine with "{}".format(a).
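
The tuple case can be reproduced in a few lines. This is a sketch with a hypothetical `pair` variable; the `%` operator treats a tuple on the right-hand side as the full argument list, so a 2-element tuple supplies two arguments to a single `%s`.

```python
pair = (1, 2)

# "%s" expects one argument, but the tuple is unpacked into two.
try:
    "%s" % pair
except TypeError as exc:
    percent_error = str(exc)
else:
    percent_error = None

# Wrapping the value in a 1-tuple works around it.
wrapped = "%s" % (pair,)

# str.format has no such special case.
formatted = "{}".format(pair)
```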

Difficulties in working with ASTs

The primary customers that a programming language should care the most about are the users who write programs to solve real-life problems. Python should not be crippled just to make the lives of compiler/language tool developers easier.

dg-pb · November 17, 2024, 9:23pm

Fully aware of it, thus my carefulness here.

With this I don’t argue.

I did not know about the C/C++ situation; this is very useful to know.

My intent was more along the lines of getting some perspectives and having a conversation about it.

Exactly! Well, at least this is the exact question which is at the root of this.

I think “crippled” is a bit too strong word to use here.
I would say “less convenient mental model to the average end user”.

Specifically, a “mental model which is based in programming and not mathematics”.
It is more consistent, where all operators strictly adhere to infix/postfix/prefix notation.

So it subtracts in one place, but adds to another.

This is somewhere at the extreme of “Special cases aren’t special enough to break the rules.”
I am just contemplating whether this extreme should deserve a bit more weight in the age of uncontrollable increase of information.

To go back to the question “Who should Python serve? Tool developers or people who write programs to solve real-life problems?”

I don’t have an answer, but I have another question.

“How long until none of the people who write programs to solve real-life problems use Python, but only tools that are made from Python and other similar programming languages?”

da-woods · November 17, 2024, 9:54pm

Not sure I’d agree with that. It’s more that the current behaviour is often not what people intend. But it’s logically consistent with the language rules.

Personally, I think this is somewhere where there’s no reason to change either C/C++ or Python (even if they do different things). There’s enough existing code relying on them all that it’s not worth the effort, whichever set of rules you prefer.

tstefan · November 17, 2024, 10:29pm

You cite this as if it was my opinion. The checker in clang-tidy exists because someone thinks that the current behavior is broken, i.e., code like a<b<c is usually wrong.

There’s enough existing code relying on them […]

Concerning C and C++, your claim is not backed up with data. The authors of the proposal wrote that they found 0 instances where comparison chaining was used correctly.

dg-pb · November 17, 2024, 10:47pm

It says: “I consider this ‘code that deserves to be broken;’”

Not: “current behaviour is broken”.

tstefan · November 17, 2024, 11:03pm

What is “it” in “it says”?

dg-pb · November 17, 2024, 11:15pm

Article that you shared: Chaining Comparisons

DavidCEllis · November 17, 2024, 11:29pm

With the pain that came with the migration from Python 2 to Python 3, I think that many in the community would hope that, if there ever is a Python 4 [1], the biggest breaking change would be for code that assumed that the major version number would always be 3.

[1] I know there are many who would prefer this never be the case; personally, I feel like we should probably be on about Python 6 by now.

tstefan · November 17, 2024, 11:33pm

Article that you shared: Chaining Comparisons

I commented on the existence of the linter module. I did not cite the article, so the following accusation that I miscited the article is baseless:

It says: “I consider this ‘code that deserves to be broken;’”

Not: “current behaviour is broken”.

dg-pb · November 17, 2024, 11:39pm

Could you point me to the data which confirms that “someone thinks” that? Who is that someone? And where does that someone say that “current behavior is broken”?

And finally, can you give some evidence that the linter module was created specifically because of that?

It is not an accusation, just an observation.
It seems that the information that you shared does not align with what you are making out of it.

There is nothing wrong with being wrong from time to time.

tstefan · November 18, 2024, 12:11am

Again, you quote me extremely selectively, just in an attempt to make it look like I was wrong. In full, I wrote:

By correctly citing the full sentence, it would be clear in which sense “the current behavior is broken” is to be understood. Broken is not to be understood as a claim that the behavior would not be logically consistent with the language rules (as da-woods suggested). It means that the current language rules are non-ergonomic, so the language delivers something the programmers did not expect; in short, it is bug-prone. The part “i.e., code like a<b<c is usually wrong” makes this crystal clear. The term “i.e.” here means “this means”.

That code like a<b<c is usually wrong in C/C++ is backed by the C++ proposal (which you called an article) that I have referenced: the authors have concluded that it was always wrong in the codebases they have examined. The reason why the linter module “bugprone-chained-comparison” exists is written in its documentation, which I have linked before.

MRAB · November 18, 2024, 12:17am

In Chaining Comparisons (the proposal that was linked to), the author says:

"""

int x = 1, y = 3, z = 2;
assert (x < y < z); // today, means “if (true < 2)” – succeeds

In this proposal, the meaning of the condition would be if ((1 < 3) && (3 < 2)) and the assertion will fire. To use Stroustrup’s term, I consider this “code that deserves to be broken;” the change in meaning is probably fixing a bug. (Unless of course we do a code search and find examples that are actually intended.)

Non-chained uses such as (a<b == c<d) keep their existing meaning.
"""

So the proposal is for chained comparisons under certain conditions.
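
The two readings of the quoted example can be reproduced in Python. This is a sketch that uses explicit parentheses as a stand-in for the left-to-right grouping that C and C++ apply today; the variable names follow the quoted snippet.

```python
x, y, z = 1, 3, 2

# C/C++ reading today: (x < y) is evaluated first, then its result is compared with z.
# In Python, True compares as 1, so this is 1 < 2.
c_style = (x < y) < z

# Chained reading (Python today, and the proposal): (x < y) and (y < z).
chained = x < y < z
```

Here `c_style` is true while `chained` is false, which is the discrepancy the quoted assertion relies on.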

dg-pb · November 18, 2024, 12:22am

So all of what you wrote now seems correct to me and it aligns with everything I read in what you shared.

While your statements before were not semantically in line with this.

As long as we can agree that no one thinks that “current C/C++ behavior is broken”, we are good. The way that you wrote “current C/C++ behavior is broken” meant exactly that, even with full context of what you wrote.

Specifically, the word “broken” was misused by you.

Full sentence of yours:

If it was:

“The current behavior in C and C++ is considered to be bug-prone, so clang-tidy has a linter to find such cases.”

Then it would have been an objective conveyance of information without your personal twist.