A high-performance solution to the "are all elements of an iterable equal" problem

hypernova · February 25, 2024, 1:42pm

This appears to be a relatively common problem to which there isn’t currently a de-facto solution. (At least as far as I know.)

Part of the reason why I believe this at least a fairly frequently seen issue is evidenced by this Stack Overflow question, where there are a huge volume of answers.

Feature or enhancement

Proposal:

def all_equal(iterable):
    return all(map(operator.eq, iterable[1:], iterable[:-1])

Consider the following requirement for a function or operation:

A function which takes as input an iterable and checks to see if all elements are equal

This is a fairly common requirement, but Python does not have a builtin function to support this. There are many ways to calculate the result using Python code, however for performance reasons a lazy-evaluated or short-circuit return builtin would be useful.

On the other hand, this operation can be composed from less complex operations.

A “pairwise” version of map which rather than applying a unary function to an iterable applies a binary function to an iterable
The existing builtin function all, which is a short circuit function (returns early if condition not met)

For this to be sensible, map would have to be lazy evaluated. I believe this is the case, please do correct me if I am wrong.

I would be interested in doing the work for this assuming that the community considers it to be a useful addition to the Python core library.

Could I get some initial feedback on whether or not adding something like this would be considered useful?

I’m not totally sure exactly how it should be done. If a map_pairwise was introduced, then why not higher-order versions which apply functions consisting of 3 arguments, or more?

I suppose one arguments against having a version which applies a trinary function, function of 4 arguments, etc, could be how to do the stride? For unary and binary it is obvious. Unary functions are applied to each element. A binary function is applied to every pair of elements. But then what do you do with a function which takes three arguments?

FelixFourcolor · February 25, 2024, 2:14pm

I’m kinda confused. So what’s wrong with

?

And yes, map is lazy.

apalala · February 25, 2024, 2:22pm

The use of indexing in your example asumes that the “iterables” are lists.

Please take a look at this SO article and the ones linked from it:

hypernova · February 25, 2024, 3:03pm

Well, it’s slow. That isn’t necessarily an issue, but what is an issue is that because it is slow, people are encouraged to write code which is hard to read in order to “optimize” it.

In other words, programmers will start to do highly un-Pythonic things in order to squeeze a bit of performance out of their code. (Which might be 4-5 x speedup, so not negligible.)

Now this leads to detailed debates with questions being asked like

if you code in Python is too slow, should you be using another language rather than trying to optimize it at the cost of having a hard to read code
since premature optimization is the root of all evil, is optimizing even necessary or wise?

Let’s ignore all of that. The fact remains (again, see the linked Stack Overflow question for evidence of this) that there isn’t a “canonical” way of implementing this operation, but it is something that is at least somewhat frequently desired.

At the moment the choices in Python are

write something which is readable but slower than “optimized” solutions
use an “optimized” solution which is unreadable

This is because we don’t currently have a way of doing a map operation with a binary function.

By the way, my Python example code is just to show what operation is being performed. It also is not a “good, Pythonic” solution - the reason being that to make it work I had to reference the iterated over object twice (bad) and perform arbitrary index manipulation (very bad).

A better solution might look like:

all(starmap(operator.eq, pairwise(iterable)))

This makes use of some non-core things, like itertools.pairwise. Again, this is very slow. I don’t personally like the requirement to use starmap to perform the massaging step required to re-organize the arguments to all.

hypernova · February 25, 2024, 3:04pm

Just to say I think you missed the point here. I am not suggesting

map(operator.eq, iterable[1:], iterable[:-1])

is an example of good code - far from it in fact.

By the way, was my post moved or something? I am not familiar with this kind of forum, but it looks like the original post made has been attached to some longer running thread from Feb 20th.

Raymond, I have read through your answer and would suggest that the existence of both the twitter thread and your answer indicate a missing feature from the Python ecosystem as it currently stands. Allow me to explain -

(This is not a criticism of your work to collate these things by the way.)

If we look at each of these options, ask yourself, which of them would you be happy to see show up in a code review if it were sent to you?

I would suggest the answer is none. In each case there is an argument to be made that the solution is either hard to understand or at the very least not immediately obvious what it does (and therefore requires thinking about - which is never a good thing). OR the runtime is likely to be slow. Many of the examples do not have an early termination condition.

A good Python code is obvious and self-explanatory.

The solution I used is also unsatisfactory for the same reasons - I already provided criticism of it as did someone else on this thread.

Herein lies the problem: None of the proposed answers in the entire Stack Overflow thread actually say literally what they are doing.

Let me explain -

What did we want to do?

We wanted to test to see if all objects returned by iterating over an iterator have the same value according to operator.eq.

Which of the proposed solution says this?

None of them.

Let’s take my example, and criticize it:

What does the following code say that it does?

all(map(operator.eq, iterable[1:], iterable[:-1]))

It says this:

create an iterator from the second element of the input to the last element of the input
and create another iterator from the first element of the input to one element before the last element of the input
and apply operator.eq to the values yielded by these iterators
and test if all the resulting values returned by operator.eq are True

Is that anything like our original problem statement of “are all elements equal”? Not really.

And this is the point: The code I have written does not clearly say what it does.

This is why I think we need some other solution, perhaps in the form of a map_binary feature. (Or perhaps something else.)

alicederyn · February 25, 2024, 3:30pm

Also your solution assumes the iterable can be sliced.

hypernova · February 25, 2024, 3:40pm

Yes, this point has already been made. Please don’t fixate on this, it’s not relevant to the point being made.

apalala · February 25, 2024, 3:41pm

def all_equal(iterable):
    it = iter(iterable)
    return all(a == b for a, b in zip(it, it))

EDIT:

Sorry about this!

hypernova · February 25, 2024, 3:44pm

def all_equal(iterable):
    it = iter(iterable)
    return all(a == b for a, b in zip(it, it))
    
print(all_equal([1, 1, 2, 2]))
# prints True

Let me re-focus the discussion so that it doesn’t become side tracked by a stream of suggestions about how to implement this using existing Python constructs.

The important question here surely is this:

is the community satisfied with the existing methods of implementing this simple operation, or not?

apalala · February 25, 2024, 3:54pm

My opinion is that there should be a recipe that is both clear and efficient.

def all_equal(iterable):
    it = iter(iterable)
    x = next(it)
    return all(a == x for a in it)

alicederyn · February 25, 2024, 3:55pm

I don’t think the solutions using itertools are unreadable, and I also don’t care so much if they are. If the method name is clear, a fast but subtle implementation is fine, just document it carefully.

I for one am. If a general-purpose function that performed as fast as an existing solution for this problem and had other real-world uses were proposed, that would be interesting, but making one that realistically is only useful for all_equal does not seem valuable.

alicederyn · February 25, 2024, 3:58pm

You need to try/except this to handle the empty iterable case and that’s no longer fast. Also, I think doing the equality operation in Python is suboptimal compared to the itertools function which does the loop and comparison in C.

hypernova · February 25, 2024, 4:15pm

Can you show us the itertools solution so that we can understand what you are thinking here?

sirosen · February 25, 2024, 4:18pm

Whichever one is most appropriate to the context.
Which is why the current docs are fine, IMO.

I wouldn’t mind if another example were added, but the current one should not be removed.

hypernova · February 25, 2024, 4:21pm

Ah, I really wish my OP hadn’t been moved to this thread because there’s now two conversations going on about unrelated things.

some question about docs
my question about where best to add a new map_binary function, and a discussion about whether or not this should be added or something else added instead (or even nothing at all)

apalala · February 25, 2024, 4:25pm

It would be one try/except per call, not per element, so the overhead is small, more so for large sequences, and it doesn’t make sense to seek for optimizations unless the sequences are large.

pf_moore · February 25, 2024, 4:45pm

You don’t need a map_binary, as map already does this just fine…

hypernova · February 25, 2024, 4:47pm

Can you show us how?

pf_moore · February 25, 2024, 4:49pm

How what, exactly?

Python 3.12.0 (tags/v3.12.0:0fb18b0, Oct  2 2023, 13:03:39) [MSC v.1935 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> a = [1] * 20
>>> all(map(lambda x,y: x == y, a[1:], a[:-1]))
True
>>>

hypernova · February 25, 2024, 4:50pm

As per my original post (which was moved)

# does not compile
map(operator.eq, iterable)

Possible proposal:

# would compile, if added
map_binary(operator.eq, iterable)

Topic		Replies	Views
`itertools.all_equal` Ideas	25	1165	March 10, 2024
Improving the all_equal recipe in itertools doc Documentation documentation	32	1142	February 28, 2024
all() function not working as it should. Python Help help	5	297	December 21, 2023
The map() function or the list comprehension? Python Help	5	2222	September 13, 2023
Map not working Python Help	7	812	December 19, 2023

A high-performance solution to the "are all elements of an iterable equal" problem

Feature or enhancement

Proposal:

Related Topics