A high-performance solution to the "are all elements of an iterable equal" problem

This appears to be a relatively common problem to which there isn’t currently a de-facto solution. (At least as far as I know.)

Part of the reason why I believe this at least a fairly frequently seen issue is evidenced by this Stack Overflow question, where there are a huge volume of answers.

Feature or enhancement

Proposal:

def all_equal(iterable):
    return all(map(operator.eq, iterable[1:], iterable[:-1])

Consider the following requirement for a function or operation:

  • A function which takes as input an iterable and checks to see if all elements are equal

This is a fairly common requirement, but Python does not have a builtin function to support this. There are many ways to calculate the result using Python code, however for performance reasons a lazy-evaluated or short-circuit return builtin would be useful.

On the other hand, this operation can be composed from less complex operations.

  • A “pairwise” version of map which rather than applying a unary function to an iterable applies a binary function to an iterable
  • The existing builtin function all, which is a short circuit function (returns early if condition not met)

For this to be sensible, map would have to be lazy evaluated. I believe this is the case, please do correct me if I am wrong.

I would be interested in doing the work for this assuming that the community considers it to be a useful addition to the Python core library.

Could I get some initial feedback on whether or not adding something like this would be considered useful?

I’m not totally sure exactly how it should be done. If a map_pairwise was introduced, then why not higher-order versions which apply functions consisting of 3 arguments, or more?

I suppose one arguments against having a version which applies a trinary function, function of 4 arguments, etc, could be how to do the stride? For unary and binary it is obvious. Unary functions are applied to each element. A binary function is applied to every pair of elements. But then what do you do with a function which takes three arguments?

I’m kinda confused. So what’s wrong with

?

And yes, map is lazy.

The use of indexing in your example asumes that the “iterables” are lists.

Please take a look at this SO article and the ones linked from it:

1 Like

Well, it’s slow. That isn’t necessarily an issue, but what is an issue is that because it is slow, people are encouraged to write code which is hard to read in order to “optimize” it.

In other words, programmers will start to do highly un-Pythonic things in order to squeeze a bit of performance out of their code. (Which might be 4-5 x speedup, so not negligible.)

Now this leads to detailed debates with questions being asked like

  • if you code in Python is too slow, should you be using another language rather than trying to optimize it at the cost of having a hard to read code
  • since premature optimization is the root of all evil, is optimizing even necessary or wise?

Let’s ignore all of that. The fact remains (again, see the linked Stack Overflow question for evidence of this) that there isn’t a “canonical” way of implementing this operation, but it is something that is at least somewhat frequently desired.

At the moment the choices in Python are

  • write something which is readable but slower than “optimized” solutions
  • use an “optimized” solution which is unreadable

This is because we don’t currently have a way of doing a map operation with a binary function.

By the way, my Python example code is just to show what operation is being performed. It also is not a “good, Pythonic” solution - the reason being that to make it work I had to reference the iterated over object twice (bad) and perform arbitrary index manipulation (very bad).

A better solution might look like:

all(starmap(operator.eq, pairwise(iterable)))

This makes use of some non-core things, like itertools.pairwise. Again, this is very slow. I don’t personally like the requirement to use starmap to perform the massaging step required to re-organize the arguments to all.

1 Like

Just to say I think you missed the point here. I am not suggesting

map(operator.eq, iterable[1:], iterable[:-1])

is an example of good code - far from it in fact.

By the way, was my post moved or something? I am not familiar with this kind of forum, but it looks like the original post made has been attached to some longer running thread from Feb 20th.

Raymond, I have read through your answer and would suggest that the existence of both the twitter thread and your answer indicate a missing feature from the Python ecosystem as it currently stands. Allow me to explain -

(This is not a criticism of your work to collate these things by the way.)

If we look at each of these options, ask yourself, which of them would you be happy to see show up in a code review if it were sent to you?

I would suggest the answer is none. In each case there is an argument to be made that the solution is either hard to understand or at the very least not immediately obvious what it does (and therefore requires thinking about - which is never a good thing). OR the runtime is likely to be slow. Many of the examples do not have an early termination condition.

A good Python code is obvious and self-explanatory.

The solution I used is also unsatisfactory for the same reasons - I already provided criticism of it as did someone else on this thread.

Herein lies the problem: None of the proposed answers in the entire Stack Overflow thread actually say literally what they are doing.

Let me explain -

What did we want to do?

  • We wanted to test to see if all objects returned by iterating over an iterator have the same value according to operator.eq.

Which of the proposed solution says this?

  • None of them.

Let’s take my example, and criticize it:

What does the following code say that it does?

all(map(operator.eq, iterable[1:], iterable[:-1]))

It says this:

  • create an iterator from the second element of the input to the last element of the input
  • and create another iterator from the first element of the input to one element before the last element of the input
  • and apply operator.eq to the values yielded by these iterators
  • and test if all the resulting values returned by operator.eq are True

Is that anything like our original problem statement of “are all elements equal”? Not really.

And this is the point: The code I have written does not clearly say what it does.

This is why I think we need some other solution, perhaps in the form of a map_binary feature. (Or perhaps something else.)

1 Like

Also your solution assumes the iterable can be sliced.

Yes, this point has already been made. Please don’t fixate on this, it’s not relevant to the point being made.

def all_equal(iterable):
    it = iter(iterable)
    return all(a == b for a, b in zip(it, it))

EDIT:

Sorry about this! :slight_smile:

def all_equal(iterable):
    it = iter(iterable)
    return all(a == b for a, b in zip(it, it))
    
print(all_equal([1, 1, 2, 2]))
# prints True

Let me re-focus the discussion so that it doesn’t become side tracked by a stream of suggestions about how to implement this using existing Python constructs.

The important question here surely is this:

  • is the community satisfied with the existing methods of implementing this simple operation, or not?
1 Like

My opinion is that there should be a recipe that is both clear and efficient.

def all_equal(iterable):
    it = iter(iterable)
    x = next(it)
    return all(a == x for a in it)
2 Likes

I don’t think the solutions using itertools are unreadable, and I also don’t care so much if they are. If the method name is clear, a fast but subtle implementation is fine, just document it carefully.

I for one am. If a general-purpose function that performed as fast as an existing solution for this problem and had other real-world uses were proposed, that would be interesting, but making one that realistically is only useful for all_equal does not seem valuable.

2 Likes

You need to try/except this to handle the empty iterable case and that’s no longer fast. Also, I think doing the equality operation in Python is suboptimal compared to the itertools function which does the loop and comparison in C.

Can you show us the itertools solution so that we can understand what you are thinking here?

Whichever one is most appropriate to the context.
Which is why the current docs are fine, IMO.

I wouldn’t mind if another example were added, but the current one should not be removed.

Ah, I really wish my OP hadn’t been moved to this thread because there’s now two conversations going on about unrelated things.

  • some question about docs
  • my question about where best to add a new map_binary function, and a discussion about whether or not this should be added or something else added instead (or even nothing at all)
1 Like

It would be one try/except per call, not per element, so the overhead is small, more so for large sequences, and it doesn’t make sense to seek for optimizations unless the sequences are large.

You don’t need a map_binary, as map already does this just fine…

Can you show us how?

How what, exactly?

Python 3.12.0 (tags/v3.12.0:0fb18b0, Oct  2 2023, 13:03:39) [MSC v.1935 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> a = [1] * 20
>>> all(map(lambda x,y: x == y, a[1:], a[:-1]))
True
>>>

As per my original post (which was moved)

# does not compile
map(operator.eq, iterable)

Possible proposal:

# would compile, if added
map_binary(operator.eq, iterable)