Improving the all_equal recipe in itertools doc

Although it is trivial to count the elements in a lazy iterator, the commonly used idiom using sum just isn’t very immediately readable to those who aren’t familiar with the idiom:

def all_equal(iterable):
    return sum(1 for _ in islice(groupby(iterable), 2)) < 2

So yeah it would be nice to have an itertools function just for counting, and to consume an iterator cheaply.

Easily read as “that there is not any pair of equality groups”. +1


Also as a mild annoyance in this case, it doesn’t short-circuit any more.

Ben had said that for a moment too. Is the islice(..., 2) so easily overlooked?
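A quick way to convince yourself that the islice(..., 2) version does stop early is to check what is left in the iterator afterwards. This is just a small experiment, using the sum-based recipe from the first post:

```python
from itertools import groupby, islice


def all_equal(iterable):
    # islice(..., 2) stops asking groupby for more groups once two have been seen.
    return sum(1 for _ in islice(groupby(iterable), 2)) < 2


it = iter([1, 2, 3, 4])
result = all_equal(it)
remaining = list(it)  # whatever the check never looked at
print(result, remaining)  # False [3, 4]
```

Only the first two elements get consumed: the second group starts at 2, and nothing past that is read.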


Maybe @alicederyn meant to say that it doesn’t short-circuit anymore when given an empty iterable, like my original argument in the first post. :slight_smile:

Apparently yes, it’s easily overlooked. Oops. I guess “isn’t very immediately readable to those who aren’t familiar with the idiom” can be extended! Sorry…

I am a fan of not having to repeat the 2 inside islice too :sweat_smile:

Though as an example, I suppose the islice version shows off a more general tool that can be used elsewhere more easily than “any pairwise”, which only answers “are there two or more?”

Agreed, I’ve wanted this in the past. ilen?
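For reference, a minimal sketch of what such a helper might look like, built on the same sum idiom. The name ilen is only the suggestion from this post, and this is not an actual itertools function:

```python
def ilen(iterable):
    "Count the items in an iterable, consuming it (hypothetical helper)."
    # Each item contributes 1 to the total; nothing is stored along the way.
    return sum(1 for _ in iterable)
```

For example, ilen(x for x in range(10) if x % 2) counts the odd numbers below 10 without building a list.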

Meh. It’s shown in six recipes already. Enough! :slight_smile:

Even like this already, in the very first recipe:

def take(n, iterable):
    "Return first n items of the iterable as a list."
    return list(islice(iterable, n))

So … actually:

def all_equal(iterable):
    return len(take(2, groupby(iterable))) < 2

That’s what more-itertools calls it.

And there’s an old issue where it was rejected.

The length and variety of this topic thread highlights an important point: itertools is full of powerful tools that can be combined in many ways, many of which are not obvious at first. As the recipes section says, “The primary purpose of the itertools recipes is educational.”

There are many good points being made here about the pros and cons of each approach, and the behavior of the primitives being used. Since all_equal is a recipe in the docs, not an implementation, why do we need to choose just one? We could expand the recipes from a single code block to readable prose that explains what’s happening in each, to make it more fully pedagogical.

BTW: The recipes section also says, “The recipes also give ideas about ways that the tools can be combined — for example, how compress() and range() can work together,” but compress isn’t mentioned in any of the recipes, so there’s some editing to be done. It looks like we lost the compress/range combination when sieve was updated.
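To illustrate the kind of combination that sentence is hinting at, here is a tiny sketch of compress() and range() working together. It is not the former sieve recipe, just a made-up example with a hypothetical flags list:

```python
from itertools import compress

flags = [0, 1, 1, 0, 1]  # hypothetical selector values

# range() supplies the candidate indices; compress() keeps the ones
# whose corresponding flag is truthy.
selected = list(compress(range(len(flags)), flags))
print(selected)  # [1, 2, 4]
```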


FWIW, this is a canonical use of islice. It says, “fetch no more than two groups.”

It is similar to the standard idiom for sequences: preview = data[:10]
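Side by side, the parallel looks like this (a small sketch; the variable names are only for illustration):

```python
from itertools import islice

# Sequence: slicing gives the first ten items.
data = list(range(100))
preview = data[:10]

# Iterator: islice plays the same role, fetching no more than ten items.
stream = iter(range(100))
stream_preview = list(islice(stream, 10))

print(preview == stream_preview)  # True
```

The difference is that the iterator has been advanced: the next item from stream is 10, whereas data is unchanged.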

Both groupby() and islice() are being used in the most direct, canonical, and least clever way. It is what we want people to learn.

A core problem being solved is that (aside from Tim Peters, Ben, and Stefan) no one is born knowing how to manipulate iterator streams with an iterator algebra in a functional style. Working through these examples teaches that style of thinking (and a few patterns). In my courses, I’ve had people work through how each example works and have found that it confers Jedi-like mastery of the itertools.


I agree, but it’s an extra step in the process that isn’t required to solve the problem - it’s just enabling short-circuiting. Some of the other approaches don’t make this feel like a separate step, while still accomplishing short-circuiting. But either way, examples that purely chain together function calls (including all / any) are probably a better illustration of the power of itertools, than examples that have to rely on boolean operators to combine results. Yes, even though any/all could be described as generalizations of or/and.
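To make the contrast concrete, here is roughly what I mean, assuming the "boolean operators" style refers to a formulation along these lines. The function names are made up for this comparison:

```python
from itertools import groupby, islice


def all_equal_bool(iterable):
    "Combines two next() calls with boolean operators."
    g = groupby(iterable)
    # The first next() is always truthy (a (key, group) tuple, or the
    # default True), so the result hinges on whether a second group exists.
    return next(g, True) and not next(g, False)


def all_equal_chain(iterable):
    "Pure chain of calls: take at most two groups, then count them."
    return len(list(islice(groupby(iterable), 2))) < 2
```

Both stop after the second group and agree on the empty case; the difference is only in how the logic reads.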

I feel like CS courses used to give a better background for this kind of thing. (For example, by expecting students to become familiar with pipelines in Unix commands, and accomplish useful things with them, following “the Unix way”.) But yes, having examples like this is excellent pedagogically.

I wonder if it wouldn’t be better to show multiple examples for all_equal. That “preferably only one obvious way” thing doesn’t seem to work out as often as one might like :wink:


I think having more than one canonical way to do pythonic things is fun to play with, but perhaps not something worth propagating in the docs. Sometimes having too many options is confusing and leaves one wanting to see fewer.