Let's talk a bit about reduce (again?)

The Python programming language contains some features from the functional programming paradigm. Of course, it is not primarily a functional language: many things are simply clearer when written out as a loop. The documentation about statements in lambdas also seems to have a pretty clear opinion on the topic:

Unlike lambda forms in other languages, where they add functionality, Python lambdas are only a shorthand notation if you’re too lazy to define a function.

Nonetheless, Python has some handy functional features. Functions are first-class citizens, we have various comprehensions, and in particular, we have some higher order functions such as map and filter in the builtins.

But not reduce. It appears that it was banished from the builtins to the functools module with the advent of Python 3.0. Guido mentions in the same post that he also wanted to get rid of map and filter; perhaps those two were allowed to live on because they could be rewritten to be lazily evaluated?

I understand that reduce is not that popular. In many cases, code written with reduce is less obvious than code written with an explicit loop and an accumulator. But this is just an opinion, a matter of aesthetics, and it comes at the cost of consistency.

All three are higher order functions that work on an iterable. In fact, mathematically speaking, map and filter are special instances of a reduce or fold. So why are map and filter builtins but not reduce?
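
To illustrate that claim, here is a small sketch (the helper names are made up, and unlike the lazy builtins these build lists eagerly):

>>> from functools import reduce
>>> def map_as_reduce(f, iterable):
...     # fold that appends f(item) to the accumulator for every item
...     return reduce(lambda acc, x: acc + [f(x)], iterable, [])
...
>>> def filter_as_reduce(pred, iterable):
...     # fold that keeps an item only when the predicate holds
...     return reduce(lambda acc, x: acc + [x] if pred(x) else acc, iterable, [])
...
>>> map_as_reduce(str.upper, ["a", "b"])
['A', 'B']
>>> filter_as_reduce(lambda n: n % 2, range(5))
[1, 3]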

I’m not necessarily suggesting to move reduce back into the builtins. To me, it feels more important that these functions that logically belong together are also, in fact, implemented together (i.e. live in the same namespace). Whether this is __builtins__ or functools is secondary to me.

But, considering I also believe that the builtins should be kept as tiny as possible, that leaves me with the opinion that map and filter should be moved to the functools module!

(Of course, I’m aware that this will never happen. But I would like to hear your thoughts.)


Given you’re aware this won’t happen, I’m moving this to the Help category rather than the Ideas category, since it looks like you’re asking for help understanding why certain decisions were made or why functions are where they are.

One problem is that reduce is really two functions in one: reduce2(update, non-empty-iterable) and reduce3(update, start, iterable), where update(current, next) takes the running value and the next item from the iterable. To mash them into one function, the parameters of reduce3 were switched around, so the start value comes after the iterable. This confuses many people, in particular Guido, then BDFL, as it makes the order of parameters for the update function harder to remember. I was confused too, until I remembered that the pairs (current, next item) for update and (iterable, start == the initial current) for reduce are reversed from each other.
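
To make the two call forms concrete (just an illustration of functools.reduce as it exists today):

>>> from functools import reduce
>>> from operator import add
>>> reduce(add, [1, 2, 3, 4])        # 2-argument form: the first item is the start value
10
>>> reduce(add, [1, 2, 3, 4], 100)   # 3-argument form: the start value comes *after* the iterable
110
>>> reduce(lambda current, nxt: current * 10 + nxt, [1, 2, 3])  # update takes (current, next item)
123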

Special cases with built-in update/accumulate functions mostly eliminate the confusion. In builtins, enumerate is another lazy variation on the theme. sum was added because it was the most common use of reduce and would make reduce less needed.

Builtins has several functions that reduce a sequence to one object: min, max, most/all? of the collection classes, compile (reduces a string to an AST, with post-reduction transformation), even print and sorted (but not implemented as a bubble reduction!).

Itertools still has many functions that can be regarded as lazy reductions of iterables: accumulate (lazy sum), batched, and pairwise (with special rule for 0 or 1 item) are just 3.
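
For illustration (pairwise needs Python 3.10+ and batched needs 3.12+):

>>> from itertools import accumulate, batched, pairwise
>>> list(accumulate([1, 2, 3, 4, 5]))     # a lazy running sum
[1, 3, 6, 10, 15]
>>> list(pairwise([1, 2, 3, 4, 5]))       # nothing is yielded for 0 or 1 items
[(1, 2), (2, 3), (3, 4), (4, 5)]
>>> list(batched([1, 2, 3, 4, 5], 2))
[(1, 2), (3, 4), (5,)]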

Part of the issue, I think, is that reduce, an abstract second-order function, is in some sense a bit too abstract for some non-mathematicians.

>>> [1, 2] + [3, 4]  # Concrete.
[1, 2, 3, 4]
>>> [1,2].__add__([3, 4])  
[1, 2, 3, 4]
>>> sum(([3],[4]), [1,2])  # Partially abstract.
[1, 2, 3, 4]
>>> from functools import reduce
>>> from operator import add
>>> reduce(add, ([3],[4]), [1,2])  # Even more abstract.
[1, 2, 3, 4]

Because they’re useful in a lot of situations, while reduce is extremely situational. I’ve written huge amounts of code that uses map, in a wide variety of languages. I’ve used filter frequently, though not as often as map. But reduce always feels like a worse way to do whatever’s being done. It’s almost never elegant, where both map and filter are frequently elegant.

There’s SOME value in keeping the three of them together, but not all that much.

A couple observations here:

  • What’s different about map and filter is that they’re the conceptual building blocks of a comprehension (or generator expression). So users are exposed to that computation model frequently, and they then have access to the builtins for simpler cases: map for a comprehension that only uses a for clause, and filter for one that needs if but uses the identity function for the “map”.

  • reduce basically just gets the last result from an accumulate process (see the sketch after this list).

  • accumulate is basically just map with a stateful function - one that is nominally binary, but which both uses and stores back to the left-hand side. “Classically”, that state is implemented using a closure, but the assignment expression operator := gives us a more direct way:

    >>> x = 0; [(x := x + i) for i in range(10)]
    [0, 1, 3, 6, 10, 15, 21, 28, 36, 45]
    
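A quick sketch tying the first two points together (wrapping everything in lists to sidestep the laziness of the builtins):

>>> from functools import reduce
>>> from itertools import accumulate
>>> from operator import mul
>>> nums = [1, 2, 3, 4]
>>> [str(n) for n in nums] == list(map(str, nums))                 # comprehension with only a for clause
True
>>> [n for n in nums if n % 2] == list(filter(lambda n: n % 2, nums))
True
>>> reduce(mul, nums, 10), list(accumulate(nums, mul, initial=10))[-1]
(240, 240)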

If anything, I think map and filter would move to itertools, where the similar starmap and filterfalse iterators live. functools seems like a fitting place for reduce, though more so if you think of reduce as a special case of a more general version that takes a function and returns a function that takes a list. (That is, a call would look like reduce(f)(some_list) rather than reduce(f, some_list).)
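
Something like this hypothetical wrapper, just for illustration (curried_reduce is not a real functools function):

>>> from functools import reduce
>>> from operator import add
>>> def curried_reduce(f):
...     # returns a function that folds f over whatever iterable it is given
...     return lambda iterable: reduce(f, iterable)
...
>>> curried_reduce(add)([1, 2, 3, 4])
10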


Indeed, that is my main question, thank you. If I understood correctly, once upon a time, map, filter and reduce all lived in the same namespace (that is, builtins), but an active decision was made to split them up.

I’m aware that map and filter are typically more elegant and readable than reduce, but I feel that that alone is not exactly a good reason to move a function to a different module.

I’m not sure if I follow this. Are you talking about the C implementation? The Python function has the optional start argument at the end, as I would expect. Regardless, to me this would sound like a good reason to fix the parameter order of reduce3 instead. I’m familiar with this type of pain though. I have to work with PHP a lot and the inconsistency of parameter order in its builtin functions is just awful.

Yeah, that’s fair enough. I suppose from a really theoretical standpoint, any function that operates on an iterable is a reduction. I suppose you might as well then ask why sum is a builtin but math.prod is not… but you essentially already answered that question: sum is simply used that much more often.
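
(For what it’s worth, math.prod has only been available since Python 3.8, and both are just folds:)

>>> import math
>>> sum([1, 2, 3, 4])          # builtin
10
>>> math.prod([1, 2, 3, 4])    # lives in math
24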

I guess for me that is the most convincing argument for why map and filter live in a different place: they are simply used that much more (and I’m willing to believe that that is the cause of them being in builtins rather than an effect ;-)).

And even though, from a certain purist standpoint, the encapsulation or namespacing of these functions is no longer exactly “consistent” as a result, you could argue the same about a lot of functions and classes. In that sense, practicality is more important.
With that, I believe my question is now resolved.