Let's talk a bit about reduce (again?)

The Python programming languages contains some features from the functional programming paradigm. Of course, it is not primarily a functional language: many things are simply clearer when written out as a loop. The documentation about statements in lambdas also seems to have a pretty clear opinion on the topic:

Unlike lambda forms in other languages, where they add functionality, Python lambdas are only a shorthand notation if you’re too lazy to define a function.

Nonetheless, Python has some handy functional features. Functions are first-class citizens, we have various comprehensions, and in particular, we have some higher order functions such as map and filter in the builtins.

But not reduce. It appears that it was banished from the builtins to the functools module with the advent of Python 3.0. Guido expresses in the same post that he also wanted to get rid of map and filter, but perhaps due to the fact that those two could be rewritten to be lazily evaluated they were allowed to continue to exist?

I understand that reduce is not that popular. In many cases, code written with a reduce are less obvious than code written with an explicit loop and an accumulator. But this is just an opinion, something concerning esthetics, at the cost of consistency.

All three are higher order functions that work on an iterable. In fact, mathematically speaking, map and filter are special instances of a reduce or fold. So why are map and filter builtins but not reduce?

I’m not necessarily suggesting to move reduce back into the builtins. To me, it feels more important that these functions that logically belong together are also, in fact, implemented together (i.e. live in the same namespace). Whether this is __builtins__ or functools is secondary to me.

But, considering I also believe that the builtins should be kept as tiny as possible, that leaves me with the opinion that map and filter should be moved to the functools module!

(Of course, I’m aware that this will never happen. But I would like to hear your thoughts.)

1 Like

Given you’re aware this won’t happen, I’m moving this to the Help category rather than the Ideas category, since it looks like you’re asking for help understanding why certain decisions were made or why functions are where they are.

One problem is that reduce is 2 functions: reduce2(update, non-empty-iterable) and reduce3(update, start, iterable), with update(current, next) where next comes from the iterable. To mash them into 1 function, the parameters of reduce3 are switched in order. This confuses many people, in particular Guido, then BDFL, as it makes the order of parameters for the update function harder to remember. I was confused also until I remembered that (current, next(iterable)) and (iterable, start==initial-current) were reversed from each other.

Special cases with built-in update/accumulate functions mostly eliminate the confusion. In builtins, enumerate is another lazy version. Sum was added because it was the most common use of reduce and would make reduce less needed.

Builtins has several functions that reduce a sequence to one object: min, max, most/all? of the collections classes, compile (reduce string to ast, with post-reduction transformation), even print and sort (but not implemented as a bubble reduction!).

Itertools still has many functions that can be regarded as lazy reductions of iterables: accumulate (lazy sum), batched, and pairwise (with special rule for 0 or 1 item) are just 3.

Part of the issue, I think, is that reduce, an abstract 2nd order function, is a bit too abstract, in some sense, for some non-mathematicians.

>>>[1,2] + [3, 4]  # Concrete.
[1, 2, 3, 4]
>>> [1,2].__add__([3, 4])  
[1, 2, 3, 4]
>>> sum(([3],[4]), [1,2])  # Partially abstract.
[1, 2, 3, 4]
>>> from functools import reduce
>>> from operator import add
>>> reduce(add, ([3],[4]), [1,2])  # Even more abstract.
[1, 2, 3, 4]

Because they’re useful in a lot of situations, but reduce is extremely situational. I’ve written huge amounts of code that uses map, in a wide variety of languages. I’ve used filter frequently, though not as often as map. But reduce always feels like a worse way to do whatever’s being done. It’s almost never elegant, where both map and filter are frequently elegant.

There’s SOME value in keeping the three of them together, but not all that much.

1 Like

A couple observations here:

  • What’s different about map and filter is that they’re the conceptual building blocks of a comprehension (or generator expression). So users are exposed to that computation model frequently, and they then have access to the builtins for simpler cases: map for a comprehension that only uses a for clause, and filter for one that needs if but uses the identity function for the “map”.

  • reduce basically just gets the last result from an accumulate process.

  • accumulate is basically just map with a stateful predicate - one that is nominally binary, but which both uses and stores back to the left-hand side. “Classically”, that state is implemented using a closure, but the assignment expression operator := gives us a more direct way:

    >>> x = 0; [(x := x + i) for i in range(10)]
    [0, 1, 3, 6, 10, 15, 21, 28, 36, 45]
    
1 Like

If anything, I think map and filter would move to itertools where the similar starmap and filterfalse iterators live. functools seems like a fitting place for reduce, though moreso if you think of reduce as a special case of a more general version that takes a function and returns a function that takes a list. (That is, a call would look like reduce(f)(some_list) rather than reduce(f, some_list).)

2 Likes

Indeed, that is my main question, thank you. If I understood correctly, once upon a time, map, filter and reduce all lived in the same namespace (that is, builtins), but an active decision was made to take them apart.

I’m aware that map and filter are typically more elegant and readable than reduce, but I feel that that alone is not exactly a good reason to move a function to a different module.

I’m not sure if I follow this. Are you talking about the C implementation? The Python function has the optional start argument at the end, as I would expect. Regardless, to me this would sound like a good reason to fix the parameter order of reduce3 instead. I’m familiar with this type of pain though. I have to work with PHP a lot and the inconsistency of parameter order in its builtin functions is just awful.

Yeah, that’s fair enough. I suppose from a really theoretical standpoint, any function that operates on an iterable is a reduction. I suppose you might as well then ask why sum is a builtin but math.prod is not… but you essentially already answered that question: sum is simply used that much more often.

I guess for me that is the most convincing argument for why map and filter live in a different place: they are simply used all that much more (and I’m willing to believe that that is the cause for them to be in builtins rather than an effect ;-)).

And even though perhaps from a certain puristic standpoint the encapsulation or namespacing of these functions is not exactly “consistent” anymore as a result, you could then argue the same about a lot of functions and classes. In that sense, practicality is more important.
With that I believe my question is resolved now.

1 Like

+1 Sadly we have to teach people to think functional in order to force languages to change and in current state most languages highly discourage thinking functional. It is a chicken and an egg problem. We have to find a way to change both at the same time.

:Guido’s best explanation is in a slashdot interview:

There’s much more in the interview about how he sees functional languages in general.

3 Likes

What would the advantage be? Why are you trying to force a change?

Hmm, what is the advantage of democracy? Having a choice. Currently most programming languages and most educational programs force imperative coding style on students and software engineers. To few of us who sees declarative programming as at least equally powerful if not superior, this situation is really depressing.

You HAVE a choice though. map and filter are builtins, and reduce is one import away. And if that’s not good enough, Lisp exists. So you definitely have that choice.

What’s the advantage of forcing languages to change? What change(s) and why? People clearly don’t want those changes, since you’re trying to ALSO force people to change. I’m just not seeing the advantage that you think this would give.

Click the link to the interview with Guido I posted 4 messages up. As @Rosuav said, you can write “functional” code in Python. But, as they didn’t say :wink:, it’s somewhat clumsy, and extremely slow compared to what you can do in Haskell.

Guido’s interview goes into more detail about how fundamentally different design goals drive that. Languages aren’t just collections of random features. The speed of Haskell is due to enormous effort that went into designing a functional language in which nothing but memory use is actually “dynamic”. For example, all types are exactly known at compile-time (despite that you rarely need to declare the types)), there’s no way to “cheat” on that, and the functional nature allows for highly aggressive optimizations at compile-time. None of that will ever be so in Python.

Indeed, while the PyPy implementation of Python generally runs some 5x faster (with high code-dependent variance) than CPython, one of their cautions is to avoid using itertools functions. Because the underlying language is so highly dynamic, they can’t even begin to do the kinds of transformations an “iterator algebra” may seem to suggest should be possible.

Instead they want to see simple, plain, Python-level loops and conditionals, so that PyPy’s style of low-level JIT’ing and inlining has low-level code to work on. Reasoning “in the large” is greatly eased by lack of side effects, but in Python, e.g., you can’t even write x.y without assuming World War III might be a possible side effect of evaluating getattr(x, 'y'):wink:. Or that another instance of x.y on the same line may restore universal peace.

So “sorry, can’t get there from here” is just the truth. You can write in a functional style, but if that’s your true love, you’ll be happier using a language carefully designed for that.

3 Likes