Don't forbid `map(nullary_func)`

Yeah I had even done the same thing, except with concat instead of add. Odd that you switched that, especially since you called it concatenate :slight_smile:

1 Like

I thought we agreed that everything is about practicality? I don’t care whether zip is consistent with some other code[1]. Not nearly as much as I care about zip and map being easy to learn and understand for novice and junior programmers, and not requiring them to SIGKILL a python process (or a whole batch of jobs!) because they had a bug that shouldn’t even be possible.


  1. code I hope never to see in the wild, at that ↩︎

3 Likes

Why do you need to care about consistency for practicality?

You can find an answer in the post from which you quoted previously.

Ha ha. I used add since the method said __add__. concat does sound like a better name there. Thanks.

Consistency can help but it isn’t everything, of course. There are lots of factors that go into what is practical. If “practicality” were something we could just calculate from basic principles, it wouldn’t be a thing we were concerned about in the first place.

I mean, I don’t really know what we’re doing here anymore. It should be clear that none of these changes are happening. I guess the alternative versions could go in a package?

If that’s TRULY all there is to it, with no practical benefit whatsoever, then this would make a great white paper for a “perfectly mathematically consistent function library” or something. It doesn’t have any relevance to the real world.

But if there IS some sort of practical benefit, then… what? And you haven’t answered the part about “when would this be the obvious way to spell it”.

1 Like

In the case of zip, a natural thing would be for zip() to be an error, like taking the minimum of the empty set, or the intersection of an empty collection of sets when there is no unique choice for what the universe is.

1 Like

Hmm? What does the explanation have to do with practical benefits?

I already mentioned practical benefits. Cases where that would’ve been the obvious solution, including in the enumerate2d example. @takuom has also shown examples.

It would’ve been the obvious solution for me multiple times and I was always disappointed that it’s forbidden and I had to write a work-around. Maybe it wouldn’t be obvious to you. Maybe just because you’re not used to it. Does it have to be obvious to everybody? I’ve seen plenty of people write for i in range(len(xs)): print(xs[i]) because to them that was the obvious way to loop over a list, not for x in xs: print(x). I’ve seen people reject even trivial list comprehensions because comprehensions weren’t obvious to them. I’ve seen people write i >= 0 and i < n because 0 <= i < n wasn’t obvious to them. Something not being obvious to everybody doesn’t mean it’s not the right way.

I also managed to run a benchmark now, showing a speed benefit (the called function was int and the times are per call and include overhead of islicing the iterator):

  7.4 ± 0.0 ns  map_func
  9.0 ± 0.0 ns  starmap_func
 19.6 ± 0.1 ns  iter_object_func
 19.6 ± 0.1 ns  iter_None_func
 27.7 ± 0.2 ns  loop_func

Python: 3.13.0a6+ (heads/main-dirty:37a4cbd, Apr 15 2024, 10:35:10) [Clang 15.0.7 (Fedora 15.0.7-2.fc37)]
Benchmark script
from itertools import *

def loop_func(func):
    while True:
        yield func()

def starmap_func(func):
    return starmap(func, repeat(()))

def iter_None_func(func):
    return iter(func, None)

def iter_object_func(func):
    return iter(func, object())

def map_func(func):
    return map(func)

funcs = [loop_func, starmap_func, iter_None_func, iter_object_func, map_func]

for f in funcs:
    print(*islice(f(count().__next__), 5))

from timeit import timeit
from statistics import mean, stdev
import sys
import random

times = {f: [] for f in funcs}
def stats(f):
    ts = [t * 1e9 for t in sorted(times[f])[:5]]
    return f'{mean(ts):5.1f} ± {stdev(ts):3.1f} ns '
for _ in range(100):
    random.shuffle(funcs)
    for f in funcs:
        number = 10**5
        t = timeit(lambda: next(islice(f(int), number, None)), number=1) / number
        times[f].append(t)
for f in sorted(funcs, key=stats):
    print(stats(f), f.__name__)

print('\nPython:', sys.version)

As mentioned, doing a minimum doesn’t even work, since not all iterables even have a length.

What alternative universe do you propose? I’d say \mathbb{N} is the obvious one. And you were the one who proposed it. I took that from your comment.

Also, an error isn’t even what’s actually happening and I don’t recall anyone advocating for that.
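For reference, here is what current CPython actually does for the no-arguments case (a quick check on a recent 3.x, not a language guarantee): zip() yields an empty iterator rather than erroring, while map() with no iterables raises a TypeError:

```python
# Observed CPython behavior for the zero-iterables case:
print(list(zip()))  # zip with no iterables yields nothing: []

try:
    map(int)  # map with no iterables is rejected outright
except TypeError as e:
    print(e)  # e.g. "map() must have at least two arguments."
```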

You responded to my post, and deleted this entire paragraph without bothering to answer it:

Of course, I was acting under the assumption that this, being in the Python Ideas section, was intended to be an improvement to the Python language. A language which, as one of its design goals, aims for practicality. Perhaps I was wrong, and this is nothing more than an intellectual matter, a way for you to show that you know more about set theory than the rest of us? Or IS there an actual practical reason for changing this? I’m still not seeing it.

Please, show me why, under any circumstances, someone would reach for map(func) when thinking “I want an infinite iterable calling the same thing”, when a more obvious spelling would be a simple while loop.

4 Likes

This thread may have already outgrown its usefulness, but since I managed to read it all, I may as well add to the pile. :upside_down_face:

I see “how lots of other languages do it” immediately dismissed as not a valid reason to constrain map’s behavior. I think it’s actually a quite valid reason, because:

  1. It represents a rough consensus on the semantics of map, and
  2. It shows how programmers learning Python as a second language might expect it to work.

I didn’t learn how to use map in Python - I learned it in Swift, Haskell, and F# (to name a handful). And as far as I can tell, a succinct description of map’s semantics is: “apply this function to all the elements in this sequence.” That’s how I would explain it to any beginner and how I would expect it to work in any language I find map in (up until using map with more general monads/functors - even then the semantics are “apply this function to the wrapped content of this thing”).

It then seems to me that:

  1. Calling map with no sequences is a nonsensical statement (unless you are using partial application, which Python doesn’t do by default)
  2. Calling map with multiple sequences is not much more than “syntax sugar”/a shortcut for an intermediate call to zip.

Plenty of other languages that don’t support calling map with multiple sequences do have additional functions map2, map3, etc. for the usually rare times you need 2 or more inputs. None of them have a map0, because that doesn’t make any sense.

The fact that map in Python would produce infinite results if not for one “artificial limitation” is literally an implementation detail specific to CPython. And (in my opinion) changing the current behavior as proposed would only add a footgun for new and second-language learners.

5 Likes

Agreed. Lots of languages have some sort of extensions to that, but what you’ve described is the fundamental purpose of map(). Those extensions often are incompatible with each other, but each one has its purpose for being. A quick summary based only on what comes to mind:

  • JavaScript lets you access the index as well as the value, by accepting a second parameter.
  • Pike lets you pass additional parameters to your function, so you can map(stuff, func, "foo") and it will call func(stuff[0], "foo") and func(stuff[1], "foo") etc.
  • Ruby I haven’t personally mapped arrays in, but it seems to give the index, as does JavaScript.
  • And Python lets you pass additional arguments (as does Pike), but instead of passing the same argument values, it iterates over them.

All of these are extensions to the fundamental. There is no logical “what if we didn’t have an argument to pass”. In languages where map() is an array method (e.g. Ruby, JavaScript), that question doesn’t even make sense.

Asking “what happens if map() received no iterables” is NOT the same as asking “what happens if all() receives no values”. It makes perfect logical sense to attempt to call all() or sum() or max() with no values (although in the latter case, there’s no sensible return value, so it’s an error), but to call map() with no iterables is more akin to calling one of those with no iterable, not with an empty one.
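To make the analogy concrete, here is how those reductions behave on an empty iterable in Python: all() and sum() have a well-defined identity result, while max() errors:

```python
# Reductions over an empty iterable: identity value or an error.
print(all([]))  # True  (vacuously true: no counterexample exists)
print(sum([]))  # 0     (the additive identity)

try:
    max([])  # no sensible return value, so it's an error
except ValueError as e:
    print(e)  # e.g. "max() arg is an empty sequence" (message varies by version)
```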

But all of this is still very theoretical, and the biggest question still remains: what practical value is there for allowing map(func) with no iterables? When you want an infinite iterable, why reach for map()?

2 Likes

I didn’t delete anything. I just didn’t quote/answer everything.

That is indeed the intention.

Not at all, and I wasn’t the one who brought that up, and I likely wouldn’t have.

What am I supposed to do when you don’t see what I show?

That’s not even possible, for example in my enumerate2d, because a while loop is a statement, so I can’t use it in an expression.

Based on lots of experience I believe it’s a straightforward “implementation detail” likely to happen with any (or at least most?) implementation that doesn’t artificially or unnecessarily prevent it. I’d like to see your implementation (preferably a Python one, much easier) that can handle an arbitrary number of iterables and doesn’t produce infinite results for the no-iterables case. (Not the one using zip, of course, as that does have an artificial prevention).

Trying to prove myself wrong, here’s a map implementation that truly can’t handle zero iterables:

def map(f, it, *its):
    its = tuple(iter(i) for i in its)
    for x in it:
        args = [x]
        try:
            for i in its:
                args.append(next(i))
        except StopIteration:
            return
        yield f(*args)

But that does prevent the no-iterables case unnecessarily, as it could simply treat the first iterable like any others. Makes it simpler, too:

def map(f, *its):
    its = tuple(iter(i) for i in its)
    while True:
        args = []
        try:
            for i in its:
                args.append(next(i))
        except StopIteration:
            return
        yield f(*args)

And for no-iterables, it is indeed infinite.

Attempt This Online!

It’s been pointed out before that this would be a largely breaking change without benefit. map has a documented requirement of at least 1 iterable, and its purpose is to map arguments to functions. Changing a documented case that errors to be more permissive is in fact breaking, as people can currently rely on the documented error for the no-iterables case in a try/except.

I’d also like to point out that you got the math justification incorrect.

  1. Functions can have domains, and
  2. set theory is the incorrect mathematical parallel

map, zip, itertools.starmap (This one will help demonstrate something else later) are all tools in iterator algebra that are much more closely related to combinatorics than to set theory. I say closely related rather than modeled on, because the idea of an iterator algebra is something that exists more for modeling computational iteration than combinatorics.

itertools.product does not include an empty set when given an empty iterable, as it isn’t its job to be appropriate to domains of math where one should be included for a cartesian product, it’s meant to be equivalent to writing out the nested iterators.

map can be expressed in terms of zip and itertools.starmap

import itertools

def map(func, *iterables):
    return itertools.starmap(func, zip(*iterables))

This isn’t accidental, but a consequence of the purpose and design of each of these functions. Each intentionally and explicitly stops on the shortest iterator and requires some underlying iterator to do work. Everything about all three of these is designed to be lazy in nature and to do the least amount of work necessary. Treating empty as infinite is counter to that purpose in multiple regards.
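A quick illustration of the stop-on-shortest behavior that the built-in map, zip, and itertools.starmap all share:

```python
from itertools import starmap
from operator import add

# Both spellings stop as soon as the shorter input is exhausted.
print(list(map(add, [1, 2, 3], [10, 20])))           # [11, 22]
print(list(starmap(add, zip([1, 2, 3], [10, 20]))))  # [11, 22]
```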

2 Likes

I’m no native English speaker, but I don’t think you can “point out” something that isn’t true.

How is it incorrect? The bit you said didn’t make that clear.

Where should one be included for the cartesian product? That doesn’t sound right.

That’s not even equivalent to the actual map. And again, the only reason that isn’t infinite is because zip is artificially prevented from doing that, with extra code for the sole purpose of preventing it.

I don’t know if this is what they meant, but product() produces an empty iterable, unlike \prod_{i\in\emptyset}X_i resulting in \{\emptyset\}, which has one element that is empty.

1 Like

The part of my post immediately after which you excluded from the quote explains how this would be a largely breaking change, as have other people in this thread. Other people in this thread have also shown how this would not be a benefit. It has been pointed out already.

Again, the part immediately after makes it clear, but to state it again: functions can have domains. The domain of map is over a non-empty set of iterables.

The reason it’s prevented is intrinsically related to the purpose of these functions. There isn’t a useful and obvious purpose to turn finite empty input into an infinite iterator for any of these functions. Their purpose is very clear.

That’s neither what they were talking about (they said “when given an empty iterable”, not “when given no iterables”) nor is it true (it does yield one result, namely the empty tuple, exactly corresponding to \{\emptyset\}).
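For the record, the two cases really do differ in CPython (a quick check, not a spec):

```python
from itertools import product

# No iterables: one result, the empty tuple -- matching the mathematical
# convention that a product over an empty index set has exactly one element.
print(list(product()))    # [()]

# One empty iterable: no results at all.
print(list(product([])))  # []
```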