For any & every

Nineteendo · May 15, 2024, 7:41am

Fine, I reverted the edits in this topic. I hope you’re at least fine with the addition of this:

Oh, I thought you meant it could be hard to implement the full syntax I was describing. So, I was wondering if it would be easier to start with a simplified syntax.

After thinking about this more, it only needs to return a truthy value, so [True] & [] are fine too. I hope this gives it a better chance at succeeding. See the GitHub repository for more information.

Of all the ways to achieve this, there is no notation that is better than the others in every respect.
There is a trade-off between speed and brevity: the more code you use, the faster it gets.

I updated this on the GitHub repository as well.

And as stated previously, replacing all uses of all() & any() where speed matters is not ideal. We’re making the code longer and you can’t inline a for loop or return a value to the outer loop. This is regression instead of progression.
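To make the length argument concrete, here is a minimal sketch of the same membership check written both ways. The data (`maps`, `k`) and the result names are hypothetical and chosen only for illustration.

```python
maps = [{"a": 1}, {"b": 2}]  # hypothetical data, not from the thread
k = "b"

# Expression form: fits on one line and can be nested in a larger expression.
found_any = any(k in m for m in maps)

# Statement form: needs for/else (or a helper function) to produce a value,
# and cannot be placed inside another expression.
for m in maps:
    if k in m:
        found_loop = True
        break
else:
    found_loop = False

assert found_any == found_loop
```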

But it’s surprisingly common as far as I can tell from my basic GitHub search.

Not happening, but you probably already knew that:

Nice Zombies:

itertools.takewhile, but in a list comprehension
result = [value for value in range(stop) while value < 10]
Would be roughly equivalent to:
result = []
for value in range(stop):
    if value >= 10:
        break
    result.append(value)
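For comparison, the loop above can already be spelled with the existing `itertools.takewhile`. This is a minimal sketch, assuming `stop = 20` purely for illustration:

```python
from itertools import takewhile

stop = 20  # illustrative value, not from the original post

# Existing spelling: stop consuming the iterable at the first failing value.
result_takewhile = list(takewhile(lambda value: value < 10, range(stop)))

# Explicit loop from the post, for comparison.
result_loop = []
for value in range(stop):
    if value >= 10:
        break
    result_loop.append(value)

assert result_takewhile == result_loop
```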
Feedback:

Cognitive overload

Uncommon

Intuitive

Breaks duality with for loops

I think that’s even more confusing than the syntax I’m proposing and “until” reads like “while not”.

Fixed, it returns an iterable now.

Is just in time compilation the solution to everything now?

Nineteendo · May 15, 2024, 8:05am

I updated the benchmarks on the GitHub repository. The difference is a bit smaller now, but still significant.

dg-pb · May 15, 2024, 8:14am

I think benchmarks should be done against:

any(True for m in maps if k in m)

which exists and is much more performant for large sizes. And not

any(k in m for m in maps)
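A rough sketch for checking both spellings locally. The data (`maps`, `k`) and the function names are hypothetical, and the timings will vary by machine and Python version. One plausible reason for the difference is that the filtered generator only hands a value back to `any()` when the test passes, instead of once per item.

```python
import timeit

# Hypothetical test data: 100 small dicts, with the key only in the last one.
maps = [{i: None} for i in range(100)]
k = 99


def any_membership(k, maps):
    # Common spelling: the generator yields a bool for every mapping.
    return any(k in m for m in maps)


def any_filtered(k, maps):
    # Filtered spelling: the generator yields only when the test passes.
    return any(True for m in maps if k in m)


# Both spellings agree on hits and misses.
assert any_membership(k, maps) is True and any_filtered(k, maps) is True
assert any_membership(-1, maps) is False and any_filtered(-1, maps) is False

if __name__ == "__main__":
    for func in (any_membership, any_filtered):
        seconds = timeit.timeit(lambda: func(k, maps), number=10_000)
        print(f"{func.__name__}: {seconds:.4f} s for 10,000 calls")
```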

Nineteendo · May 15, 2024, 8:29am

That’s not representative of the code used on GitHub.

<3.2k files with any(True ...) & for
<522 files with all(False ...) & for

Mostly because it’s ugly and not intuitive, but here are the numbers anyway:

1.81-2.16x faster for 1 item
1.44-1.60x faster for 10 items
1.05-1.09x faster for 100 items
No difference for 1000 items

The measurements were already in the expandable details.

Nineteendo · May 15, 2024, 8:32am

Should I put them between brackets? (New post as edits aren’t allowed).

Nineteendo · May 15, 2024, 8:33am

Correction: parentheses.

Nineteendo · May 15, 2024, 8:38am

Can JIT compilation already optimise this?

wannes@Stefans-iMac cpython % main/python.exe -m timeit -s "value = 10" "not(not(value))" && main/python.exe -m timeit -s "value = 10" "bool(value)"
50000000 loops, best of 5: 8.89 nsec per loop # not(not())
10000000 loops, best of 5: 21.5 nsec per loop # bool()
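The same comparison can be reproduced from a script with the `timeit` module. This is only a sketch: the sample values are arbitrary and absolute numbers depend on the build. In CPython, `bool(value)` involves a name lookup and a call, while `not` is an operator, which is consistent with the gap shown above.

```python
import timeit

value = 10

# Both spellings produce the same bool for these sample inputs.
for sample in (0, 10, "", "text", [], [0], None):
    assert (not (not sample)) == bool(sample)

if __name__ == "__main__":
    for stmt in ("not (not value)", "bool(value)"):
        seconds = timeit.timeit(stmt, globals={"value": value}, number=1_000_000)
        print(f"{stmt}: {seconds:.3f} s per 1,000,000 runs")
```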

Nineteendo · May 15, 2024, 8:56am

Hmm, with tuple comprehensions it would be even faster:

2.55x (2.39x) slower for 1 item

1 item
1000000 loops, best of 5: 352 nsec per loop
1000000 loops, best of 5: 352 nsec per loop
1000000 loops, best of 5: 330 nsec per loop
1000000 loops, best of 5: 336 nsec per loop
2000000 loops, best of 5: 138 nsec per loop
2000000 loops, best of 5: 138 nsec per loop

2.29x (1.69x) slower for 10 items

10 items
500000 loops, best of 5: 644 nsec per loop
500000 loops, best of 5: 653 nsec per loop
500000 loops, best of 5: 475 nsec per loop
500000 loops, best of 5: 477 nsec per loop
1000000 loops, best of 5: 281 nsec per loop
500000 loops, best of 5: 282 nsec per loop

2.13x (1.11x) slower for 100 items

100 items
100000 loops, best of 5: 3.39 usec per loop
100000 loops, best of 5: 3.41 usec per loop
200000 loops, best of 5: 1.76 usec per loop
200000 loops, best of 5: 1.76 usec per loop
200000 loops, best of 5: 1.59 usec per loop
200000 loops, best of 5: 1.59 usec per loop

1.73x (1.02x) slower for 1000 items

1000 items
5000 loops, best of 5: 40.3 usec per loop
5000 loops, best of 5: 40.6 usec per loop
10000 loops, best of 5: 23.7 usec per loop
10000 loops, best of 5: 23.7 usec per loop
10000 loops, best of 5: 23.3 usec per loop
10000 loops, best of 5: 23.5 usec per loop

Nineteendo · May 15, 2024, 9:05am

BUT, tuple comprehensions should only be added if there are other use cases.
While the increased performance is nice, it’s not required for this proposal.

Nineteendo · May 15, 2024, 9:26am

As far as I can tell all() & any() are only returned or assigned in 10% of the cases:

Roughly 289k files with return any()/=any() & for
Roughly 226k files with return all()/=all() & for

dg-pb · May 15, 2024, 9:38am

Maybe not, but if people are not using the efficient variant, it is a problem of education, not optimisation.

dg-pb · May 15, 2024, 9:46am

To add: this could contribute to the readability argument, but it is not fair to benchmark against the inefficient variant.

Nineteendo · May 15, 2024, 9:55am

It’s not used anywhere in the stdlib and is undocumented. You can only find this on Stack Overflow if you compare the performance of all(... for ... in ...) / any(... for ... in ...) with a for loop.

What looks the cleanest? I think 2 (not 13) extra characters is a fine price to pay for the best performance:

if all(value in values for values in values_list)
if [value in values for each values in values_list]
if all(False for values in values_list if value not in values)
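The first and third spellings are valid Python today and give the same result. The sketch below wraps them in hypothetical helper functions, next to the explicit loop they stand in for. The second spelling is the proposed syntax and does not run on current Python, so it is left out.

```python
def all_membership(value, values_list):
    # Common spelling: one bool is produced per collection.
    return all(value in values for values in values_list)


def all_filtered(value, values_list):
    # Filtered spelling: False is produced only for a collection that lacks
    # the value, so all() stops at the first miss and returns True when
    # nothing is produced at all.
    return all(False for values in values_list if value not in values)


def all_loop(value, values_list):
    # Explicit loop with early exit.
    for values in values_list:
        if value not in values:
            return False
    return True


values_list = [[1, 2], [2, 3]]  # hypothetical data
for candidate in (1, 2, 4):
    assert (
        all_membership(candidate, values_list)
        == all_filtered(candidate, values_list)
        == all_loop(candidate, values_list)
    )
```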

I think the performance improvement over the most commonly used variant should at least be mentioned. I can put the fastest alternative first though.

dg-pb · May 15, 2024, 10:02am

As long as the fastest available variant is not excluded from benchmarks.

Nineteendo · May 15, 2024, 10:04am

It never was, I just didn’t calculate the performance improvement. I fixed this an hour ago.

Nineteendo · May 15, 2024, 10:08am

I hope you’re fine with not including map() as it’s not generally applicable.

pf_moore · May 15, 2024, 10:38am

Or of efficiency simply not being as important as readability. Not every optimisation is critical - after all, if they were we’d all be writing assembler.

Nineteendo · May 15, 2024, 10:53am

@pf_moore, does this solve your issue?

def func(stop):
    for value in range(stop):
        if value < 0:
            return [True]
    return []

result = func(stop)
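Note that `range(stop)` never yields a negative number, so `func` as written always returns `[]`. A variant over an arbitrary iterable (the name `has_negative` and the sample inputs are assumptions for illustration) shows both outcomes and that only the truthiness of the result is meant to matter:

```python
def has_negative(values):
    # Same shape as func above, but takes any iterable so that both the
    # truthy and the falsy result can be demonstrated.
    for value in values:
        if value < 0:
            return [True]
    return []


hit = has_negative([3, -1, 2])
miss = has_negative(range(5))

# The caller only relies on truthiness, not on the list contents.
assert bool(hit) is True
assert bool(miss) is False
```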

Nineteendo · May 15, 2024, 10:58am

Or at least one of your issues with my proposal.

pf_moore · May 15, 2024, 11:17am

No. The construct is still logically returning a single value, but it looks like it generates multiple values.

As I said, though, I don’t intend to get into further debate on this proposal. I’m -1 on it, and I don’t think it’ll get accepted.