Builtins.any performance

dg-pb · May 13, 2024, 8:20am

Made a bit more progress and have an ok picture now. I tested my idea above regarding the use of partial and map version with only 1 arg, which removes a bit of overhead.

What has been done:

To make use of existing functools.partial I implemented contains_rev function with arguments swapped.
Modified map (named map1), which only accepts 1 iterable. Removing some complexity improved its performance.

Results:

def any_loop_(maps, key):
    for m in maps:
        if key in m:
            return True
    return False

# ------------------------------------
k = 0
maps = [{}] * k + [{k: k}]
%timeit any(k in el for el in maps)             # 611
%timeit any(True for el in maps if k in el)     # 610
%timeit any(map(contains, maps, repeat(k)))     # 358
%timeit any_loop_(maps, k)                      # 159

pred = ftl.partial(contains_rev, k)
%timeit any(map(pred, maps))                    # 160
%timeit any(map1(pred, maps))                   # 146

# ------------------------------------
k = 5
maps = [{}] * k + [{k: k}]
%timeit any(k in el for el in maps)             # 922
%timeit any(True for el in maps if k in el)     # 821
%timeit any(map(contains, maps, repeat(k)))     # 514
%timeit any_loop_(maps, k)                      # 321

pred = ftl.partial(contains_rev, k)
%timeit any(map(pred, maps))                    # 333
%timeit any(map1(pred, maps))                   # 281

# ------------------------------------
k = 10
maps = [{}] * k + [{k: k}]
%timeit any(k in el for el in maps)         # 1300
%timeit any(True for el in maps if k in el) #  906
%timeit any(map(contains, maps, repeat(k))) #  634
%timeit any_loop_(maps, k)                  #  471

pred = ftl.partial(contains_rev, k)
%timeit any(map(pred, maps))                #  456
%timeit any(map1(pred, maps))               #  405

# ------------------------------------
k = 25
maps = [{}] * k + [{k: k}]
%timeit any(k in el for el in maps)         # 2200
%timeit any(True for el in maps if k in el) # 1150
%timeit any(map(contains, maps, repeat(k))) # 1000
%timeit any_loop_(maps, k)                  #  924

pred = ftl.partial(contains_rev, k)
%timeit any(map(pred, maps))                #  930
%timeit any(map1(pred, maps))               #  795

# ------------------------------------
k = 50
maps = [{}] * k + [{k: k}]
%timeit any(k in el for el in maps)         # 4.0 µs
%timeit any(True for el in maps if k in el) # 2.0 µs
%timeit any(map(contains, maps, repeat(k))) # 1.5 µs
%timeit any_loop_(maps, k)                  # 1.6 µs

pred = ftl.partial(contains_rev, k)
%timeit any(map(pred, maps))                # 1.7 µs
%timeit any(map1(pred, maps))               # 1.4 µs

# ------------------------------------
k = 100
maps = [{}] * k + [{k: k}]
%timeit any(k in el for el in maps)         # 6.0 µs
%timeit any(True for el in maps if k in el) # 3.3 µs
%timeit any(map(contains, maps, repeat(k))) # 2.6 µs
%timeit any_loop_(maps, k)                  # 2.9 µs

pred = ftl.partial(contains_rev, k)
%timeit any(map(pred, maps))                # 3.2 µs
%timeit any(map1(pred, maps))               # 2.6 µs

# ------------------------------------
k = 1000
maps = [{}] * k + [{k: k}]
%timeit any(k in el for el in maps)         # 63 µs
%timeit any(True for el in maps if k in el) # 29 µs
%timeit any(map(contains, maps, repeat(k))) # 23 µs
%timeit any_loop_(maps, k)                  # 28 µs

pred = ftl.partial(contains_rev, k)
%timeit any(map(pred, maps))                # 30 µs
%timeit any(map1(pred, maps))               # 26 µs

# ------------------------------------
k = 10_000
maps = [{}] * k + [{k: k}]
%timeit any(k in el for el in maps)         # 645 µs
%timeit any(True for el in maps if k in el) # 284 µs
%timeit any(map(contains, maps, repeat(k))) # 230 µs
%timeit any_loop_(maps, k)                  # 279 µs

pred = ftl.partial(contains_rev, k)
%timeit any(map(pred, maps))                # 304 µs
%timeit any(map1(pred, maps))               # 251 µs

# ------------------------------------
k = 100_000
maps = [{}] * k + [{k: k}]
%timeit any(k in el for el in maps)         # 5.5 ms
%timeit any(True for el in maps if k in el) # 2.8 ms
%timeit any(map(contains, maps, repeat(k))) # 2.3 ms
%timeit any_loop_(maps, k)                  # 2.9 ms

pred = ftl.partial(contains_rev, k)
%timeit any(map(pred, maps))                # 3.0 ms
%timeit any(map1(pred, maps))               # 2.6 ms

Benchmark results:

Suggested approach performs well for very short lists. Loop is still slightly faster for iterables of size lower than 5. (any_loop_ function has extra function call (subtract ~50ns to compare).
Suggested approach performs best for medium size iterables - 5-100. (Or 25-100 if predicate can not be pre-stored)
Suggested approach performs 2nd best for iterables larger than 100.
One main drawback of suggested approach is that the construction of predicate is costly. The times above excludes that cost, which is ~150ns for this particular problem.
Suggested approach successfully addressed the indicated problem of “efficient predicated short-circuiting for iterables lower than size 50 without using loops”. However, for very short sequences loops would still be most optimal.

Suggested improvements:

First step to benefit from these findings would be to improve flexibility of functools.partial. This would allow efficiently constructing predicates from performant functions that exist in the standard library. Especially the ones in operator module. There have been many different attempts to implement more flexible variants of partial, which proves that such functionality is desired. To name a few:
- partiell · PyPI
- partial-apply · PyPI
- partial · PyPI
- better-partial · PyPI
  All of them are implemented in pure python and do not offer desired performance. Proposed changes:
1. A simple placeholder functionality would cover most of the needs.
2. Ability to logically aggregate partial functions would cover 99% of cases. I.e. predicate = partial(isinstance, _, str) | partial(lt, _, 5) & partial(gt, _, 2)
Performance of map can be improved by implementing more basic version with 1 argument. It could also be worthwhile exploring whether switching to simplified behaviour if only 1 iterable was provided could benefit performance of existing map function.

Summary:

The benefit of functools.partial improvement is extended functionality that enables more efficient usage of existing standard library functions (mainly operator module).
Simplified map offers up to 20% better performance for 1-argument cases.
With both improvements implemented, suggested solution to the problem would provide best performance for a certain range of problems and at least 2nd best performance for problem of any size.