`functools.partial` placeholder arguments

dg-pb · May 16, 2024, 8:01am

Although this was my initial goal, I first explored the idea of functools.argorder, to be used together with functools.partial (see Addition: `functools.argorder`). It offers some benefits, but is not suitable for the standard library for various reasons.

So the only thing left is to actually propose a change to functools.partial.

In short, it is an extension of functools.partial that allows positional arguments to be placeholders, so that the positional arguments of a call fill those places first.
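To make the intended semantics concrete, here is a small sketch using only what is available today. The lambda spells out what a partial with a placeholder in the first slot would do; the name `sub_one` is mine and is only for illustration, it is not part of the proposal.

```python
from functools import partial
from operator import sub

# What a placeholder in the first slot would mean, written as a lambda:
# the first call-time positional fills the slot, any others are appended.
sub_one = lambda x, *rest: sub(x, 1, *rest)

# Ordinary partial, for contrast: stored arguments always come first.
one_minus = partial(sub, 1)

print(sub_one(2))    # 1   (2 - 1)
print(one_minus(2))  # -1  (1 - 2)
```

The proposal aims to give the first behaviour without the cost of a Python-level lambda call.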

There are a number of libraries implementing this in pure Python, but none of them are efficient, and efficiency is one of the main drivers in this case. Some of those packages:

Also, this was already proposed in: Functools.partial extension to support specific positional arguments. The main objections were:

Just use lambda for this

What led me here is the fact that the functional toolkit is inefficient for small iterables. So part of the problem this solves is improving on lambda performance, so that functional tools such as map in combination with any can be used for iterables of any size, with confidence that they are among the most performant options. However, it has been shown that for short iterables loops greatly outperform any other method, and there are cases where this is of significant importance.

This has been discussed and laid out in: Builtins.any performance - #29 by dgrigonis

Another argument against was that “Extension would lead to performance decrease”

However, the performance decrease is negligible as can be seen below:

import operator as opr
from functools import partial
from partial2 import partial as partial2

import unittest.mock as utm
_ = VOID = utm.sentinel.VOID


p1 = partial(opr.sub, 1)
p2 = partial2(opr.sub, 1)
p3 = partial2(opr.sub, _, 1)
p4 = partial2(opr.sub, VOID, 1)

print(p1(2))    # -1
print(p2(2))    # -1
print(p3(2))    # 1
print(p4(2))    # 1

%timeit p1(2)   # 48 ns
%timeit p2(2)   # 52 ns
%timeit p3(2)   # 54 ns
%timeit p4(2)   # 54 ns

So performance has not suffered much, and there are still a couple of places left for optimisation.

Implementation

The implementation is straightforward and I have not found any issues with it.
There is one restriction: the number of positional arguments supplied to the new callable must be greater than or equal to the number of placeholders. This ensures that there is no ambiguity when building the joint argument tuple.
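The merging step and the restriction can be sketched in pure Python as below. This is only an illustration of the rule described above, not the actual implementation; `PH` and `merge_args` are hypothetical names chosen for this example.

```python
PH = object()  # hypothetical placeholder sentinel (the post uses `_` / `VOID`)


def merge_args(stored, call_args):
    """Build the joint positional tuple from stored and call-time arguments."""
    n_slots = sum(1 for a in stored if a is PH)
    if len(call_args) < n_slots:
        raise TypeError(
            f"expected at least {n_slots} positional argument(s) "
            f"to fill placeholders, got {len(call_args)}"
        )
    it = iter(call_args)
    # Placeholders are filled left to right; leftovers are appended at the end.
    filled = [next(it) if a is PH else a for a in stored]
    return (*filled, *it)


print(merge_args((PH, 1), (2,)))           # (2, 1)
print(merge_args((PH, 1, PH), (2, 3, 4)))  # (2, 1, 3, 4)
```

Because every placeholder must be filled on each call, there is exactly one way to build the tuple, which is the point of the restriction.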

Use case

from functools import partial
from partial2 import partial as partial2
from hello import ilen2 as ilen

import unittest.mock as utm
_ = VOID = utm.sentinel.VOID


from operator import contains
pred = lambda d: contains(d, 9)
pred2 = partial2(contains, _, 9)


a = [{i: i} for i in range(10)]
ilen(filter(pred, a))       # 1 (only {9: 9} matches)
ilen(filter(pred2, a))      # 1 (only {9: 9} matches)
%timeit ilen(filter(pred, a))       # 784 ns +86%
%timeit ilen(filter(pred2, a))      # 421 ns
# -----
b = [{i: i} for i in range(50)]
%timeit ilen(filter(pred, b))       # 3.76 µs +135%
%timeit ilen(filter(pred2, b))      # 1.59 µs
# -----
b = [{i: i} for i in range(100_000)]
%timeit ilen(filter(pred, b))       # 3.43 ms +100%
%timeit ilen(filter(pred2, b))      # 1.64 ms

And use case which led me here:

def any_loop_(maps, key):
    for m in maps:
        if key in m:
            return True
    return False

# ------------------------------------
from itertools import repeat
from operator import contains

k = 0
maps = [{}] * k + [{k: k}]
# LOOPS                                         N     1      5     50    100   100K
%timeit any(k in el for el in maps)             # 610ns  800ns  3.6µs  6.1µs  5.5ms
%timeit any(True for el in maps if k in el)     # 610ns  790ns  1.9µs  3.4µs  3.0ms
%timeit any_loop_(maps, k)                      # 160ns  265ns  1.6µs  3.0µs  2.8ms
# FUNCTIONALS
pred = lambda m: k in m
pred2 = partial2(contains, _, k)
%timeit any(filter(pred, maps))                 # 190ns  465ns  2.9µs  5.5µs  5.4ms
%timeit any(filter(pred2, maps))                # 160ns  300ns  1.5µs  2.9µs  2.7ms
%timeit any(map(contains, maps, repeat(k)))     # 360ns  400ns  1.4µs  2.6µs  2.2ms

For the problem above, without this addition there was only one performant functional approach: map + repeat. Using lambda was OK for iterables of up to 5 elements, but that is all.

This proposal adds one more performant solution to this problem. Although it is still not as fast as a loop for short sizes, its performance is competitive across all sizes.

However, while this is the problem I was concentrating on, functools.partial is a general utility and this functionality would be applicable to many other currying cases.

dg-pb · May 16, 2024, 11:11am

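from itertools import repeat
from operator import contains
# partial2 and _ are the prototype partial and placeholder from the first post

def any_base(maps, key):
    return any(key in el for el in maps)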

def any_true(maps, key):
    return any(True for el in maps if key in el)

def any_map(maps, key):
    return any(map(contains, maps, repeat(key)))

def any_loop(maps, key):
    for m in maps:
        if key in m:
            return True
    return False

def any_prtl_filt(maps, key):
    return any(filter(partial2(contains, _, key), maps))

┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃                   10 repeats, 10,000 times                          ┃
┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
┃     Units: ns        0        5       10       50      100     1000 ┃
┃               ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
┃      any_base ┃    666     1697     2657    10171    19829   206378 ┃
┃      any_true ┃    688     1560     2415     9128    17319   179685 ┃
┃       any_map ┃    393     1298     2099     8801    16969   178805 ┃
┃      any_loop ┃    165     1116     1947     8559    16713   178538 ┃
┃ any_prtl_filt ┃    306     1171     1995     8859    17465   183653 ┃
┗━━━━━━━━━━━━━━━┻━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
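For anyone who wants to reproduce numbers of this kind, here is a rough harness using only the standard library. It is a sketch, not the script that produced the table above: the `bench` helper is hypothetical, I am assuming the columns are the number of empty maps preceding the match, and the placeholder variant is left out because it needs the prototype.

```python
import timeit
from itertools import repeat
from operator import contains


def any_base(maps, key):
    return any(key in el for el in maps)


def any_map(maps, key):
    return any(map(contains, maps, repeat(key)))


def any_loop(maps, key):
    for m in maps:
        if key in m:
            return True
    return False


def bench(func, n_empty, repeats=5, number=1_000):
    """Best-of-`repeats` time per call in ns; the key is only in the last map."""
    key = n_empty
    maps = [{}] * n_empty + [{key: key}]
    assert func(maps, key) is True
    # The wrapping lambda adds a small constant overhead to every measurement.
    timings = timeit.repeat(lambda: func(maps, key), repeat=repeats, number=number)
    return min(timings) / number * 1e9


if __name__ == "__main__":
    for func in (any_base, any_map, any_loop):
        row = [round(bench(func, n)) for n in (0, 5, 10, 50)]
        print(f"{func.__name__:>10}", row)
```

Absolute numbers will differ by machine and Python version, so only the relative ordering is worth comparing.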

pf_moore · May 16, 2024, 12:29pm

I’ll note that in the thread you linked to @rhettinger was strongly opposed to this idea. As he is the expert on the functools module, you’re going to have a very hard time getting this accepted over his objections.

So anyone interested in participating in this discussion should be aware that it has almost zero chance of being accepted.

@dg-pb as a genuine question, given that you’ve seen the previous thread and are aware of Raymond’s opposition, why are you even proposing this idea again? I see no indication that you have anything new here that might change his mind. Are you unaware of Raymond’s role? Should we be doing more to direct people to actively look for evidence of the relevant core dev expert’s views on their proposal before they post? Do people need to be reminded how to identify which participants in discussions are core developers?

Or are people (this is a more general point - I’m not meaning you here) simply posting in this category without having the most basic understanding of how Python is developed? And if so, what are we doing wrong? Surely Mozilla don’t get people posting in their developer forums saying “hey, why don’t we use Python instead of Javascript in the browser?”

dg-pb · May 16, 2024, 1:49pm

I have made the decision being well aware of @rhettinger’s opposition the last time this was proposed.

This is exactly what this implements.

I addressed a certain part of opposition from back then and I think that this proposal adds new information too:

My research suggests that there is a considerable amount of repetition in implementing such functionality (this was not emphasised last time). At the same time, none of the existing implementations are efficient.
I have working code with benchmarks and a case for performance, which wasn’t the focus last time. The C extension is more than 10x faster than an equivalent pure Python implementation. This also makes it possible to use performant standard library functions efficiently, at a fraction of the cost. Together these can be a powerful toolkit for cases where performance is crucial.
I have found that the implementation is much simpler and more robust than I initially thought. I would dare to guess that the implementation and maintenance cost might have been overestimated at the time.
Finally, this suggestion came out of an effort to improve the performance of a particular problem, so I hope this can be seen as a positive contributing factor to the case.

So I would be happy to receive a sincere reconsideration as I currently genuinely think this could be a valuable addition.

pf_moore · May 16, 2024, 3:57pm

Cool. As I say, I suggest submitting it as a PR and seeing how it goes. Best of luck!

dg-pb · May 18, 2024, 1:34pm

There is a Python version to try out:

https://gist.github.com/dgrigonis/849bbf8988767a4cd63e229ae38e1b72

functools_partial.py

import unittest.mock as utm
from reprlib import recursive_repr
_ = VOID = utm.sentinel.VOID


class partial:
    """New function with partial application of the given arguments
    and keywords.
    """

This file has been truncated.

Also, a fun curry class:

class curry:
    def __init__(self, func, *args, **kwds):
        self._ = c = partial(func, *args, **kwds)

    def __call__(self, *args, **kwds):
        return curry(self._, *args, **kwds)

f = lambda a, b, c: a - b - c
c = curry(f)
print(c(1)(2)._(3))             # -4
print(c(_, _, 3)(_, 2)._(1))    # -4

Also, when using the C implementation:

# Original call
%timeit f(1, 2, 3)      # 75 ns

# Current partial of functools
import functools
p = functools.partial(f, 1, 2)
%timeit p(3)            # 103 ns

# Curry with partial with placeholders
h = c(_, _, 3)(_, 2)
%timeit h._(1)          # 135 ns
# Without `getattr` overhead
h_ = h._
%timeit h_(3)           # 110 ns

chrisgrimm · May 27, 2024, 1:43pm

I’m still a big fan of this as an addition to the language, provided it can be done somewhat efficiently.

dg-pb · May 28, 2024, 11:39am

One case I keep running into where this could be useful:

lines = [s.strip('\n') for s in lines]
# Could be done with (PH being the placeholder)
lines = map(partial(str.strip, PH, '\n'), lines)

blhsing · May 28, 2024, 11:50am

For this particular use case it may be currently done with:

map(methodcaller('strip', '\n'), lines)
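A quick check that the two spellings agree, using only the standard library (the sample `lines` list is made up for the example):

```python
from operator import methodcaller

lines = ["\nfirst\n", "second\n", "\n\nthird"]

# List comprehension and methodcaller produce the same result.
stripped_comp = [s.strip("\n") for s in lines]
stripped_map = list(map(methodcaller("strip", "\n"), lines))

print(stripped_map)  # ['first', 'second', 'third']
```

Note that methodcaller only covers the case where the varying argument is the object the method is called on; a placeholder would be needed when the varying argument sits in some other position.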