`functools.pipe` - Function Composition Utility

sayandipdutta · November 25, 2024, 6:16pm

Yeah, I meant to say the same

Ah, I see.

If a dunder method is introduced, I’d say x.__rcall__ would be the most logical, since it will call f.__call__. But someone could take that to mean f(x) <-> x(f) when f doesn’t define __call__ and x defines __rcall__, which is nonsensical. Somewhat similarly, x.apply sounds like x is being applied; granted, we can document it to mean f(x), but I’m not sure it comes across intuitively. So I would propose the following:

I think __apply__ and __reverse_apply__ make more sense, where x |> f <=> x.__reverse_apply__(f).

dg-pb · November 25, 2024, 6:36pm

Yeah I see your point. If we leave __call__ for “infix application”, then things are messed up.

I agree with __apply__, but “reverse application”, IMO begs for different naming.

How would you name “reversed reverse apply”? __rreverse_apply__? Very long.
And has double “reverse” in it - unnecessary mind bender IMO.

Maybe __insert__ or __input__?
E.g. “Inserting/inputting argument into function”

dg-pb · November 25, 2024, 7:49pm

Gave this one more thought. Something doesn’t sit right.

__input__ and __insert__ just don’t feel right as they are usually used in very different contexts and have little to do with functional programming.

I don’t think it is the case. x.__rcall__(y) == y(x). Default __rcall__ implementation would look like:

def __rcall__(self, f):
    return f(self)

And in practice, __call__ and __rcall__ are all that is needed.

Application

result = f @@ x

# logic
if `f` defines `__call__`:
    result = f.__call__(x)
else:
    result = x.__rcall__(f)

Reverse Application

result = x |> f

# logic
if x defines `__rcall__`:
    result = x.__rcall__(f)
else:
    result = f.__call__(x)

The rest is parser logic, operator precedence and direction (i.e. x |> f |> g - left → right, while g @@ f @@ x - right → left). @@ application is only conceptual and I don’t think there is much value in it, but “reverse application operator” might be useful and its behaviour is well in line with the rest of operators.

So all that is needed to implement |> is __rcall__.
And __rcall__ would have no impact upon usual (), but __call__ would be used as part of |>.

So all this avoids the need to introduce new naming.
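The dispatch rule for reverse application can be tried in today's Python with a plain helper function. This is only a sketch: `reverse_apply` and `Boxed` are hypothetical names introduced here, and `__rcall__` is the proposed dunder, not an existing one. It is looked up on the type, as is usual for special methods.

```python
def reverse_apply(x, f):
    """Emulate the proposed `x |> f`; `__rcall__` is not a real Python dunder."""
    rcall = getattr(type(x), "__rcall__", None)
    if rcall is not None:
        return rcall(x, f)   # x defines __rcall__: x decides how f is applied
    return f(x)              # otherwise fall back to f.__call__(x)


class Boxed:
    """Hypothetical proxy that keeps results wrapped while piping."""

    def __init__(self, value):
        self.value = value

    def __rcall__(self, f):
        return Boxed(f(self.value))


print(reverse_apply([3, 1, 2], sorted))                   # [1, 2, 3]
print(reverse_apply(Boxed(3), lambda v: v + 1).value)     # 4
```

Plain objects fall through to an ordinary call, while `Boxed` intercepts the application, which is the only behaviour `|>` would need from `__rcall__`.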

And personally, in my proxy objects I will just continue using apply as non-dunder for __rcall__.

Reason being:
I just realised that there is no strict rule on whether a method name primarily refers to the object or the argument.

list.sort() means “sort list”
But:
list.insert(x) does not mean “insert list into x”
Same as:
x.apply(f) does not mean “apply x on f”.

dg-pb · May 19, 2025, 9:50pm

@jamsamcam What I gathered at the time is that those who looked at it and closed it misunderstood the proposal. The proposal does not include any operators. It is a class pipe with 3 methods - the very first code snippet in Addition: functools.pipe · Issue #127029 · python/cpython · GitHub.

Furthermore, I asked some questions and never got a response, so I was like oh well, implemented it efficiently for myself and forgot. I have been using it quite a lot and by now I am quite confident that it is a good addition. Also, other languages have the same and everyone is happy.

To resubmit I think what is needed is:

Don’t overload it with examples that no one wants to read and that obscure what the actual proposal is. Just include the concept of the suggested functionality, with supporting textual justifications for the decisions made and the rationale.
Address feedback:
1. Typing - needs to be looked into (the only valid feedback)
2. Performance - the proposal is C implementation - so already addressed
3. DSL - irrelevant (suggested implementation has no operators)
4. Async support - irrelevant

jamsamcam · May 19, 2025, 9:58pm

So I’ve had a read through this and I think the proposal has ended up overly complex and a bit off track.

So here is my more focused suggestion. For the context of why this function would be useful you can read the many examples here Introduce funnel operator i,e '|>' to allow for generator pipelines - #166 by jamsamcam

So if we strip this function to its conceptual core, we are trying to construct an array of partials and wrap them in a function that passes an item through each partial in turn, feeding the result of each step into the next until it is transformed into the final result (basically a bit like reduce)

If we keep it simple and follow convention over configuration we can say that each function will be passed the item as the last element in its list of *args.

We can declare such a function like so

from functools import partial, reduce

def pipe(*funcs):
    return lambda initial_value: reduce(lambda acc, func: func(acc), funcs, initial_value)

item_pipeline = pipe(
   partial(map, lambda a: a.name),
   partial(filter, lambda b: "delete red" not in b),
   list
)

processed_items = item_pipeline(items)

This gives us a pretty good default pipe function using existing concepts from the functools library without massively complex implementations.

Providing a simple implementation like this could be a nice convenience for people who don’t want to declare this in all of their projects or import a complicated pipe library, and it doesn’t feel like a huge cost to functools
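To make the convention concrete, here is a self-contained run of the same pipeline on sample data. The `Item` class and the sample names are hypothetical, added only so the snippet executes.

```python
from dataclasses import dataclass
from functools import partial, reduce


def pipe(*funcs):
    """Compose funcs left to right into a single one-argument callable."""
    return lambda initial_value: reduce(lambda acc, func: func(acc), funcs, initial_value)


@dataclass
class Item:  # hypothetical record type, for illustration only
    name: str


item_pipeline = pipe(
    partial(map, lambda item: item.name),                    # Item -> name
    partial(filter, lambda name: "delete red" not in name),  # drop unwanted names
    list,                                                    # materialise the iterator
)

items = [Item("keep blue"), Item("delete red one"), Item("keep green")]
processed_items = item_pipeline(items)
print(processed_items)  # ['keep blue', 'keep green']
```

Because `partial(map, fn)` and `partial(filter, fn)` leave the iterable as the last positional argument, each step receives the previous result in that slot.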

In that conversation people also mentioned they wished to have the result of the pipe on the right-hand side; for this we can use a context-manager-based object:

from functools import reduce

class pipe:
    def __init__(self, *funcs):
        self.funcs = funcs
        self.result = None
    
    def __call__(self, initial_value):
        """Allows direct invocation like a function."""
        return reduce(lambda acc, func: func(acc), self.funcs, initial_value)

    def __enter__(self):
        """Context manager entry — simply return the result."""
        return self.result
    
    def __exit__(self, exc_type, exc_val, exc_tb):
        """Context manager exit — nothing special needed."""
        pass
    
    def __getitem__(self, initial_value):
        """Allows `with p[6] as result:` syntax."""
        self.result = self(initial_value)
        return self

p = pipe(
    lambda x: x + 2,
    lambda x: x * 3,
    lambda x: x - 1,
)

# As a callable:
print(p(5))  # Output: 20

# As a context manager (subscripting runs the pipe and stores the result):
with p[6] as result:
    print(result)  # Output: 23

In this case we can pretty much get that syntax without declaring that much Python code. I think that if the initial version of this function is restricted to just this, it is powerful enough for most use cases.

Dealing with arguments in different positions etc. can simply be handled by passing in lambdas or partials.

If in the future special Python syntax is added, we can of course extend this object to support it, or indeed expose it as a native C class.

But for now I don’t think the complexity is needed, and it would make it harder to justify including this function.

dg-pb · May 19, 2025, 10:03pm

No, it is simpler than it seems. I have overcomplicated the proposal and people did not understand what it is. In essence it is a bare minimum:

import types

class pipe:
    def __init__(self, *funcs):
        self.funcs = funcs

    def __call__(self, obj, /, *args, **kwds):
        if not (funcs := self.funcs):
            assert not args and not kwds
            return obj
        first, *rest = funcs
        obj = first(obj, *args, **kwds)
        for func in rest:
            obj = func(obj)
        return obj
    
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return types.MethodType(self, obj)

This is all that is needed. These are the bits that would benefit most from C implementation and it is the core functionality. The rest - operators, context managing stuff, etc can be added by inheriting from the above:

class mypipe(functools.pipe):
    # whatever you need

jamsamcam · May 19, 2025, 10:13pm

I think I agree with most of it, except for the fact that it passes the data in the same format to the first function

I think if the behaviour were the same for all functions, then not only would type annotations be easier to implement, it would also be easier conceptually: a user doesn’t have to find the definition of the pipeline to see what the keywords do, or have it suddenly break because the steps change

I know theoretically that can happen if item type changes but I would argue that’s easier to manage

That would at least solve some of the feedback core developers gave you

dg-pb · May 20, 2025, 12:01pm

Well, my reasoning was as follows:

It is a very simple low-cost addition - the signature becomes that of the first function
It is useful for constructs such as:

def add(a, b):
    return a + b
add_plus_1 = pipe(add, partial(add, 1))

So the above could be useful sometimes, but the more useful case:

class A:
    def keys(self):
        return iter([1, 2])
    key_list = pipe(keys, list)

Thus, __get__. This makes it in line with current partial, which can also conveniently be used to partialise methods having __get__.
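Both use cases can be checked with a self-contained copy of the minimal class posted earlier in the thread. This is a pure-Python sketch, not the proposed C implementation, and it uses `add` as the first step of the two-argument example.

```python
import types
from functools import partial


class pipe:
    """Pure-Python stand-in for the proposed functools.pipe (which does not exist today)."""

    def __init__(self, *funcs):
        self.funcs = funcs

    def __call__(self, obj, /, *args, **kwds):
        if not (funcs := self.funcs):
            assert not args and not kwds
            return obj
        first, *rest = funcs
        obj = first(obj, *args, **kwds)      # first step sees the full signature
        for func in rest:
            obj = func(obj)                  # later steps take exactly one argument
        return obj

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self                      # accessed on the class itself
        return types.MethodType(self, obj)   # accessed on an instance: bind it


def add(a, b):
    return a + b


add_plus_1 = pipe(add, partial(add, 1))
print(add_plus_1(2, 3))   # 6


class A:
    def keys(self):
        return iter([1, 2])

    key_list = pipe(keys, list)


print(A().key_list())     # [1, 2]
```

Without `__get__`, `A().key_list()` would call the pipe with no arguments, since a plain callable instance stored on a class is not bound to the instance the way a function is.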

So although this isn’t strictly necessary, given the low cost and the nice alignment with partial it seemed worth it. I have this implemented for myself in exactly the same manner, and if the functools addition didn’t have the same I would be unlikely to use it - but this is just me - maybe for others this is not needed and it would be better to add it without it.

Regarding typing, I don’t think it is a big issue. Although I haven’t done any work on this, I suspect it is much simpler than adding typing support for partial with Placeholder. typeshed/stdlib/functools.pyi at main · python/typeshed · GitHub

XoseLluis · May 22, 2025, 1:55pm

I’m 100% in favor of adding some sort of pipe function to the stdlib: anything from something as simple as this (which I copy/paste into many of my projects)


import functools
from collections.abc import Callable
from typing import Any

def pipe(val: Any, *fns: Callable[[Any], Any]) -> Any:
    def _call(val, fn):
        return fn(val)
    return functools.reduce(_call, fns, val)

to the much more elaborate and powerful options proposed in this post (and that are intended for the creation of “reusable” pipes, not like the above, intended for one-shot usage).

I know part of the Python community had a certain dislike for functional programming, but the world evolves. As other languages have introduced “soft” functional programming features and people work with those languages, when they (me included) use Python they expect to be able to use at least some of those features. Sure each language has its particularities and its own flavor, but things that become common ground in the programming world should be available in all mainstream languages.
The ideal solution to me would be a pipe operator (as proposed in another recent discussion, and which step by step seems to be approaching approval for JavaScript), but that seems like a hard battle to fight.
A pipe function/class would be almost as helpful, and non-disruptive for those not interested in it (or even disgusted by it).
So please, let’s enhance Python’s expressivity with this!

mikeshardmind · May 22, 2025, 7:59pm

Everything you need, including the most accurate possible annotations for it is right here:

# No license should ever be required on this.
# If you're required for inane reasons, you may assume the terms of your choice of:
# Apache-2.0, MIT, or MPL
from collections.abc import Callable, Iterable
from functools import partial, reduce
from typing import Any


def process(val: object, *funcs: Callable[[Any], Any]) -> Any:
    return reduce(lambda x, f: f(x), funcs, val)


def pipeline(*funcs: Callable[[Any], Any]) -> Callable[[Any], Any]:
    return lambda v: process(v, *funcs)


def app_filter[T](predicate: Callable[[T], bool]) -> Callable[[Iterable[T]], Iterable[T]]:
    return partial(filter, predicate)


def app_map[T, R](op: Callable[[T], R]) -> Callable[[Iterable[T]], Iterable[R]]:
    return partial(map, op)

pipeline there makes reusable pipelines, process is one-shot.

As you can see, anyone familiar with functional programming could have written this themselves; the fact that people usually don’t is because, without syntax support, it’s not worth the tradeoffs. This immediately, and by necessity, hampers all but the barest LSP and typechecking info and, in most cases, even in code bases where people are accustomed to not using IDEs or typecheckers, is still less reviewable than code that names each intermediary.

The few remaining cases where this is actually an improvement in some form have a reason not to do this as well: array & tensor libraries can optimize certain things if they are aware of the full pipeline, so they benefit from implementing their own builder patterns.

XoseLluis · May 23, 2025, 1:41pm

Implementing a pipe function is very simple, as shown in the examples. The problem is that there are many possibilities (as shown throughout the discussion): one-shot or reusable, or a class that redefines operators (>>, |…). Even the name is not so clear (pipe, compose, process…), nor is whether to go just left to right or also support right to left.
So adding one of these options to the standard library (functools or wherever) would achieve just that: standardization.
As usual, most people interested in this functionality would adapt to the standard, and a few would roll their own solutions (like the different existing ones right now, for example Pipe21). But we would achieve a good degree of homogeneity.

pf_moore · May 23, 2025, 2:01pm

The contradiction between these two points is the reason it’s hard to get things added to the stdlib. Because something is added to the stdlib, it has to address all of the use cases people have - which means that it basically needs to provide all (or at least most) of those possibilities.

Just adding one option to the stdlib isn’t standardising, it’s claiming that other options aren’t important enough to deserve stdlib support. Sometimes that is the case - but you have to demonstrate it, you can’t just assume it.

dg-pb · May 23, 2025, 2:57pm

There are 2 main types of “pipes”:

1. Feed Pipe

# Custom object
# Standard syntax
result = rpipe(obj)(func1)(func2).obj
# Custom Operators
result = rpipe(obj) | func1 | func2 | UNWRAP
# Or:
RPIPE = rpipe()
result = obj | RPIPE | func1 | func2 | UNWRAP

The issue with adding this to the standard library is that several variations exist.
Also, there are some things that people would like it to do that are a bit tricky.

E.g.:

“feed pipe” can be eager or “lazy”
Some want it to support “inplace” evaluation

# Potential stdlib syntax
result = obj |> func1 |> func2

The issue with custom syntax is that addressing the above becomes even harder. E.g. the rpipe class can take a lazy argument, but where would the above syntax be parameterised?

So my current take is that “feed pipe” is tricky and costly and I would prefer and advise not to rush with things in this direction.
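For reference, an eager "feed pipe" supporting the three spellings shown for the custom object could look roughly like this. The names `rpipe`, `UNWRAP` and `.obj` follow the snippets in this post, but the implementation is only a guess at one possible variation (eager, not lazy, no in-place evaluation); `_MISSING` is a hypothetical helper.

```python
UNWRAP = object()    # sentinel: ends the chain and returns the raw value
_MISSING = object()  # marks an rpipe that holds no value yet


class rpipe:
    def __init__(self, obj=_MISSING):
        self.obj = obj

    def __call__(self, func):
        # Eager: apply func immediately and wrap the result again.
        return rpipe(func(self.obj))

    def __or__(self, func):
        if func is UNWRAP:
            return self.obj
        return self(func)

    def __ror__(self, obj):
        # `obj | RPIPE` wraps obj, provided type(obj) does not handle `|` itself.
        return rpipe(obj)


RPIPE = rpipe()

print(rpipe(" 42 ")(str.strip)(int).obj)            # 42
print(rpipe(" 42 ") | str.strip | int | UNWRAP)     # 42
print(" 42 " | RPIPE | str.strip | int | UNWRAP)    # 42
```

Every variation (lazy evaluation, in-place updates, a different unwrap spelling) would change this class, which is the kind of design spread that makes it hard to pick one version for the stdlib.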

2. Function Composition Pipe

# Standard syntax
pipeline = pipe(func1, func2)
result = pipeline(obj)
# Custom operators
PIPE = pipe()
result = obj >> PIPE | func1 | func2

In comparison to (1) this is much more straightforward.
There are no big nuances in it.
The only variation that I see is whether to allow full signature of the first function.

Also, this is a much more general object which is useful not only for “piping”, but also for general function composition. E.g. map(pipe(int, str), iterable).

Also, with an efficient implementation in the stdlib, users could easily implement operators that provide a fairly good piping syntax.

So after going through another round of discussions in Introduce funnel operator i,e '|>' to allow for generator pipelines - #181 by elis.byberi I am back to my initial position.

implement simple efficient functools.pipe
further extend functools.partial

With (1) and (2) above user can do:

from functools import pipe, partial, Placeholder as _
class mypipe(pipe):
    def __or__(...)...
    def __rrshift__(...)...
PIPE = mypipe()

result = obj >> (PIPE
    | func2
    | partial(zip, *_)          # Needs partial extension!
    | chain.from_iterable
    | sum
    | partial(operator.add, _, 1)    # Already available
)

So without any syntax changes:

Functionality needs are fully covered (after partial allows keyword, *args and **kwds placeholders)
Reasonable performance of all components
Easily implementable piping DSL via operators, which is pretty much as convenient as the one of the “feed pipe”.
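As a rough illustration of how little code the operator layer needs, here is one way the two elided methods could be filled in. Everything below is an assumption made for the sketch: functools.pipe does not exist, so a minimal stand-in `pipe` is defined locally, and the bodies of `__or__` and `__rrshift__` are one plausible choice rather than the author's implementation.

```python
from functools import partial
from itertools import chain


class pipe:
    """Minimal stand-in for the proposed functools.pipe."""

    def __init__(self, *funcs):
        self.funcs = funcs

    def __call__(self, obj):
        for func in self.funcs:
            obj = func(obj)
        return obj


class mypipe(pipe):
    def __or__(self, func):
        # PIPE | f returns a new pipeline with f appended; the original is unchanged.
        return type(self)(*self.funcs, func)

    def __rrshift__(self, obj):
        # obj >> pipeline runs the pipeline on obj.
        return self(obj)


PIPE = mypipe()

result = [[1, 2], [3, 4]] >> (PIPE
    | chain.from_iterable
    | partial(map, lambda v: v + 1)
    | sum
)
print(result)  # 14
```

Note that `obj >> PIPE` only reaches `__rrshift__` when `type(obj)` does not itself handle `>>` for this operand, so it works for lists and ints but may not for types that overload `>>` broadly.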

dg-pb · May 23, 2025, 5:45pm

Also, operator DSL implementation for “composition pipe” does not suffer from needing to do the UNWRAP (as opposed to “Feed Pipe” object), making such syntax satisfactory (at least to me).

And “Function Composition Pipe” can do both:

Construct re-usable pipelines
Imitate “Feed pipe”

dg-pb · May 23, 2025, 6:38pm

If functools.pipe existed, the following would be available off the shelf:

from itertools import batched
from functools import pipe, partial, Placeholder as _
from operator import neg

obj = [1, 2, 3, 4, 5, 6]
result = pipe(
    partial(batched, _, 3),
    partial(zip, *_),  # NOTE: needs partial extension
    partial(map, pipe(sum, neg)),
    max
)(obj)
print(result)    # -5

And if the user wanted a more pipe-like syntax, with a couple of few-line utilities one could have:

obj = [1, 2, 3, 4, 5, 6]
result = obj >> (PIPE
    | batched@P(_, 3)
    | zip@P(*_)  # NOTE: needs partial extension
    | map@P(pipe(sum, neg))
    | max
)
print(result)    # -5

However, the one that I prefer the most is:

# Pre-store pipeline (regardless if one-off or to be reused)
pipeline = pipe(
    partial(batched, _, 3),
    partial(zip, *_),  # NOTE: needs partial extension
    partial(map, pipe(sum, neg)),
    max
)

obj = [1, 2, 3, 4, 5, 6]
result = pipeline(obj)
print(result)    # -5

This pattern, IMO, is suitable for production code.
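The pipeline can be run on current Python without the partial extension by swapping lambdas in for the Placeholder-based partials. A sketch, assuming a minimal stand-in `pipe` class (functools.pipe does not exist) and a simplified fallback for `itertools.batched` on versions before 3.12:

```python
from functools import partial
from itertools import islice
from operator import neg

try:
    from itertools import batched          # Python 3.12+
except ImportError:
    def batched(iterable, n):              # simplified fallback for older versions
        it = iter(iterable)
        while chunk := tuple(islice(it, n)):
            yield chunk


class pipe:
    """Stand-in for the proposed functools.pipe."""

    def __init__(self, *funcs):
        self.funcs = funcs

    def __call__(self, obj):
        for func in self.funcs:
            obj = func(obj)
        return obj


pipeline = pipe(
    lambda it: batched(it, 3),             # stands in for partial(batched, _, 3)
    lambda batches: zip(*batches),         # stands in for partial(zip, *_)
    partial(map, pipe(sum, neg)),          # negate the sum of each pair
    max,
)

print(pipeline([1, 2, 3, 4, 5, 6]))        # -5
```

The batches are (1, 2, 3) and (4, 5, 6), zipping them gives (1, 4), (2, 5), (3, 6), the negated sums are -5, -7, -9, and the maximum is -5.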

pf_moore · May 23, 2025, 7:45pm

I’ll just point out that I couldn’t work out what any of those did. Would you mind writing the equivalent procedural version so that I can see what the intended behaviour was? Thanks.

dg-pb · May 23, 2025, 7:50pm

obj = [1, 2, 3, 4, 5, 6]
batches = batched(obj, 3)
pairs = zip(*batches)
pair_sums = map(sum, pairs)
negative_pair_sums = map(neg, pair_sums)
result = max(negative_pair_sums)
print(result)   # -5

XoseLluis · May 23, 2025, 8:07pm

That’s a real winner for me. We have “reusable pipes” and the “one shot” usage is really neat thanks to “>>”, as being able to provide obj as the first element of the flow seems really important to me.

dg-pb · May 23, 2025, 8:12pm

If this were implemented, it wouldn’t provide operators off the shelf; users would have to add them themselves if they wanted such a DSL.

However, I think it might be good to give a recipe in the docs for one good variation of how to do it.

pf_moore · May 23, 2025, 8:46pm

Thanks. Nesting pipes inside other pipes feels very confusing to me. And I’d avoid something like zip(*batched(obj,3)) like the plague as well - I can work it out, but I have to stop every time to re-analyse it. I certainly wouldn’t consider this example an argument in favour of a pipe function.

Obviously this is a completely artificial example, but in real world code I’d strongly argue for something less “dense”. For personal code that you don’t intend to share, do whatever you like, of course.