Introduce funnel operator, i.e. '|>', to allow for generator pipelines

# No extensions in standard library required:
rpipe(pd.read_csv(...)) | MC.query("A > B") | MC.filter@P(items=["A"], _) | MC.to_numpy | MC.flatten | MC.tolist | map@P(lambda x: x + 2) | list | np.array | MC.prod | rpipe
pd.read_csv(...) |x| MC.query("A > B") |x| MC.filter@P(items=["A"], _) |x| MC.to_numpy |x| MC.flatten |x| MC.tolist |x| map@P(lambda x: x + 2) |x| list |x| np.array |x| MC.prod |x| rpipe |x

# Pure addition of `|>` operator
X = MethodCaller()
pd.read_csv(...) |> X.query("A > B") |> X.filter@P(items=["A"], _) |> X.to_numpy |> X.flatten |> X.tolist |> map@P(lambda x: x + 2) |> list |> np.array |> X.prod

# Yours
pd.read_csv{...} |> .query{"A > B"} |> .filter{items=["A"], ?} |> .to_numpy |> .flatten |> .tolist |> map{lambda x: x + 2, ?} |> list |> np.array |> .prod

Using only |> addition in combination with available utilities

(pd.read_csv("my_file")
    |> X.query("A > B")
    |> X.filter@P(items=["A"], _)
    |> X.to_numpy().flatten().tolist()
    |> map@P(lambda x: x + 2)
    |> list
    |> np.array
    |> X.prod()
)
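
For anyone who wants to try the shape of this today, here is a rough sketch that emulates the chain without new syntax. It is not the proposed implementation: `pipe` stands in for `|>`, `functools.partial` stands in for `@P`, and this `MethodCaller` is a hypothetical recorder written only for illustration. Unlike some of the examples above, it assumes method calls are always written with explicit parentheses (`X.strip()`, not `X.strip`). Plain strings are used instead of pandas to keep it self-contained.

```python
from functools import partial


class MethodCaller:
    """Hypothetical helper: records attribute accesses and calls for later replay."""

    def __init__(self, ops=()):
        self._ops = ops

    def __getattr__(self, name):
        # Only reached for names not found normally, e.g. X.strip
        return MethodCaller(self._ops + (("attr", name),))

    def __call__(self, *args, **kwargs):
        # X.strip() records a call instead of performing one
        return MethodCaller(self._ops + (("call", args, kwargs),))

    def _replay(self, value):
        # Apply the recorded chain to the piped value, step by step
        for op in self._ops:
            if op[0] == "attr":
                value = getattr(value, op[1])
            else:
                value = value(*op[1], **op[2])
        return value


def pipe(value, *steps):
    """Stand-in for `value |> step1 |> step2 ...` (assumed left-to-right semantics)."""
    for step in steps:
        if isinstance(step, MethodCaller):
            value = step._replay(value)
        else:
            value = step(value)
    return value


X = MethodCaller()

result = pipe(
    " 3, 1, 2 ",
    X.strip().split(","),   # method chain on the piped value
    partial(map, int),      # plays the role of map@P(int)
    sorted,
    sum,
)
print(result)
```

The point of the sketch is only that a plain binary operator plus a recorder object and a partial are enough to express the chain; the ergonomics are what the syntax would improve.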

Nice. However the ? would not come with the |> operator if we follow through with the component approach… it would have to be part of a “new lambda”. PS. I think we should keep placeholders in this example even if they don’t make sense - to maintain generalizability.

To me the lack of MethodCaller and @P is a clear win but YMMV of course. We should also benchmark against the regular

_ = ...
_ = ...(_)

sequence I guess. Character count matters.

I am not sure what that specific case is supposed to do. In |> .filter{items=["A"], ?}, where does the value get piped to? Is it the object on which the method is called, or is it passed as the second argument of filter?

I removed ? from that place in that post. I think it is a simple method call given your previous examples.

It’s a dummy example. Should come up with a better one.

Ok, I put them back and added partial to that place. As long as we know that that specific place doesn’t make sense and it is only there to show a placeholder in partial.

BTW. Operator alone looks increasingly viable with the ____ shorthands etc. Keeping the partials high-level and having to manage symbols like that is not as tidy as I would like - but it could be a start.

Let’s put placeholder in a place that makes sense after all…

(pd.read_csv("my_file")
    |> X.query("A > B")
    |> X.filter(items=["A"])
    |> X.to_numpy().flatten().tolist()
    |> map@P(lambda x: x + 2)
    |> list
    |> np.array@P(_, order='K')
    |> X.prod()
)

Could you put together an example with argument placeholder(s), keyword argument placeholder(s), star expression placeholder(s) and double star expression placeholder(s) mixed in different sequences? I think it is not necessarily intuitive in which order those would be passed to the resulting call.

Also, do we need ___ = functools.KwdsPlaceholder? Can’t we just have X.whatever(keyword=functools.Placeholder)?

Similarly some magic should be possible to be able to do X.whatever(*functools.Placeholder) and X.whatever(**functools.Placeholder) no? Could Placeholder have conversions to list / tuple and dict respectively? Then the tuple / dict should contain a special marker.

This is for a double star expansion. And those are just my initial scribblings, not the final concept. If I were to implement it, a lot of polishing and rethinking would be needed.

So, as I said, these are only my latest scribblings, but the latest logic that I have in mind is as follows:

_ = Placeholder
__ = StarPlaceholder
___ = DoubleStarPlaceholder

def foo(*args, **kwds):
    return args, kwds

1 |> foo@P(_, 2)      # ((1, 2), {})
[1, 2] |> foo@P(__)   # ((1, 2), {})

1 |> foo@P(2, a=_)    # ((2,), {'a': 1})
{'a': 1, 'b': 2} |> foo@P(___)    # ((), {'a': 1, 'b': 2})

I am still working out what exactly would happen when all of them are present:

foo@P(_, _, 1, __, ___, a=_, b=_, c=1)

But for piping there is always one and only one placeholder, so not worth going into this too much now.

I see what you are aiming at, and for sure - actually using * as if it was a call would be ideal.
But I think we need a realistic case. It is unclear what this should do:

X.whatever(*functools.Placeholder)

For piping it is either:

foo@P(__)
# or
X.whatever(<complete set of arguments>)
# as X is placeholder here

I’d like to have this:

_ = Placeholder

def foo(*args, **kwds):
    return args, kwds

1 |> foo@P(_, 2)      # ((1, 2), {})
[1, 2] |> foo@P(*_)   # ((1, 2), {})

1 |> foo@P(2, a=_)    # ((2,), {'a': 1})
{'a': 1, 'b': 2} |> foo@P(**_)    # ((), {'a': 1, 'b': 2})

For *_ it’s possible - just checked. **_ looked problematic at first, but it turns out to be easy as well.

Nice one! Indeed:

class Placeholder:
    ...
    def __iter__(self):
        return iter((StarPlaceholder,))

Yup, might be possible to find a way for this too. But ** is used much less often - wouldn’t be tragic if there was no solution as such.

Tell me, please? :)

>>> class X(dict):
...   def __init__(self):
...     super().__init__()
...     self['_'] = 'DoubleStarPlaceholder'
...   def __iter__(self):
...     yield 'StarPlaceholder'
...
>>> x=X()
>>> dict(x)
{'_': 'DoubleStarPlaceholder'}
>>> list(x)
['StarPlaceholder']
>>>
>>> def fn(*args, **kwargs):
...   return (args, kwargs)
...
>>> fn(*x, **x)
(('StarPlaceholder',), {'_': 'DoubleStarPlaceholder'})

Unfortunately, _ is a valid argument name… A clash is very unlikely to occur, but a solution with that caveat is unlikely to be accepted into the stdlib.

def foo(_=1):
     pass

True, but that’s not the crux of the solution. It can be a dict containing DoubleStarPlaceholder as any value. The key can be None or "" or anything else that does not work as an argument name. The interpretation should then be done by your partial.

Ok, this works. The key must be a string, so None does not work. So that is on the table:

In [81]: def foo(**kwds):
    ...:     return kwds
    ...:

In [82]: foo(**{'': 1})
Out[82]: {'': 1}

I agree. The |> syntax looks too much like an operator for it to not behave like one.

I think you’re going to get a lot of resistance to anything that’s not simple ASCII. In general, anything that is too punctuation heavy doesn’t look natural in Python.

Personally, I’m fine with |>, but I’d want it to be a normal operator, so I could mix it with method calls the way @petercordia showed. I’d prefer a syntax that could do |> map(~ + 2) or map(lambda x: x + 2) rather than map(lambda x: x + 2, ~).

I’m quite impressed by the basic |> proposal, but at the moment it feels like a prototype, and the final changes needed to make it fit well with the language are going to be the difficult part. I don’t particularly like the direction @dg-pb is going in - for me, that’s moving away from the elegance and simplicity a proposal like this needs to be successful.

I am just trying to keep this as simple as possible, i.e. a binary |> operator that behaves like any other, with nothing particularly special about it. The rest of the chaining needs would be satisfied by improving utilities, without further syntactic changes.

And yes, it is definitely not as elegant as if it was made into a larger specialised construct, but given relative cost, to me, this doesn’t look like a particularly bad direction (keeping in mind that various parts of it can be improved later - further syntactic conveniences, optimizations, new utilities, etc…).

So given all that has been considered here, I would likely be open to extending partial in parallel to |> being implemented as a simple binary operator.


However, that said, I think by now @sadaszewski and others involved have digested my POV - I am open to exploring other directions. And if there are better ones - that’s great. There are surely ways to make this more elegant - I am just not sure what sort of complexity such additions would require.

I meant real-life examples, either from CPython or from projects in the wild. These should demonstrate whether the new syntax is truly syntax sugar, making code easier and more readable, or just another syntax without significant benefit.

>>> lst = [1, 2, 3]
>>> for index, item in lst |> enumerate():
...     print(index, item)

>>> lst = [1, 2, 3]
>>> for index, item in enumerate(lst):
...     print(index, item)

Which of these examples is more readable?