Introduce funnel operator, i.e. '|>', to allow for generator pipelines

Added some more use cases. Any comments so far?

A native pipeline syntax is uniquely suitable for building conditional pipelines and/or pipelines that can short-circuit (i.e. terminate early depending on a condition). For example:

x |> (_ ** 2 if _ <= 10 else (_ |> (_ + 1) |> _ ** 3 |> (_ / 2)))

or more realistically:

x |> (
  (
    _["content"] |> [ x["text"] for x in _ if "text" in x ] |> _[0]
      if len(_) > 0
        else None
  ) if isinstance(_.get("content"), list) else
    (
      _["content"] 
    ) if isinstance(_.get("content"), str) else
      None
)

It should be more flexible than match..case, for example when matching items in the middle of a list. More importantly, it can be nicely decomposed.

handle_list = |> (
    _["content"] |> [ x["text"] for x in _ if "text" in x ] |> _[0]
      if len(_) > 0
        else None
  )
handle_str = |> _["content"]

x = {'content': [{'text': 'foo'}]} 

x |> (
    handle_list(_) if isinstance(_.get("content"), list) else
    handle_str(_) if isinstance(_.get("content"), str) else
    None
)

>>> 'foo'

The fact that it is an expression as opposed to the match..case being a statement doesn’t hurt - they can be complementary.

The PEP doesn’t say what the parameters to __pipe__() are (they even vary between examples!). That in turn makes it hard to tell from the def __pipe__() examples what makes it any more than just another operator to overload.

Minus the deep learning case, these don’t help to see a contrast between something that’s impossible/messy to write without the operator but clean/easy with it. I’d try doing more with |> vs without |> comparisons, making sure to use the cleanest/most readable form you can for your without code, so that we readers aren’t sitting here thinking, well, if I just rewrote that some other way then I wouldn’t need the operator after all.

3 Likes

Nobody should use “Pythonic” to make arguments for or against features, just as nobody should make arguments which rest on shallow interpretations of the Zen.
These are useful shorthands in casual conversation, but highly subjective and imprecise.

“Pythonic” just means “has good vibes”. Usually, when it comes up, there’s an idea behind it which can be clearly stated without jargon, but requires more effort to elucidate.

The Python community and maintainers typically prize clarity and legibility very highly. Brevity is a secondary goal, and is useful to pursue when it can enhance readability. Beyond that, I don’t think there are pithy and easily stated principles. (The Zen is pretty good, but too many people stop thinking when they quote it.)

I’m not bothered, but perhaps a bit concerned for you when you raise the question of “what is Pythonic?” in this way. There is no underlying “truth” of what that word means, and if you see your task in proposing this feature as shifting that (illusory) definition, I think you will spend a lot of wasted time and effort.

Regarding these examples, I would say that the complexity you are showing here in some of these is not very helpful – actually, on the contrary, harmful. For example, large ternary if expressions are a red flag because they bury branching logic within expressions. Are they necessary to the examples? Could they be replaced with simple named functions? High cognitive complexity mixed with a new feature makes evaluation of the proposal harder.

When I see pipelines in R, the input is some simple data object – maybe the raw text of a CSV – and the output is some useful analysis – perhaps a data frame with statistics. Perhaps you could find similar Python code and demonstrate the usefulness of pipes in a simpler context?

I also find it very concerning that only one of the examples, “use case 2”, appears to be sourced from real existing code, and that analysis declares an improvement where I, as a reader, see an inferior result. The original had very clear control flow (linear series of method calls) and each expression is small and readable. Perhaps there’s some real improvement in there, but it’s extremely hard to identify, even with a very generous reading.


Most people have dropped off of this thread. I think most participants have given as much feedback and devoted as much volunteer time to this topic as they’re willing to do right now.

You should take a bit of time to address the most substantive criticisms in this thread before posting more. I’d rather come back to this topic and read significant progress all at once than get updates as minor changes are made, and I suspect that this feeling is shared by some other readers.

8 Likes

The variations are only due to lack of synchronization between examples. The examples with the most arguments (__pipe__(self, rhs, rhs_noinject, last, name, unparsed)) are the ones that reflect the current state of the implementation.

Indeed, I should stress that what makes __pipe__() absolutely unique are the parameters. rhs and rhs_noinject represent the right-hand side (respectively with / without parameter injection), which has not been evaluated yet. This is the core difference compared to other operators, which can also act on the RHS but where any deferral must be done manually (using partial or lambda), whereas here it is automatic. Furthermore, last informs whether the current call corresponds to the last stage of the pipeline. name is set to the corresponding identifier if a named expression is used at the given stage of the pipeline, None otherwise. Last but not least, unparsed is a string representation of the RHS. It can be parsed using ast.parse and enables all sorts of customizations.
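To make the contrast with today's operators concrete, here is a minimal sketch of the manual deferral that ordinary operator overloading requires (the Pipe class and the use of | are invented for illustration; they are not part of the proposal). Each stage must already be a one-argument callable, because a plain operator receives its RHS fully evaluated, which is exactly what rhs/rhs_noinject would avoid:

```python
# Illustrative sketch only: emulating a pipeline with an existing
# operator, where deferral of each stage must be done by hand.
class Pipe:
    def __init__(self, value):
        self.value = value

    def __or__(self, stage):
        # 'stage' must already be a callable of one argument; unlike the
        # proposed __pipe__(), we never see the RHS unevaluated.
        return Pipe(stage(self.value))

result = (Pipe(3) | (lambda _: _ + 1) | (lambda _: _ ** 2)).value
print(result)  # 16
```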

This makes the operator extremely powerful. To which use cases the syntax actually fits is another story. I agree that the Deep Learning example is the most convincing one so far, not exclusively because of its organic origin but because it simply expresses the story better after the rewrite (IMHO). Surprisingly, it is not a case where we gain more compactness. That’s a useful observation, I guess. Clarity and legibility before brevity, as mentioned by @sirosen .

Good point. I need more examples sourced from real code.

1 Like

Since Codon already has the |> (and ||>) operator, and is otherwise extremely similar to Python, it seems like the most logical place to look for good examples is in Codon code bases. Unfortunately I have not been able to find any projects written in Codon.

To me, rhs, rhs_noinject, last, name, and unparsed are all hard to remember.
Why not do:

class MyClass:
    def __pipe__(self, fn: Callable, *args, **kwargs):
        return fn(*args, **kwargs)

Where fn is the provided static function and *args and **kwargs are their call args.
For the injection, we could replace the placeholder (e.g. _) with self.

This way the class can iterate over the args or kwargs to find self if there is a need for more complex implementations.

I understand that the explicit placeholder placing is slower to write, but it skips the need to remember which one is implemented.

This also assumes we don’t follow the auto-lambda, so the scope could be reduced.

Also, for this to be more useful, I believe object should implement a naive version of __pipe__ like in the previous example, and ideally there should be a custom error message when an exception is raised during a pipeline.

As for a non-ML example of real code, this is from a personal project of mine; not sure if it counts, but:

Non-ML real code converted

For context, this is a handshake with a new user over using two passwords with the signal protocol

Original:

from datetime import datetime
import logging

import bson

from encryption.add_bundle_to_store import add_bundle_to_store
from encryption.utils.decrypt_with_password import decrypt_with_password

from db.peers import Peers
from db.utils.add import _session_add


async def add_new_peer_bundle(
        session_maker,
        peer_bundle: bytes,
        store,
        *,
        own_password: str,
        other_password: str,
) -> None:
    logging.debug("Adding new peer bundle")

    now = datetime.now()

    own_decrypted_bundle = decrypt_with_password(
        peer_bundle,
        own_password,
    )
    decrypted_bundle = decrypt_with_password(
        own_decrypted_bundle,
        other_password,
    )
    deserialized_bundle = bson.loads(decrypted_bundle)

    new_peer = Peers(
        **deserialized_bundle,
        checked_time=now,
    )

    await _session_add(session_maker, new_peer)

    add_bundle_to_store(store, deserialized_bundle)

Funneled:

from datetime import datetime
import logging

import bson

from encryption.add_bundle_to_store import add_bundle_to_store
from encryption.utils.decrypt_with_password import decrypt_with_password

from db.peers import Peers
from db.utils.add import _session_add


async def add_new_peer_bundle(
        session_maker,
        peer_bundle: bytes,
        store,
        *,
        own_password: str,
        other_password: str,
) -> None:
    logging.debug("Adding new peer bundle")

    now = datetime.now()

    new_peer = (
            decrypt_with_password(
                peer_bundle,
                own_password,
            )
        |> decrypt_with_password(
                _,
                other_password,
            )
        |> bson.loads(_)
        |> Peers(
                **_,
                checked_time=now,
            )
    )

    await _session_add(session_maker, new_peer)

    add_bundle_to_store(store, deserialized_bundle)

I wouldn’t say it is a big improvement.
To me (personal opinion), it makes the code a bit harder to read, but it makes it easier to understand that the steps for creating a new user are to decrypt the user data twice and then deserialize it.

The RHS is not necessarily a function. Internally it is wrapped in a lambda that takes _ as its only parameter, in order to lend itself to treatment by __pipe__(), but any *args and **kwargs would have been captured by that time already.

I am not sure I am following this at all.

This is not what is expected of a pipeline operator. __pipe__() does not control where the injection happens; it happens as the last positional argument of the leftmost call (if present) on the RHS.

We can play with names but what these are, are actually composable pipelines and IMHO are an integral part of a package that goes beyond “syntax sugar”.

??? No clue what this is about. Could you explain with an example?

This is just badly formatted.

async def add_new_peer_bundle(
        session_maker,
        peer_bundle: bytes,
        store,
        *,
        own_password: str,
        other_password: str,
) -> None:
    logging.debug("Adding new peer bundle")

    now = datetime.now()

    new_peer = (
        peer_bundle
            |> decrypt_with_password(password=own_password)
            |> decrypt_with_password(password=other_password)
            |> bson.loads()
            |> Peers(**_, checked_time=now)
    )

    await _session_add(session_maker, new_peer)

    add_bundle_to_store(store, deserialized_bundle)

looks much better. If Peers could take a dictionary (which I think it should, why use kwargs here?) you could even leave out the **_.

Although you can squeeze things onto one line in the given example, you shouldn’t assume that you can do so for an arbitrary (useful) function call.

Keeping lines short is a pretty well established element of readability, with some implications for accessibility as well. (See also: universal design.)

If a function has four or five parameters, you’ll naturally run out of space and it will need to wrap.
What is your preferred formatting style for a pipeline including functions which wrap lines?

In case you want to use a different placeholder than self (like self.value), the code in __pipe__ could iterate over the args to replace all references to self.

Like

class MyClass:
    value: float

    def __pipe__(self, fn: Callable, *args, **kwargs):
        args = tuple(
            arg.value if arg is self else arg
            for arg in args
        )
        kwargs = {
            kw_key: (
                kw_value.value if kw_value is self else kw_value
            )
            for kw_key, kw_value in kwargs.items()
        }
        return fn(*args, **kwargs)

I will admit I am not 100% sure what you mean, but in case I did guess correctly:
Peter Suter mentioned a table where they compare many implementations of the operator, most of them having the implicit argument being the first one:

So I believe that, for a programmer who uses Python as a secondary language, it would not be obvious whether the implicit placeholder goes first or last, hence a forced explicit placeholder being an option to explore.

Sorry for the bad wording, I meant that Python’s primitives should implement a minimal __pipe__ so that you can do basic out of the box stuff like 3 |> str # '3'

This is subjective. I use linters that enforce PEP8, so more often than not one-line calls are not possible for me.

Edit: Now I’ve realized you’ve removed the explicit placeholder. I agree with you that IMO the implicit placeholder looks cleaner, but I also think it can become a bigger learning barrier because of the reason stated before.

Edit 2: I like that you start with peer_bundle |>. I didn’t think about that, and even if it is just a detail, I like it more that way. I agree that part is a bit more convoluted than it should be in my code.

This is the adapted code, so I tried to minimize changes. As I imagine it, those unpacked kwargs would end up in the kwargs of the call of __pipe__.
Also Peers can take a dict because it is a dataclass and the dict is deserialized in the step before.

In the current implementation this is just not necessary. Now I would be hard pressed to find a use case for what you are proposing. With that said you can achieve that by calling ast.parse(unparsed) and handle the RHS manually. The current implementation just does not assume that the RHS is a function call, so there is no place for *args and **kwargs in this logic.

I don’t think other implementations should influence such a decision. So far, to me it seemed more practical and a better fit to existing functions in Python (map, filter) to inject it as the last argument. You can always achieve injection as the first argument by passing all other arguments as keywords. It would not work the other way around. To be perfectly fair though - we should probably chart out more use cases before making a definite decision here.

BTW. the leftmost call on the RHS is for example the call to map() in this example, as opposed to the call to pow() which will not get the injection:

[1, 2, 3, 4, 5] |> map(lambda x: pow(x, 2))
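For comparison, here is my reading of the stated rule sketched in today's Python: the LHS is injected as the last positional argument of the leftmost call, map(), not the inner call to pow(), so the example above would desugar to:

```python
# Plain-Python equivalent of the pipeline example, assuming the
# "last positional argument of the leftmost call" injection rule.
result = list(map(lambda x: pow(x, 2), [1, 2, 3, 4, 5]))
print(result)  # [1, 4, 9, 16, 25]
```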

IMHO, a forced placeholder is a downgrade in all possible ways. It doesn’t have to be obvious, especially to newcomers and people with biases; it just has to be properly documented. That the meaning of the injection rules should be obvious - with this I can agree, but not that the rules should become known magically just by looking at the symbol.

You can already do that (3 |> str(), that is). Please check it out.

It should be obvious, then, that visually a pipeline works best with short calls. I would strive to keep them that way.

Implicit placeholder is practically a defining feature of a pipeline. A forced one is superfluous, annoying to read, and distracts from the logic.

:+1:

Got it. Many thanks for the great feedback.

I would consider the same options as for method chaining.

Personally, I think I might lean towards something like:

new_peer = (
    peer_bundle
        |> decrypt_with_password(
            password=own_password)
        |> decrypt_with_password(
            password=other_password,
            one_more_param=one_more_param,
            yet_another_param=still_one_more,
            the_last_param=the_really_last_param)
        |> bson.loads()
        |> Peers(**_, checked_time=now)
)

Something a bit more fleshed out and with as little vertical spread as possible - let the indentation do the work. The difference to PEP8 is small but to my eyes it looks WAY better. What would be your preference? PEP8 or something new?

I wouldn’t even make comparisons to or think about PEP8.

IMO it’s important that you think about and formulate an opinion, as the proposer, more so than it is important what in particular it is. Function calls can span multiple lines and if a feature can’t compose “well” with that, it’s dead in the water.
I was concerned by the fact that you fixed an example to be more readable by squishing things into single lines – that you might not have a plan for how multiline calls should look.

I like black’s and ruff-format’s rule that when arguments expand over multiple lines, the trailing paren does so as well.


While we’re on the subject of style, I’ll raise a related item regarding readability:

The removal of an explicit placeholder value for the pipe output is incredibly destructive, to the point that I cannot imagine this proposal succeeding with that behavior included. I assume that if this ever moves to the stage of being a PEP, that would get removed, but it’s so severe that I suspect it’s one of the things which will make it hard or impossible to find a core dev to sponsor this.

The assumption that pipe output should be the first or last argument to a call is a huge one to bake in at the language level, and it basically blows up linting in all kinds of ways since an ast.Call node is not guaranteed to contain a valid suite of arguments to the called function.

Plus it introduces confusion and ambiguities for a human reader when seeing foo().bar() regarding which function gets the implicit argument. I don’t particularly care that there is a clear rule for the interpreter to follow – the human who reads and writes this is now being asked to learn and memorize a special case rule for something as fundamental as calling a function.

I would rather we just bound a name as a local. I think _ is a worse choice than the PIPE I suggested back in October, but I’d much rather see _ get bound than nothing.

Calls like bson.loads() look for all the world like errors, and if we stick with _ as the bound name, you are literally saving one character for a very, very costly decision.


I think pipelines are a cool tool and, if done right, might make it into the language. But to me that implicit argument passing stuff is a deal breaker, and I’d hate to see a useful feature not advance for that reason alone.

2 Likes

_ conflicts with both existing common patterns for i18n and for unused variables in unpacking. While the latter conflicting might not be a big deal, the former is.

An explicit value sentinel similar to functools.Placeholder would be ideal here rather than an implicitly bound name.
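For the sake of discussion, here is what a value-sentinel pipeline could look like when emulated in current Python. The PIPE sentinel and pipe() helper below are invented for illustration; functools.Placeholder is the analogous sentinel that functools.partial gained in Python 3.14:

```python
# Hypothetical sketch: an explicit sentinel marking where the piped
# value is substituted, in the spirit of functools.Placeholder.
PIPE = object()  # unique marker; has no meaning outside a pipeline

def pipe(value, *stages):
    for fn, *args in stages:
        # replace the sentinel with the running value at call time
        value = fn(*[value if a is PIPE else a for a in args])
    return value

result = pipe(
    "  42  ",
    (str.strip, PIPE),   # "  42  " -> "42"
    (int, PIPE),         # "42" -> 42
    (pow, PIPE, 2),      # 42 -> 1764
)
```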

That, or syntax at the beginning of the pipeline that chooses the name of the pipe

3 Likes

Good point. I was thinking that this would be normal name shadowing, but if the name is fixed it plays havoc on the _() convention.

Playing this out, what is the value of that sentinel though?
e.g. PIPE is a name for builtins.PIPE, that’s the sentinel. What happens if I print(PIPE) outside of a pipeline?

But maybe you’re right that the name shouldn’t be implicitly bound. In that case, should it always be explicitly bound?
e.g.

plot = data |> PIPE \
  |> parse(PIPE) |> filter_nulls(PIPE) \
  |> normalize(PIPE, scale=(0,1)) \
  |> plot_against(grid, PIPE)
png_render_file(plot, "cool_map.png")

That initial variable name reads a little like a heredoc “EOH”, which is not bad.

1 Like

Fair enough.

Here, this approach does not seem to look as nice as keeping the closing paren on the same line as the last argument. IMHO it makes the whole construct look like a weird, brittle mix of symbols. Maybe the presence of the explicit placeholder here adds to the problem. Too many symbols are made too visible, so that they might help someone who has trouble noticing symbols or memorizing conventions, but they annoy and distract (from the semantics) someone who is acutely aware of their presence through decades of practice.

Line breaking compared

With _ as placeholder:

new_peer = (
    peer_bundle
        |> decrypt_with_password(
                _,
                own_password
            )
        |> decrypt_with_password(
                _,
                other_password,
            )
        |> bson.loads(_)
        |> Peers(
                **_,
                checked_time=now,
            )
    )

With PIPE as placeholder:

new_peer = (
    peer_bundle
        |> decrypt_with_password(
                PIPE,
                own_password
            )
        |> decrypt_with_password(
                PIPE,
                other_password,
            )
        |> bson.loads(PIPE)
        |> Peers(
                **PIPE,
                checked_time=now,
            )
    )

With PIPE as placeholder and closing paren one line earlier:

new_peer = (
    peer_bundle
        |> decrypt_with_password(
                PIPE,
                own_password)
        |> decrypt_with_password(
                PIPE,
                other_password)
        |> bson.loads(PIPE)
        |> Peers(
                **PIPE,
                checked_time=now)
    )

With PIPE as placeholder and closing paren one line earlier and opening paren on the same line as first arg:

new_peer = (
    peer_bundle
        |> decrypt_with_password
                (PIPE,
                own_password)
        |> decrypt_with_password
                (PIPE,
                other_password)
        |> bson.loads(PIPE)
        |> Peers
                (**PIPE,
                checked_time=now)
    )

As above with parens on separate lines:

new_peer = (
    peer_bundle
        |> decrypt_with_password
                (
                    PIPE,
                    own_password
                )
        |> decrypt_with_password
                (
                    PIPE,
                    other_password
                )
        |> bson.loads(PIPE)
        |> Peers
                (
                    **PIPE,
                    checked_time=now
                )
    )

That is true about the ast.Call node, but you would first be parsing the ast.Pipeline node, which sits on top and captures the calls, and therefore you would have an opportunity to handle them accordingly (i.e. actually perform the injection OR tag the ast.Call for further processing, indicating that it is subject to auto-injection). You could even infer what is being injected from the LHS in the same way any type inference is done currently. Here is the AST of the example above:

The AST
Module(
    body=[
        Assign(
            targets=[
                Name(id='new_peer', ctx=Store())],
            value=Pipeline(
                left=Pipeline(
                    left=Pipeline(
                        left=Pipeline(
                            left=Name(id='peer_bundle', ctx=Load()),
                            right=Call(
                                func=Name(id='decrypt_with_password', ctx=Load()),
                                args=[
                                    Name(id='PIPE', ctx=Load()),
                                    Name(id='own_password', ctx=Load())])),
                        right=Call(
                            func=Name(id='decrypt_with_password', ctx=Load()),
                            args=[
                                Name(id='PIPE', ctx=Load()),
                                Name(id='other_password', ctx=Load())])),
                    right=Call(
                        func=Attribute(
                            value=Name(id='bson', ctx=Load()),
                            attr='loads',
                            ctx=Load()),
                        args=[
                            Name(id='PIPE', ctx=Load())])),
                right=Call(
                    func=Name(id='Peers', ctx=Load()),
                    keywords=[
                        keyword(
                            value=Name(id='PIPE', ctx=Load())),
                        keyword(
                            arg='checked_time',
                            value=Name(id='now', ctx=Load()))])))])

It’s not much different from the convention that arguments without defaults must come before those with default values in a function declaration, OR that positional arguments MUST precede keyword arguments in a function call. The whole language is a set of rules - some more special than others. Here, it is obvious and instantaneously clear that the injection goes to the leftmost call (i.e. foo()). We could run a competition with all sorts of expressions and I could literally indicate to you in <0.5s, for each case, where the injection happens. With that said, if this was truly to be a blocker, I’d be fine with letting it go. My impression however is that use cases are the most burning issue for the core devs.

The current implementation DOES bind _ - it does NOT change anything about _ at the parser level. I haven’t seen _( in any of the pipeline examples so far so I don’t see how it would interfere with i18n detecting those. Do you mean using i18n in the pipelines? If so, it’s not any different than currently using _( elsewhere where _ is assigned to.

Out of context, maybe. But preceded by |> it’s clear where the seemingly missing argument comes from. Realistically, how long can such a state of confusion last once a programmer is told what |> means?

That’s great to hear. I am of the same opinion and I am sure we can find a middle ground. It would be easier if the PEP could contain alternatives for the core devs to consider rather than settle on choices made under assumptions as to what would be tolerable for the core devs. Keeping the alternatives in Rejected Ideas is not good enough, as it suggests that they were rejected by the community, whereas we would be rejecting the implicit placeholder in large part for the sake of the core devs if I understand correctly?

That’s exactly the assumption we are making. Do we KNOW that it would not be advancing because the core devs do not approve of this feature? Or is it really the majority of the community who wants an explicit placeholder? Most (almost all?) languages (R, Julia, F#, Elixir, OCaml, etc.) use an implicit placeholder. It is very much a defining feature of what is meant by a “pipeline” these days. It saves a lot of typing and redundancy. An explicit placeholder is something relatively unique and IMHO would have to add tangible value to justify deviating from a proven (and preferable to some) canon. Are there any use cases other than subjective clarity to this approach? Can you do something with an explicit placeholder that you cannot do without one?

BTW, to be clear. You still CAN use an explicit placeholder. 1 |> print() and 1 |> print(_) are doing the same thing in the current implementation. It’s when the explicit placeholder is MISSING that the injection behavior kicks in. Does that help?

Rejected Ideas is for things rejected by the PEP authors.

Authors have to lead the discussion, and ideally also be receptive to feedback.

“The community” is a nebulous thing. Nobody can know what “the community” wants. We debate from our various perspectives and try to capture that in a document. Most of the user community isn’t even in the discussion – we just do our best.

I don’t know all of these particular languages, but several of them (e.g., Julia) emphatically do not do any implicit argument injection. The consumers of pipelines are uncalled functions, and they are invoked on pipe output.

It’s not 1 |> print(), but 1 |> print.
The difference between those two is very significant.
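The Julia-style semantics described here can be emulated in current Python: each stage is an uncalled callable applied to the running value, so no injection rule is needed at all (the pipeline() helper is invented for illustration):

```python
from functools import reduce

def pipeline(value, *fns):
    # apply each uncalled function to the running value, left to right
    return reduce(lambda acc, fn: fn(acc), fns, value)

result = pipeline(3, str, len)  # str(3) == '3', len('3') == 1
```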

_ is classically the name given to the internationalization function in many projects. So that particular name has a special meaning and preventing its use would be problematic.


I think I’ve said my piece at this point. I feel like it’s not landing, which is disappointing, given that a case which I think is inherently ambiguous is being called “obvious and instantaneously clear”.

I’ll just leave a few examples of nasty cases here with the note that the point is not “how clear is it to a subject matter expert” but instead “how clear is it to a novice”.

x |> (f(), g())
x |> f(g())
x |> f(g)()
x |> f()()
x |> (f := g())()
x |> [f() for f in funcs][0]()
x |> y[f()](g)()
x |> (f() * g(),)
x |> f() if g() else h()

I think that might be all of the help I can offer. I hope you can refine this to a point where it can advance.

2 Likes

+1

This is a good argument. For anyone curious who, like me, didn’t know about the i18n one: the gettext API conventionally binds ‘_’ as the translation function for i18n strings.

I can get behind a predefined PIPE, but this could be onto something. I think having it be a ‘fake step’ is not the best option IMO. Since the currently discussed operator has two characters, I thought about having the variable name in between (like functions with parentheses) or adding a third character, but I think it could be confused with other operators (data | PIPE > parse(PIPE) with the greater-than, data |> PIPE > parse(PIPE) with the greater-than, data | PIPE |> parse(PIPE) with the or, data ~ PIPE |> parse(PIPE) with the binary not). Maybe the closest could be something like data $ PIPE |> parse(PIPE), which has a bit of precedent in string.Template. This could have the option to ‘rename’ or to have multiple temporary variables, if we deem it readable and useful:

plot = data $ PIPE \
  |> parse(PIPE) |> filter_nulls(PIPE) \
  |> normalize(PIPE, scale=(0,1)) \
  |> plot_against(grid, PIPE) $ RESULT \
  |> fake_step(PIPE, RESULT)
png_render_file(plot, "cool_map.png")

Though I find it less readable IMO

It’s been my feeling as well.

I don’t think the operator is a bad idea, but it definitely is not easy to get right, especially since I agree with you that it can be very ambiguous in many parts of the theoretical operator.

1 Like

A PEP ultimately has to propose a single alternative. The purpose of the ideas thread is to engage with as many members of the community (including core devs) as possible and converge on a single proposal that the community can stand behind. But I feel as though in this case, the discussion is more about you defending your vision of what the feature should look like, rather than adapting to feedback. That’s fine, if you’re sufficiently sure that your vision is the right solution, but it does increase the risk that the SC won’t agree with you (simply because there’s already evidence that people have different views, and the SC are no different in that regard).

Not in the slightest. We would be rejecting the implicit placeholder because that’s the choice that the community feels is best for Python. It may not be your preference, but designing a new language feature is always an exercise in compromise.

As a community member, I find @sirosen’s arguments against the implicit placeholder to be compelling - Python does not have partial function application like functional languages do, so the implicit placeholder doesn’t fit naturally with other language structures like it does in functional languages. That’s a personal view though, not some sort of proclamation of my “core developer opinion” - there’s no core dev gatekeeping going on here, just a community trying to put together a proposal[1].
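As a concrete illustration of that point (my own sketch, with an invented decrypt function): in a language with currying, decrypt(password) would already be a one-argument function awaiting the piped value, whereas Python has to spell the partial application out explicitly:

```python
from functools import partial

def decrypt(data, password):
    # stand-in for a real decryption routine, for illustration only
    return data.replace(password, "")

# explicit partial application: 'stage' now awaits only the data
stage = partial(decrypt, password="xyz")
result = stage("axyzb")  # -> "ab"
```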

Not at all. The core devs have no say in this, beyond the fact that a PEP needs one core dev to sponsor it. What matters is the SC’s decision, ultimately, and one of the factors they will take into account is the community consensus around the proposal. So at the moment, the biggest concern for the acceptance of the PEP is that you’re not managing to get community consensus that your proposed design is the right approach (specifically around implicit argument passing, but possibly in other areas as well).

While I’m commenting, I’ll pick up on this point. First, it’s not about “for core devs” (again). It’s about the fact that having clear use cases is what makes for a good proposal. You’re not “ticking boxes” on some sort of checklist here, you’re being given advice and trying to follow it. There’s lots of things that make for a good proposal. Use cases are one, consensus is another. You need to address them all.

And second, the use cases you’ve added to the PEP still seem weak to me. I’ve only skimmed them, so I may be missing something, but while they are great examples of the power of the proposal, which is a good start, they don’t seem to include any examples of real-world code that would be improved by the proposal. To give an example, use case 6 (picked completely at random):

        a, b, c, d, e = (None,) * 5
        extract(self) |> (a, b, c, d, e)
        a += 1
        b += 2
        c += 3
        d += 4
        e += 5
        update(self) |> (a, b, c, d, e)

I can’t imagine ever bothering to rewrite code that is currently written:

        self.a += 1
        self.b += 2
        self.c += 3
        self.d += 4
        self.e += 5

in that form, especially if I had to develop and maintain the extract and update classes that are included in the example. So this use case doesn’t give me any intuition of where real-world code would benefit from the addition of this feature. Sure, it’s a neat example of the power of the pipeline operator, but power without applications is wasted.
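For what it’s worth, the kind of left-to-right chaining the operator targets can already be approximated with a small helper function in today’s Python, which is partly why concrete real-world motivation matters so much here. The `pipe` name and the example data below are illustrative only, not part of the proposal:

```python
from functools import reduce

def pipe(value, *funcs):
    """Thread value through funcs left to right: pipe(x, f, g) == g(f(x))."""
    return reduce(lambda acc, f: f(acc), funcs, value)

result = pipe(
    "  Hello, World  ",
    str.strip,
    str.lower,
    lambda s: s.replace(",", ""),
)
print(result)  # hello world
```

A proposal needs examples where this plain-function spelling is clearly worse, not just different.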

That’s my current view as well. I like the idea of a powerful pipelining operator, but it’s hard to come up with a design that feels natural, addresses real needs, and doesn’t interact badly with existing conventions and coding styles. I wish we could make progress here, but as long as the discussion is taking the form of @sadaszewski defending the existing design against concerns from community members, I think we’re at a bit of an impasse :slightly_frowning_face:


  1. The only difference that my being a core developer makes is that I might have a better insight into what a successful proposal looks like, just from the fact that I’ve had more experience following the PEP process than many people have ↩

9 Likes

Totally agreed with Paul. In theory, pipelining is useful and fun. I’m a big fan of Linux shell pipes. I use them a lot, nearly every day. And I love Python’s syntax and design.

But the current proposal for Python pipes doesn’t feel viable, and I’m strongly -1 on it.

2 Likes

Thank you for your kind feedback. I am grateful for the revived interest and the dedicated time.

OK, let’s gather the community feedback on the following issues:

  1. Right-hand side
  • Only a call, e.g. func()
  • Only a callable, e.g. func
  • An arbitrary expression, e.g. [ x ** 2 for x in PIPE ]
0 voters
  2. Placeholder
  • Implicit (injected) placeholder when explicit placeholder absent, i.e. func() gets the injection treatment whereas func(_) doesn’t
  • Implicit (injected) placeholder always, i.e. func(_) gets an extra _ anyway and is transformed into func(_, _)
  • Predefined explicit placeholder, e.g. _ or PIPE mandatory on the right-hand side
  • Configurable explicit placeholder, e.g. (PIPE := x) |> func(PIPE) |> func2(PIPE) (the named expression becomes mandatory on the left-hand side and defines the variable that will be auto-updated)
0 voters
  3. If a predefined and/or default placeholder, which symbol would you choose among these two?
  • _
  • PIPE
0 voters
  4. If a configurable placeholder, which way to configure it among these?
  • (PIPE := x) |> func1(PIPE) |> func2(PIPE)
  • x |PIPE> func1(PIPE) |> func2(PIPE)
  • (x as PIPE) |> func1(PIPE) |> func2(PIPE)
  • x $ PIPE |> func1(PIPE) |> func2(PIPE)
0 voters
  5. If the injection behavior is kept, at which position should it happen?
  • As the first positional argument
  • As the last positional argument
0 voters
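To make the placeholder options concrete, here is one possible hand-desugaring of each style into today’s Python. These transformed forms are my assumptions about the intended semantics, not anything the PEP specifies, and the names (func, PIPE) are illustrative only:

```python
# Hypothetical desugaring of the placeholder styles being polled.
x = 3

def func(a, b=10):
    return a + b

# Implicit injection when no explicit placeholder is present:
#   x |> func()       would become  func(x)
implicit = func(x)

# Explicit placeholder on the right-hand side:
#   x |> func(_, 10)  would become  func(x, 10), with _ bound to x
explicit = func(x, 10)

# Arbitrary expression on the right-hand side, with a predefined placeholder:
#   x |> [i ** 2 for i in range(PIPE)]
PIPE = x
squares = [i ** 2 for i in range(PIPE)]

print(implicit, explicit, squares)  # 13 13 [0, 1, 4]
```

Seeing the desugared forms side by side may help when voting on which right-hand sides and placeholders should be allowed.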

Let’s keep the selection narrowed down to these choices as an exercise in finding alignment. Does that make sense? Looking forward to the results - I hope there are many votes. :crossed_fingers:

2 Likes