Uniform Function Call Syntax (UFCS)

petercordia · August 13, 2024, 7:59pm

Uniform Function Call Syntax allows you to write
x.f(*args, **kwargs) instead of f(x, *args, **kwargs).

I do recognise that it’s not going to get implemented in the manner above, because it would be a breaking change (people are bound to have (ab)used the pattern try: object.method(); except AttributeError: do_stuff() ), and because it would make it too easy to write awful unreadable code (because if you see x.f() it could be any of object_attribute, class_method or actual_function, in addition to possible definitions in an inheritance hierarchy, there’d often be too many places to look).

But there are a lot of use cases where UFCS would be really nice.
It’s not just me that thinks so. That’s why Scala, Nim and D have implemented UFCS, and there are 2 projects on GitHub that implement UFCS for Python. (Uniform Function Call Syntax (UFCS) for Python · GitHub and GitHub - witer33/justmagic).

UFCS enables a programming pattern which I really like that looks something like this:

def public_function(arg1, arg2, arg3):
  return load_thing(arg1)
    .first_transformation()
    .second_transformation(arg2)
    .third_transformation(arg2, arg3)
    .fourth_transformation()

which looks a lot neater to me than either

def public_function(arg1, arg2, arg3):
  tmp = load_thing(arg1)
  tmp = tmp.first_transformation()
  tmp = second_transformation(tmp, arg2)
  tmp = third_transformation(tmp, arg2, arg3)
  tmp = tmp.fourth_transformation()
  return tmp

or

def public_function(arg1, arg2, arg3):
  return third_transformation(
    second_transformation(
      load_thing(arg1).first_transformation(),
      arg2
    ),
    arg2, arg3
  ).fourth_transformation()

Additionally, it is a small annoyance to me every once in a while that I have to write len(thing) instead of the more natural thing.len().

There’ve also been two requests on the forum in the last month that could have been eased by UFCS:
Allowing parameters in-between function names, whose author would have been able to express their function as this_set.has_subset_in(set_list). And
Prefix shorthand for sending args by using `->`, whose author proposed writing x -> f instead of x.f(). That doesn’t feel as intuitive to me, but there’s the same desire to write functions after their primary argument (as you do with methods) rather than before.

questions

supposing we couldn’t change the behaviour of ., but we could elect another symbol, what symbol do you think would work well? My best idea is .. but that’s a little ugly. ! is too shouty, : could lead to ambiguity, .: is a little funny…
let’s call the symbol ▼ for now, which I don’t think is a valid python symbol. What would you think of

def public_function(arg1, arg2, arg3):
  return load_thing(arg1)
    .first_transformation()
    ▼second_transformation(arg2)
    ▼third_transformation(arg2, arg3)
    .fourth_transformation()

?

Would it be preferable if x▼f() evaluates to x.f() if x has a method f, or if it always evaluates to f(x)? The latter might be simpler to work with, but the former would allow for lovely clever trickery.
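To make the two candidate semantics concrete, here is a minimal sketch in current Python. The helper names call_function_always and call_method_first are hypothetical and only emulate what a ▼ operator might do; nothing like them exists in the language.

```python
def call_function_always(x, f, *args, **kwargs):
    """Emulates x▼f(...) as always meaning f(x, ...)."""
    return f(x, *args, **kwargs)


def call_method_first(x, f, *args, **kwargs):
    """Emulates x▼f(...) preferring a method named like f, if x has one."""
    method = getattr(x, f.__name__, None)
    if callable(method):
        return method(*args, **kwargs)
    return f(x, *args, **kwargs)


def upper(value):
    # A free function that happens to share its name with str.upper.
    return f"upper({value!r})"


print(call_function_always("abc", upper))  # upper('abc')
print(call_method_first("abc", upper))     # ABC
print(call_method_first(42, upper))        # upper(42)
```

Under the method-first rule, the same spelling silently picks a different callee depending on the type of x, which is the "clever trickery" in question.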

I would use (pseudo-)UFCS if it was available in Python
I would not use (pseudo-)UFCS if it was available in Python
I would hate other people to be able to use (pseudo-)UFCS if it was available in Python


dg-pb · August 13, 2024, 9:32pm

I fell into the poll-trap, so now I need to reply.

I would most likely use it, but I don’t think that this needs to be part of python.

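# functools.Placeholder requires Python 3.14 or later
from functools import Placeholder, partial

class P:
    def __init__(self, func, *args, **kwds):
        # Reserve the first positional slot for the piped-in value
        self.func = partial(func, Placeholder, *args, **kwds)

    def __ror__(self, other):
        return self.func(other)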

import operator as opr
def public_function(arg1, arg2, arg3):
    return (arg1 | P(opr.add, arg2)
                 | P(opr.sub, arg3))


print(public_function(1, 2, 3))    # 0
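Since functools.Placeholder is only available from Python 3.14, the same pipe idea can be sketched for older interpreters by storing the arguments and prepending the piped value at call time. This is one possible variant, not the code posted above; it assumes the left operand's type returns NotImplemented for `|` with an unknown object, as int does.

```python
import operator as opr


class P:
    """Pipe helper: `value | P(func, *args)` calls func(value, *args)."""

    def __init__(self, func, *args, **kwds):
        self.func = func
        self.args = args
        self.kwds = kwds

    def __ror__(self, other):
        # Invoked for `other | self` when other's type cannot handle `|`.
        return self.func(other, *self.args, **self.kwds)


def public_function(arg1, arg2, arg3):
    return arg1 | P(opr.add, arg2) | P(opr.sub, arg3)


print(public_function(1, 2, 3))  # 0
```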

yoavdw · August 13, 2024, 9:48pm

I’m not going to comment on the idea itself, at least not right now, but:

Generally, I don’t think ideas like this benefit from a poll. It provides limited options with no nuance, generalizing very different opinions in the same bucket - creating an inaccurate image of people’s actual opinions on the matter.

In addition, the poll being at the start of the thread makes it a lot easier for people to vote and then not actually read the rest of the discussion - which I think is not healthy on changes this big, where all the advantages/disadvantages are not always immediately clear.

petercordia · August 13, 2024, 10:11pm

So far I’m very happy with the poll. @Rosuav was able to let me know that he absolutely hates the idea, and it’s actually for a reason I respect rather than “It’s not worth it” or “It’s too hard to implement”, and it didn’t cost him any time at all. And it doesn’t make me feel bad, whereas if he had typed out a poorly thought-out argument about why he dislikes it (as one does when one is in a rush) it would have made me feel bad.

@dg-pb meanwhile thinks it’s not worth implementing, but he’d still use it if it was implemented. That’s something that makes me happy to know.

This is an unfinished idea, not a PEP. The poll only says what the poll says; it’s not supposed to be a vote about whether this feature should be implemented. It’s just gathering information that I think is genuinely useful.

petercordia · August 13, 2024, 10:11pm

It might even be possible to find a way to bind it to .. or |. rather than | P(...) which are options I’d find satisfactory for private use. Unfortunately that wouldn’t come with IDE integration, and I wouldn’t dare use such custom syntax for professional use where my colleagues have to be able to read it too. I recognise that these desires are in themselves significant costs that would be foisted onto IDE developers and software developers. That’s why I’m trying to have a conversation about how useful various people would find (pseudo-)UFCS, and not trying to convince people it’s a good idea.
The class P you propose would work (cumbersomely), but could you actually imagine it being used in a public library?
Whereas if it was part of the default syntax, I think it would be used.

NeilGirdhar · August 13, 2024, 10:16pm

There are a lot of recent ideas that exhibit this argument: “wouldn’t it be nice if we could write code like this also”. But code is easier to read the fewer ways there are to write it. Alternatives are not a benefit. “There should be one-- and preferably only one --obvious way to do it.”

It would be different if a pattern is significantly clearer or simpler, mitigates errors, or is more expressive. And you can dig through recent additions (e.g., the match statement, which is more expressive; or the generalized unpacking, which is simpler than its alternative) to see examples of that.

Also, calling every intermediate variable tmp is a missed opportunity to give them meaningful names that serve as self-commenting code.

Rosuav · August 13, 2024, 10:17pm

That’s not exactly what I was expressing, but here’s the thing: You’re free to interpret the poll results ANY WAY YOU LIKE because there simply isn’t enough information in them.

In actual fact, I posted that poll answer more by accident than anything else, but I can’t find a way to retract a vote, so it stands.

Polls suck.

petercordia · August 13, 2024, 10:39pm

that’s such a thought-stopper

There was a way to merge dicts: dict(**dict_one, **dict_two). Now we have dict_one | dict_two. Has this broken the principle? Arguably; dict(**X, **Y) is still viable sometimes. Yet I think it’s still an improvement.

There are already two ways of writing the composite function. The imperative style

def public_function(arg1, arg2, arg3):
  tmp = load_thing(arg1)
  tmp = tmp.first_transformation()
  tmp = second_transformation(tmp, arg2)
  tmp = third_transformation(tmp, arg2, arg3)
  tmp = tmp.fourth_transformation()
  return tmp

and the functional style

def public_function(arg1, arg2, arg3):
  return third_transformation(
    second_transformation(
      load_thing(arg1).first_transformation(),
      arg2
    ),
    arg2, arg3
  ).fourth_transformation()

both in my current workplace, and in AI where I worked previously, the imperative style is common. True, we don’t call the variable tmp. It’s usually net or df or answer or something else. The name doesn’t really matter to me. Recycling the name is a common pattern.

There’s a conflict because on the one hand functional programming is better than imperative programming, but on the other hand this particular way of functional programming is a lot harder to read than the imperative alternative. I think that might be why the convention is to recycle the name of the temporary variable. Because we do think about it as functional programming, even if that’s not what we write.

My proposal would not add yet another way to write this correctly, because the old functional style would become incorrect. One could argue the old imperative way of writing it would also be deprecated. In which case UFCS would cause there to be One Obvious Way to write a composite function, where now there are 2.

dg-pb · August 13, 2024, 10:46pm

Personally, I am pretty happy with this.

ZeroIntensity · August 13, 2024, 10:57pm

partial is an interesting way to implement this, but traditionally, chained methods in Python are implemented like this:

from typing import Any, Self  # typing.Self requires Python 3.11+

# This isn't a PEP 8 name -- ignore that, I'm just replicating the example
class load_thing:
    def __init__(self, arg: Any) -> None:
        self.arg = arg

    def first_transformation(self) -> Self:
        # ...
        return self

    def second_transformation(self, arg1: Any) -> Self:
        # ...
        return self

    def third_transformation(self, arg2: Any, arg3: Any) -> Self:
        # ...
        return self

    def fourth_transformation(self) -> Self:
        # ...
        return self

I find this trivial enough, I think a syntax change could be encouraging the wrong behavior!

bwoodsend · August 13, 2024, 11:51pm

Presumably that was done for symmetry with set(). I’d have voted against the new way if I had the chance.

For me, it’s three things:

I never want to lose the visual distinction between a method belonging to a given class and a random function.
Based on experiences in Java (where this style of writing is trendy) I can tell you that chaining multiple statements into one statement ruins diagnostics. If third_transformation() failed then the stacktrace will point to the return load_thing(arg1) line. If you’re using a debugger and want to put a breakpoint between second_transformation() and third_transformation(), you can’t because they’re the same statement and even share a line number as far as the interpreter is concerned.
This style encourages people to write functions which modify in-place then return their input just to avoid breaking method chains. The return value makes you wrongly assume that you’re getting a modified copy which of course is a confusing logic bug waiting to happen.
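The third point can be illustrated with a small sketch. The two functions below are hypothetical and exist only to contrast a chain-friendly in-place function with one that returns a real copy.

```python
def add_suffix_in_place(items, suffix):
    # Mutates the list it was given, then returns it to allow chaining.
    for i, item in enumerate(items):
        items[i] = item + suffix
    return items


def add_suffix_copy(items, suffix):
    # Leaves the input alone and returns a new list.
    return [item + suffix for item in items]


original = ["a", "b"]
result = add_suffix_in_place(original, "!")
print(result is original)  # True
print(original)            # ['a!', 'b!']

original = ["a", "b"]
result = add_suffix_copy(original, "!")
print(result is original)  # False
print(original)            # ['a', 'b']
```

Both calls look identical at the call site, so nothing tells the reader that the first one has also changed original.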

NeilGirdhar · August 13, 2024, 11:56pm

This is a great example. I’m glad you brought it up. dict(**x, **y) is not a good way to merge mappings in general: it requires string keys, raises on duplicates, doesn’t respect the types of the mappings, and cannot be overridden.

x | y is nearly always the right way to merge mappings. It’s easier to read, idiomatic, and nearly always does what you want.
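A short sketch of two of those limitations, using throwaway example dicts (the dict | operator needs Python 3.9 or later):

```python
x = {"a": 1}
y = {"a": 2, "b": 3}

# The union operator lets the right-hand value win on duplicate keys.
print(x | y)  # {'a': 2, 'b': 3}

# Keyword unpacking into dict() rejects duplicate keys.
try:
    dict(**x, **y)
except TypeError as exc:
    print("duplicate key:", exc)

# Keyword unpacking into dict() also rejects non-string keys.
try:
    dict(**{1: "one"}, **{2: "two"})
except TypeError as exc:
    print("non-string key:", exc)

print({1: "one"} | {2: "two"})  # {1: 'one', 2: 'two'}
```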

You’re welcome to think that, but I don’t think it’s true when writing Python code. Idiomatic Python is imperative. That is the “obvious” way to do it.

I understand your dream of changing the Python world to functional programming, but I personally think it’s undesirable and unrealistic.

dg-pb · August 14, 2024, 12:14am

What about {**X, **Y}?

NeilGirdhar · August 14, 2024, 12:19am

That doesn’t respect mapping types and can’t be overridden. I think x | y is the idiomatic approach.

elis.byberi · August 14, 2024, 12:25am

Peter Gerlagh:

def public_function(arg1, arg2, arg3):
  return load_thing(arg1)
    .first_transformation()
    .second_transformation(arg2)
    .third_transformation(arg2, arg3)
    .fourth_transformation()

That would make sense if all functions are applied to a single object. Otherwise, it would be difficult to keep track of which function accepts which variable type.

That being said, you can just pass the object reference:

def public_function(arg1, arg2, arg3):
    load_thing(arg1)
    first_transformation(arg1)
    second_transformation(arg1, arg2)
    third_transformation(arg1, arg2, arg3)
    fourth_transformation(arg1)

BrenBarn · August 14, 2024, 1:53am

One question here is what happens if you do the attribute access but not the function call. This is legal Python:

func = obj.attr
func()

If obj is some arbitrary object and I do func = obj.print, what is func? Is it some object that’s like a bound method or functools.partial object? There’s probably a reasonable way to do it, but the point is that x.f(*args, **kwargs) isn’t really “function call syntax” in Python; it’s two separate operations, an attribute lookup and then a function call. So any proposal like this has to explain what happens for both parts.
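As a sketch of the two steps in today's Python: the class Greeter and the function shout are hypothetical names for illustration, and the functools.partial line is only one guess at what a UFCS-style lookup of a free function might have to produce.

```python
import functools


class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self, greeting):
        return f"{greeting}, {self.name}"


obj = Greeter("world")

# Step 1: attribute lookup alone produces a bound method object.
bound = obj.greet
print(bound.__self__ is obj)  # True

# Step 2: the call happens later, and separately.
print(bound("hello"))  # hello, world


def shout(greeter, greeting):
    # A free function, not defined on Greeter.
    return f"{greeting.upper()}, {greeter.name.upper()}!"


# A UFCS-style obj.shout would need to yield something like this object.
emulated = functools.partial(shout, obj)
print(emulated("hello"))  # HELLO, WORLD!
```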

Aside from that, I don’t see a huge benefit to this. You can already do method-chaining if the objects are written to support it (i.e., they return self or some appropriate new object that supports the next chaining operation). If they’re not written to support it, UFCS only helps if they’re written to support taking the result of one method as the first argument of the next call. So the functions in the chain still have to support it in some sense, by having the right argument in the first position.

ajoino · August 14, 2024, 7:23am

OoT, but what do you mean by “it cannot be overridden”?

Python 3.12.4 (main, Jun  8 2024, 18:29:57) [GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> a = {"a": 1, "b": "string"}
>>> b = {"a": True}
>>> a | b
{'a': True, 'b': 'string'}
>>> {**a, **b}
{'a': True, 'b': 'string'}

afaik they are equivalent (the union syntax is actually equivalent to calling update on a copy of a according to PEP 584, things could have changed since then), but maybe some type checkers can’t deal with the “old union” syntax. But that is a type-checker problem and not a problem with the syntax per se.

More in line with the discussion here, PEP 584 gives a rationale for why the dict union syntax is an improvement: more or less, the existing ways of merging dicts are either obscure (dict.update), modify a dictionary in place (dict.update again), which requires extra code that cannot be expressed in a single expression, or are not obvious in what they do ({**a, **b}). I’m not sure UFCS would be an improvement to the language since Python has methods which you can use to accomplish the same thing, and personally I dislike the method-chaining style of code so I wouldn’t use it if it was added.

blhsing · August 14, 2024, 7:41am

Peter Bierma:

traditionally, chained methods in Python are implemented like such:

# This isn't a PEP 8 name -- ignore that, I'm just replicating the example
class load_thing:
    def __init__(self, arg: Any) -> None:
        self.arg = arg

    def first_transformation(self) -> Self:
        # ...
        return self

    def second_transformation(self, arg1: Any) -> Self:
        # ...
        return self

    def third_transformation(self, arg2: Any, arg3: Any) -> Self:
        # ...
        return self

    def fourth_transformation(self) -> Self:
        # ...
        return self

I find this trivial enough

This is how it should be done if anyone prefers this style.

FWIW the Django QuerySet API is written in exactly this way, which I like.

entwanne · August 14, 2024, 8:07am

If you have 2 instances a and b of a custom mapping class, {**a, **b} would produce a dict whereas you could override __or__ to have a | b producing your custom type.
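A minimal sketch of that difference, assuming a hypothetical dict subclass named TaggedDict as the custom mapping:

```python
class TaggedDict(dict):
    """Hypothetical dict subclass that preserves its own type when merged."""

    def __or__(self, other):
        merged = TaggedDict(self)
        merged.update(other)
        return merged


a = TaggedDict(a=1)
b = TaggedDict(b=2)

# Unpacking into a dict display always builds a plain dict.
print(type({**a, **b}).__name__)  # dict

# The | operator dispatches to the overridden __or__.
print(type(a | b).__name__)       # TaggedDict
```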

ajoino · August 14, 2024, 8:20am

Ok that makes sense, I was thinking in terms of overriding values in the first dict.