Offer filter, map, etc. as methods on iterables

It would be nice to be able to call many common functions that work on collections as methods on those collections.

ticket_ids = (
    some_list_or_set
    .filter(lambda x: x.startswith("SPARK-"))
    .map(lambda x: x.removeprefix("SPARK-"))
    .map(lambda x: int(x))
)

I don’t know how well this idea fits Python overall, as it’s inspired by Apache Spark’s DataFrame API (which was, in turn, inspired by the DataFrame concept in R and Pandas).

The main benefit of being able to call methods like these on iterables is that it lets you chain multiple methods together in a way that’s easy to read, where the output of one method is piped to the input of the next method in the chain. I have seen this style of API called a fluent interface. It’s very common for those of us working with Apache Spark or similar data processing libraries.

Of course, you can recreate the above example today just fine with list comprehensions or with the existing filter and map functions. These methods would help more when you have something a bit more complex.
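For comparison, here is a minimal sketch of the same ticket-ID pipeline written both ways with what Python offers today. The sample input values are made up purely for illustration, and `str.removeprefix` assumes Python 3.9 or later.

```python
# Illustrative input only; any list or set of strings would do.
some_list_or_set = ["SPARK-234", "HELLO-123", "SPARK-9405", "PY-345"]

# Option 1: a single list comprehension does the filter and both maps.
ticket_ids = [
    int(x.removeprefix("SPARK-"))
    for x in some_list_or_set
    if x.startswith("SPARK-")
]

# Option 2: the built-in filter and map functions, nested inside-out.
ticket_ids_functional = list(
    map(
        int,
        map(
            lambda x: x.removeprefix("SPARK-"),
            filter(lambda x: x.startswith("SPARK-"), some_list_or_set),
        ),
    )
)

print(ticket_ids)             # [234, 9405]
print(ticket_ids_functional)  # [234, 9405]
```

Note that the nested-function version reads in the opposite order from the data flow, which is the readability gap the method-chaining style is meant to close.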

results = (
    some_input
    .map(...)
    .filter(...)
    .distinct(...)
    .reduce(...)
)

Some obvious reasons not to add methods like these to Python’s iterables:

  • One obvious way: Though method chaining is a handy way to express complex manipulations on collections, perhaps most Python users would just find it a confusing addition to the existing stand-alone functions and comprehensions.
  • Mutability: The libraries and languages I’m familiar with that use fluent interfaces usually also default to immutable data structures. Perhaps method chaining is a bad combination with mutable collections, though I confess I haven’t thought much about how that might be.
  • Anonymous functions: If we add a feature like this to Python, pretty soon after people will be asking for a more flexible lambda so they can express logic that spans multiple lines. I don’t recall the history, but I believe the Python community has rejected this idea in the past.

Has this idea been discussed before? What do people think?

Yes, this has been brought up many, many times. Maybe someone else has a good specific link. Suffice to say, this is definitely not going to happen.

I would be happy to read the prior discussion(s). I searched the forum before posting and didn’t find anything relevant. The suggested topics that show up on the right as you compose a new thread also didn’t show me anything relevant.

This forum is relatively new. You’ll want to look through decades of mailing list archives.

One of the biggest reasons is that “iterable” isn’t a class, so you can’t simply add methods to it. It’s a protocol that’s defined by one single method: an __iter__ method that returns an iterator.
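To illustrate the point about the protocol, here is a small sketch. The `Countdown` class is a hypothetical example, not anything from the standard library. It shares no base class with `list` or `set`, yet it is an iterable simply because it defines `__iter__`, so there is no common place where a `map` method could be added.

```python
from collections.abc import Iterable


class Countdown:
    """Hypothetical class: iterable only because it defines __iter__."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Returning any iterator is enough to satisfy the protocol.
        return iter(range(self.start, 0, -1))


values = list(Countdown(3))
is_iterable = isinstance(Countdown(3), Iterable)  # checked structurally via __iter__
has_map = hasattr(Countdown(3), "map")

print(values)       # [3, 2, 1]
print(is_iterable)  # True
print(has_map)      # False
```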

If you want this sort of fluent interface, one option would be to create a wrapper that takes any iterable and returns an object with all the methods you want. However, the wrapper’s methods would have to manage the fluent interface themselves, rather than relying on the individual iterables. It wouldn’t be TOO hard, but it would be something up to you for your particular purposes, rather than being a good fit for the language itself.
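A rough sketch of what such a wrapper might look like follows. The class name `Fluent` and its method names (`distinct`, `to_list`, and so on) are assumptions chosen for illustration, not an existing API. Each method returns a new wrapper so that calls can be chained.

```python
from functools import reduce as _reduce


class Fluent:
    """Hypothetical wrapper that adds chainable methods to any iterable."""

    def __init__(self, iterable):
        self._iterable = iterable

    def __iter__(self):
        return iter(self._iterable)

    def map(self, func):
        # Lazy: wraps the built-in map iterator in a new Fluent.
        return Fluent(map(func, self._iterable))

    def filter(self, predicate):
        return Fluent(filter(predicate, self._iterable))

    def distinct(self):
        # dict.fromkeys keeps first-seen order; items must be hashable.
        return Fluent(dict.fromkeys(self._iterable))

    def reduce(self, func, initial):
        # Terminal operation: returns a plain value, not a Fluent.
        return _reduce(func, self._iterable, initial)

    def to_list(self):
        return list(self._iterable)


ticket_ids = (
    Fluent(["SPARK-234", "HELLO-123", "SPARK-9405", "PY-345", "SPARK-234"])
    .filter(lambda x: x.startswith("SPARK-"))
    .map(lambda x: x.removeprefix("SPARK-"))
    .map(int)
    .distinct()
    .to_list()
)
total = Fluent([1, 2, 3]).reduce(lambda a, b: a + b, 0)

print(ticket_ids)  # [234, 9405]
print(total)       # 6
```

One design consequence worth noting: because `map` and `filter` wrap one-shot iterators, an intermediate `Fluent` can only be consumed once unless it is materialized with something like `to_list`.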

Ah, right, I forgot about the mailing lists. :sweat_smile:

Looks like this was discussed as recently as 2021, in the Python-ideas mailing list thread “Enhancing iterator objects with map, filter, reduce methods”.

And it was a long discussion (94 comments). But that’s exactly what I was looking for.

One library mentioned in that thread that offers the kind of collection methods proposed here is PyFunctional.

There are many other interesting ideas and libraries discussed in that thread that are worth reviewing for anyone interested in this topic.

The decorator operator elegantly resolves this issue.

# These are functions applied with smart_partial.
# Reference: https://discuss.python.org/t/decorator-operator/41036/17?u=ilotoki0804
from decotools.builtins import map, filter, list, print

ticket_ids = ['SPARK-234', 'HELLO-123', 'SPARK-9405', 'PY-345']
ticket_ids @= filter(lambda x: x.startswith("SPARK-"))
ticket_ids @= map(lambda x: x.removeprefix("SPARK-"))
ticket_ids @= map(int)
ticket_ids @= list
print(ticket_ids)  # Output: [234, 9405]

If you are interested, it might be worthwhile to check out the discussion.
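For readers wondering how a `value @= something` pipeline like the one above can work at all in current Python, here is a minimal sketch using the reflected matrix-multiplication hook `__rmatmul__`. This is not the linked library's implementation; the names `Applied`, `filter_with`, and `map_with` are hypothetical and exist only for this illustration.

```python
class Applied:
    """Hypothetical helper: `value @ Applied(f)` evaluates to f(value)."""

    def __init__(self, func):
        self.func = func

    def __rmatmul__(self, value):
        # Called when the left operand does not implement `@` itself.
        return self.func(value)


def filter_with(predicate):
    # Hypothetical: partially applies the built-in filter.
    return Applied(lambda items: filter(predicate, items))


def map_with(func):
    # Hypothetical: partially applies the built-in map.
    return Applied(lambda items: map(func, items))


ticket_ids = ["SPARK-234", "HELLO-123", "SPARK-9405", "PY-345"]
ticket_ids @= filter_with(lambda x: x.startswith("SPARK-"))
ticket_ids @= map_with(lambda x: x.removeprefix("SPARK-"))
ticket_ids @= map_with(int)
ticket_ids @= Applied(list)

print(ticket_ids)  # [234, 9405]
```

This relies on the left operand not defining `@` itself. A type that implements its own `__matmul__` (NumPy arrays, for instance) gets the first chance to handle the operator, so the trick is not universal.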

To the extent that it elegantly resolves anything.

Yes, including the distinct lack of support that the proposal had, and the objections to calling it anything to do with “decorators”. It is a function application operator, not a decorator operator.

Although I haven’t been able to mention it yet, as it’s still being written up, I’d like to take this opportunity to say that I’ve abandoned the term ‘decorator’.