Add `key` argument to `tuple.index()`

grigoriev-semyon · January 6, 2024, 9:03pm

It would be useful to search a tuple in a functional-style manner

# tuple with structured data
data = ((1, 2), (2, 3))

# find index of 2 in classic manner:
for idx, item in enumerate(data):
    if item[1] == 2:
        break

# functional-style search:
idx = data.index(2, key=lambda i: i[1])

This manner is same to list.sort for example

data = [(1, 2), (2, 3)]
data.sort(key=lambda g: g[1])

I think that in some cases this will speed up Python scripts and improve code readability

GH Issue

Implementation

pf_moore · January 6, 2024, 9:58pm

Is the proposal to add this feature just to the tuple type, or to lists as well? Or to make it required for all sequence types (by adding it to the Sequence ABC)?

Making it required for all sequences would break a lot of code. And conversely, making it only apply to tuples (or even tuples and lists) would make it rather useless, as a lot of code is written to accept generic sequences, either via duck typing, or via explicit type annotations accepting Sequence values.

I’m -1 on this - it doesn’t add enough benefit to be worthwhile, and if it’s not added to the general sequence protocol, I’d avoid it in favour of more general code even if it was available on built in types.

grigoriev-semyon · January 6, 2024, 10:07pm

There is a list issue too.

I would like to implement this only for tuples

I think only a small part of production code uses the Sequence ABC interface. Basically, in the production code I see e.g tuple annotations but not sequence

But in general, we can implement this for all sequences

Stefan2 · January 6, 2024, 10:36pm

Seems odd to enhance tuple.index. Have you ever used that at all? I don’t think I have. And I have used list.index many times. I think tuples are usually not used for data where you’d want to search for an index.

(But if we do get this, or get this for lists, I might suggest special-casing key=id so we get an ultra-fast search with just pointer comparisons, no object comparisons. That might btw also be good for some heapremove use cases, to optimize the linear search time.)

cameron · January 6, 2024, 11:04pm

We can write this:

 idx = [ item[1] for item in data ].index(2)

and I’d think it’d be easy to construct a lazy version with enumerated
and a generator expression.

Cheers,
Cameron Simpson cs@cskk.id.au

jeff5 · January 7, 2024, 9:14am

And also, the rule of thumb is that lists are for elements of homogeneous type (must be, because you can insert or sort them), while tuples structure mixed data (like a C struct). I really think the outer structure in the motivating example is logically a list, and should have been.

However, my “like” on your post only extends to the first part.

I think it would be a source of confusion for users if the comparison function were magically to switch from __eq__ to is. I’m not even sure a method could tell when this is safe.

If you needed it to be fast for value keys you have built an index, making the outer container a dict. This API is conceived for elegance not speed.

Stefan2 · January 7, 2024, 10:42am

Why would it be confusing and magically? Remember I’m talking about the users explicitly asking for that by passing key=id. Who would ask for that and then be confused to get what they asked for?

jeff5 · January 7, 2024, 2:46pm

Maybe I misunderstand the intent. Literally id, as in the built-in function? But used to modify the semantics … to what equivalent lambda expression or for-loop? Not I think the same as it would mean without special treatment.

MegaIng · January 7, 2024, 2:51pm

Assuming the value x can be found in tup, tup.find(x) is the same as next(i for i, val in enumerate(tup) if val == x)

By definition of id, id(a) == id(b) if and only if a is b. So tup.find(x, key=id) would mean the example same thing as next(i for i, val in enumerate(tup) if val is x).

I am still not in favor of adding key to the find function, but this operation could actually be useful.

xitop · January 7, 2024, 3:02pm

If this functionality will be ever added to Python, I would expect a broadly usable solution, i.e. a complete set of utilities working with any sequence, e.g.: find and rfind; index and rindex with optional start and end arguments:
index(predicate, sequence [,start [,stop]])

jeff5 · January 7, 2024, 3:16pm

So in English: index in the tuple of the object I am already holding. That might be useful, but I think it is tup.find(id(x), key=id).

Unless, that is, the use of id as the key function changes the handling of the first argument (magically).

MegaIng · January 7, 2024, 3:20pm

Right, yes, this is an amazing counterpoint to adding key to find: I implicitly read it as also applying to the thing to look for, which conceptually matches my understanding of what key does in stuff like sort.

Stefan2 · January 7, 2024, 3:56pm

Yes. Sorry that was unclear. I meant the optimization just as an implementation detail, not something deviating from the general suggestion of this topic.

Stefan2 · January 7, 2024, 4:04pm

sort doesn’t have that issue, as you don’t give it an extra value. The OP’s example data.index(2, key=lambda i: i[1]) makes it clear the key function wouldn’t be applied to the search value. It’s like the bisect functions, where bisect(a, x, key=...) doesn’t apply the key function to the x value.

kknechtel · January 7, 2024, 10:15pm

Here’s an itertools recipe off the top of my head:

from itertools import groupby

def keyed_index(iter, key, value):
    groups = groupby(map(key, iter), value.__eq__)
    try:
        does_match, elements = next(groups)
    except StopIteration:
        raise ValueError('not found') from None
    if does_match:
        return 0
    # have to do this before the next() call
    result = len(list(elements))
    try:
        next(groups)
    except StopIteration:
        raise ValueError('not found') from None
    return result

Stefan2 · January 7, 2024, 10:33pm

That looks rather awful :-P. Especially compared to

from operator import indexOf

def keyed_index(iterable, key, value):
    return indexOf(map(key, iterable), value)

Stefan2 · January 7, 2024, 10:40pm

And if that mixed data contains elements of different types, having a key function seems particularly odd, as that function would have to handle inputs of different types.

Stefan2 · January 7, 2024, 10:44pm

Fun: for keyed_index(((1.0, 2.0), (2.0, 3.0)), lambda i: i[1], 42) you return index 0 instead of raising a “not found” error.

kknechtel · January 7, 2024, 11:34pm

Ah, __eq__ won’t work like that with heterogeneous types, I keep forgetting. I’m also surprised that operator.indexOf apparently works with lazy iterators. Nice find, even better than an enumerate trick

Topic		Replies	Views
Improve the API of list and tuple Ideas	6	696	April 19, 2021
Add a `find()` builtin function Ideas builtins	5	722	February 11, 2022
How to Implement Positional Type Hinting for Iterables in Python? Python Help typing	2	379	September 4, 2023
Add tuple support to more str functions Ideas	17	563	April 24, 2024
Type checking with sorted `key=` Python Help help	2	919	May 7, 2022

Add `key` argument to `tuple.index()`

Related Topics