Add `key` argument to `tuple.index()`

It would be useful to search a tuple in a functional-style manner

# tuple with structured data
data = ((1, 2), (2, 3))

# find index of 2 in classic manner:
for idx, item in enumerate(data):
    if item[1] == 2:
        break

# functional-style search:
idx = data.index(2, key=lambda i: i[1])

This manner is same to list.sort for example

data = [(1, 2), (2, 3)]
data.sort(key=lambda g: g[1])

I think that in some cases this will speed up Python scripts and improve code readability

GH Issue

Implementation

1 Like

Is the proposal to add this feature just to the tuple type, or to lists as well? Or to make it required for all sequence types (by adding it to the Sequence ABC)?

Making it required for all sequences would break a lot of code. And conversely, making it only apply to tuples (or even tuples and lists) would make it rather useless, as a lot of code is written to accept generic sequences, either via duck typing, or via explicit type annotations accepting Sequence values.

I’m -1 on this - it doesn’t add enough benefit to be worthwhile, and if it’s not added to the general sequence protocol, I’d avoid it in favour of more general code even if it was available on built in types.

There is a list issue too.

I would like to implement this only for tuples

I think only a small part of production code uses the Sequence ABC interface. Basically, in the production code I see e.g tuple annotations but not sequence

But in general, we can implement this for all sequences

Seems odd to enhance tuple.index. Have you ever used that at all? I don’t think I have. And I have used list.index many times. I think tuples are usually not used for data where you’d want to search for an index.

(But if we do get this, or get this for lists, I might suggest special-casing key=id so we get an ultra-fast search with just pointer comparisons, no object comparisons. That might btw also be good for some heapremove use cases, to optimize the linear search time.)

1 Like

We can write this:

 idx = [ item[1] for item in data ].index(2)

and I’d think it’d be easy to construct a lazy version with enumerated
and a generator expression.

Cheers,
Cameron Simpson cs@cskk.id.au

1 Like

And also, the rule of thumb is that lists are for elements of homogeneous type (must be, because you can insert or sort them), while tuples structure mixed data (like a C struct). I really think the outer structure in the motivating example is logically a list, and should have been.

However, my “like” on your post only extends to the first part.

I think it would be a source of confusion for users if the comparison function were magically to switch from __eq__ to is. I’m not even sure a method could tell when this is safe.

If you needed it to be fast for value keys you have built an index, making the outer container a dict. This API is conceived for elegance not speed.

Why would it be confusing and magically? Remember I’m talking about the users explicitly asking for that by passing key=id. Who would ask for that and then be confused to get what they asked for?

Maybe I misunderstand the intent. Literally id, as in the built-in function? But used to modify the semantics … to what equivalent lambda expression or for-loop? Not I think the same as it would mean without special treatment.

Assuming the value x can be found in tup, tup.find(x) is the same as next(i for i, val in enumerate(tup) if val == x)

By definition of id, id(a) == id(b) if and only if a is b. So tup.find(x, key=id) would mean the example same thing as next(i for i, val in enumerate(tup) if val is x).

I am still not in favor of adding key to the find function, but this operation could actually be useful.

2 Likes

If this functionality will be ever added to Python, I would expect a broadly usable solution, i.e. a complete set of utilities working with any sequence, e.g.: find and rfind; index and rindex with optional start and end arguments:
index(predicate, sequence [,start [,stop]])

So in English: index in the tuple of the object I am already holding. That might be useful, but I think it is tup.find(id(x), key=id).

Unless, that is, the use of id as the key function changes the handling of the first argument (magically).

1 Like

Right, yes, this is an amazing counterpoint to adding key to find: I implicitly read it as also applying to the thing to look for, which conceptually matches my understanding of what key does in stuff like sort.

1 Like

Yes. Sorry that was unclear. I meant the optimization just as an implementation detail, not something deviating from the general suggestion of this topic.

1 Like

sort doesn’t have that issue, as you don’t give it an extra value. The OP’s example data.index(2, key=lambda i: i[1]) makes it clear the key function wouldn’t be applied to the search value. It’s like the bisect functions, where bisect(a, x, key=...) doesn’t apply the key function to the x value.

Here’s an itertools recipe off the top of my head:

from itertools import groupby

def keyed_index(iter, key, value):
    groups = groupby(map(key, iter), value.__eq__)
    try:
        does_match, elements = next(groups)
    except StopIteration:
        raise ValueError('not found') from None
    if does_match:
        return 0
    # have to do this before the next() call
    result = len(list(elements))
    try:
        next(groups)
    except StopIteration:
        raise ValueError('not found') from None
    return result

That looks rather awful :-P. Especially compared to

from operator import indexOf

def keyed_index(iterable, key, value):
    return indexOf(map(key, iterable), value)
1 Like

And if that mixed data contains elements of different types, having a key function seems particularly odd, as that function would have to handle inputs of different types.

Fun: for keyed_index(((1.0, 2.0), (2.0, 3.0)), lambda i: i[1], 42) you return index 0 instead of raising a “not found” error.

Ah, __eq__ won’t work like that with heterogeneous types, I keep forgetting. I’m also surprised that operator.indexOf apparently works with lazy iterators. Nice find, even better than an enumerate trick :wink: