# Add `index: bool` keyword to `min`, `max`, `sorted`

It is often useful to calculate the minimum index, maximum index, or sorted indices of an iterable, similar to `argmin`, `argmax`, `argsort` provided by `numpy` like modules. The current solutions (AFAIK) are quite slow if compared to what it would be if it were builtin. I doubt I will need to convince anyone about the usefulness of argmin, argmax, argsort, surely everyone here has needed such code, which is why I feel this must have been discussed before? If yes, I have not been able to find any reference to such discussion(s).

# Current Solutions

Here are the current solutions worth considering:

``````min(range(len(finite_iterable)), key=finite_iterable.__getitem__)
# iterates twice, only works with sequences, fast
min(enumerate(finite_iterable),  key=itemgetter(1))[0]
# significantly slower than above, which is slower than `min(finite_iterable)`
finite_iterable.index(min(finite_iterable))
# iterates twice, only valid for sequences
``````

# Proposal

The proposal is to add a new keyword-only argument `index: bool` which, if True, would return something like a `IndexedResult/ComparisonResult` object, with roughly the following signature:

``````@dataclass(frozen=True, slots=True)
class Indexed[T, I: (int, list[int])]:
value: T
index: I

def min[T: Comparable, V](
iterable: Iterable[T], *, default: V, key, index: Literal[True]
) -> Indexed[T | V, int]: ...

def min[T: Comparable, *Ts=Comparable](
arg1: T, arg2: T, *args: *Ts, key, index: Literal[True]
) -> Indexed[T, int]: ...

def sorted[T: Comparable](
iterable: Iterable[T], *, key, reverse, index: Literal[True]
) -> Indexed[list[T], list[int]]: ...
``````

Even though I havenât been able to find reference to past discussion(s) on this topic, I am convinced this must have come up. If yes, I will search harder.

1 Like

If the iterable is not a sequence, the ordering of the elements is not necessarily defined. It might be defined (as in `dict`) or it might be reliable due to implementation details, but it seems like it might be introducing a footgun if `min` etc might assume the input has a reliable order.

In any case, hereâs another couple ways to accomplish this, both were faster than your examples on my machine:

``````# if you have a sequence
finite_iterable.index(min(finite_iterable))
# if you don't but you _really_ want the index for some reason
min_it = min(finite_iterable)
next(i for i,it in enumerate(finite_iterable) if it == min_it)
``````
1 Like

Related past discussion about adding `argmin`/`argmax` methods to `list`:

1 Like

Try `operator.indexOf(finite_iterable, min(finite_iterable))`.

1 Like
``````>>> def argmin(seq):
...     return min(range(len(seq)), key=seq.__getitem__)
...
>>> argmin([10, 2, 12, 1, 14, 5])
3
``````

Is the first solution âquite slowâ or is it âfastâ? You say both. For argsort, I suspect it really is fast and doubt the parameter would make it much faster.

Although the genexp could use `it is min_it` instead of `it == min_it`. Could be faster, if `==` is expensive.

One more: `min(zip(finite_iterable, itertools.count()))[1])`

2 Likes

I guess you mean things like `set`? So when you say unreliable ordering, what exactly do you mean? Sure, you canât expect a `set` to have elements in the same order as you defined. But once defined, whatever the internal order is, remains unchanged, unless the set is modified. In which case even plain `min` would be unreliable.
Am I misunderstanding something?

I have this one in my post. Yes, I see how the other one would be faster.

It is fast compared to other two. Could be faster if len is known in advance. I expect your `indexOf` solution to be fast as well. But doesnât work for non-sequence iterable.

As far as I am concerned, for unordered sequences, min should work like this:

``````s = random_set
l = list(s)

assert min(s, index=True) == min(l, index=True)
``````

If for some other reason, it is still not acceptable, then simply require the iterable to be hashable as well. But I donât see why that would be required.

Hmm? What makes you think it doesnât?

Is that guaranteed anywhere? It always was for dictionaries, but not for sets.

1 Like

Via duck typing, lots of objects can act as an iterable, by implementing the necessary interface. They arenât required to support indexing, and they arenât required to return in a consistent order. Consider for instance a collection of asynchronous listenersâperhaps iterating over them returns based on time order, as they receive the signals they are listening for.

Using `index` wouldnât break in that instance, but it might lead to confusing bugs if the user wasnât careful. Basically, Iâm just saying that when functions support generic iterables, they shouldnât have options that only work for some iterables.

Given that this particular option is trivial to do in Python anyway, I think that confusing behavior isnât worth it.

1 Like

IndexOf calls âsequence.indexâ under the hood doesnât it? And surely, for iterators, the first min will consume it.

Maybe I am wrong, but the guarantee for dicts is it maintains the order oit was created in. And set is not guaranteed. But I expect once data is stored in a particular order, it doesnât change for readonly operations. Is this not true, or guaranteed? Is there any possible way this could not ne true?

This essentially defeats the purpose. If the notion of `index` is troublesome for set like things, better call it position.

One more thoughtâin general, I think itâd be more interesting to understand why the pure Python version isnât fast enough, and fix that problem. Adding special options to speed up particular niche use-cases seems like a recipe for bloat.

3 Likes

I see your points, regarding asynchronous iterables. Although I guess the user already has a footgun in their hand if they donât know what they are using, without need min with index. But point taken.

And yes, if the other options for non-sequence iterables were competitive enough, I would be glad to use them.

The problem in general seems like either there are two iterations involved, or the loop is run in python and not c.

No, it doesnât.

Sorry, I misspoke. I meant to say, I understand it requires a sequence as first argument. But I stand corrected. It does support non-sequence iterable. However, it would only work if min value is known in advance without consuming the iterable.

The guarantee currently is that it maintains insertion order. The old guarantee was that, if you called keys() and values() without mutating the dictionary in between, they would be in the same order - which effectively mandates that you can call keys() and keys() and theyâll be in the same order.

It all depends on how the data is stored internally. When the iteration order isnât guaranteed, itâs allowed to change at any time, for any reason. For example, a Python implementation could theoretically use a splay tree for its set implementation; and if the iteration order is âstart at the root, breadth-first searchâ, your effective iteration order will broadly correspond to how recently each item was accessed.