Simpler alternative for next(iter(x))

xitop · March 22, 2025, 8:38am

The purpose of next(iter(x)) is to get the first item from x, if x is ordered, or any item otherwise. The x itself is not modified.
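For readers who have not met the idiom, here is a minimal illustration. It assumes x is a re-iterable container (list, dict, set) and not an iterator; the variable names are made up for the example.

```python
# next(iter(x)) on containers: returns one item and leaves x untouched.
lst = [10, 20, 30]
dct = {"a": 1, "b": 2}   # dicts preserve insertion order
st = {42}                # one-element set, so "any item" is unambiguous

first_of_list = next(iter(lst))   # 10
first_key = next(iter(dct))       # "a" (iterating a dict yields its keys)
only_elem = next(iter(st))        # 42

# The containers still hold all their items afterwards.
assert lst == [10, 20, 30] and len(dct) == 2 and st == {42}
```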

I think a specialized function called, for example, getone(x) would bring some benefits over next(iter(x)). I’m just not sure if those benefits are sufficient to justify a change. That’s why I don’t have a well-prepared proposal.

My opinion is:

Readability-wise, next(iter(x)) is not great. It is difficult to understand until you have learned it as an idiom.

A lot more interesting question is:

could getone() have better performance?

I don’t know for sure. Of course we can immediately return x[0] for all lists, tuples and strings, but calls of the type next(iter(plain_sequence)) must be quite rare. But I suspect that with a little bit of C code we can avoid the creation of a temporary iterator for dicts and sets. If dicts and sets are used in the majority of use cases (again, I don’t have data), not having to create a temporary iterator object could be an improvement worth further discussion.
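A pure-Python sketch of the semantics such a getone() might have. The function is hypothetical, and so is the choice of ValueError for empty input (the post does not specify it). It only shows the x[0] fast path; the dict/set optimization described above would need C and is not reproduced here.

```python
def getone(x):
    """Hypothetical getone(): first item if x is ordered, any item otherwise."""
    # Fast path for plain sequences: index directly, no iterator object.
    if isinstance(x, (list, tuple, str)):
        if not x:
            raise ValueError("getone() argument is empty")
        return x[0]
    # General path: still iterates; a C version could skip this for dict/set.
    for item in x:
        return item
    raise ValueError("getone() argument is empty")


print(getone([1, 2, 3]))      # 1
print(getone({"k": "v"}))     # k
```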

JamesParrott · March 22, 2025, 9:20am

more_itertools.first does this.

xitop · March 22, 2025, 9:52am

Thank you for this information.

The mentioned first provides an optional default value to be returned when input is empty. Let’s postpone this sub-topic for later.

The core of that function is:

    for item in iterable:
        return item

i.e. no optimisation, which was my main point.

Stefan2 · March 22, 2025, 10:05am

I don’t see how next(iter(x)) is “difficult to understand”. I find it completely obvious. Straightforward combination of two of the most basic built-ins.

storchaka · March 22, 2025, 10:10am

This is one of those ideas that pop up every year. You can find past discussions here and on the old mailing lists. The main problems:

It is too niche for builtins. So it can only be added to itertools, and even there, all the other itertools functions are either much more widely used or much more complex, so it may not reach the bar. Many users would rather continue to use the obvious next(iter(x)) idiom (which works in all versions) than add an import.
It cannot achieve one of your goals: getting the first item without modifying the input. If x is a file, a generator, etc., it will be modified. And this makes the code more error-prone. With next(iter(x)) this is at least more explicit.
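The second point can be seen in a few lines. This is only an illustration with a made-up generator: calling iter() on an iterator returns the iterator itself, so next() consumes an item from it.

```python
def numbers():
    # Hypothetical generator used only for this demonstration.
    yield from (1, 2, 3)

gen = numbers()
same_object = iter(gen) is gen   # True: an iterator is its own iterator
peeked = next(iter(gen))         # consumes the first item of gen
rest = list(gen)                 # only [2, 3] is left

lst = [1, 2, 3]
peeked_from_list = next(iter(lst))   # uses a fresh, temporary iterator
print(peeked, rest, lst)             # 1 [2, 3] [1, 2, 3]
```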

xitop · March 22, 2025, 10:36am

While it is more or less obvious what next(iter(x)) does, the “other way around” is not. Imagine a beginner wanting to get one element from a set. I find it more probable that (s)he will end up with pop+add than with next(iter(...)).

oscarbenjamin · March 22, 2025, 10:39am

The most important difference between first and next(iter) is that when the iterable is empty first raises ValueError rather than StopIteration. Leaking StopIteration is bad because of the way that it interacts with other loop constructs. There was a PEP precisely to prevent this problem with generators but it is still unsafe to leak StopIteration when using other things like map.
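A small demonstration of the leak with map. The helper names are made up, and first here is a sketch in the spirit of more_itertools.first, not that library's actual code.

```python
def head_unsafe(iterable):
    # Raises StopIteration when iterable is empty.
    return next(iter(iterable))

def first(iterable):
    # Sketch of a first()-style helper: ValueError instead of StopIteration.
    for item in iterable:
        return item
    raise ValueError("first() was called on an empty iterable")

groups = [[1, 2], [], [3, 4]]

# The StopIteration raised for [] escapes through map's __next__, and list()
# takes it as "map is exhausted": the result is silently truncated.
truncated = list(map(head_unsafe, groups))
print(truncated)   # [1]

# With first() the problem is reported instead of hidden.
try:
    list(map(first, groups))
except ValueError as exc:
    error_message = str(exc)
    print("ValueError:", error_message)
```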

Personally I think that there should be a first function in builtins precisely to stop people from being tempted to use next badly.

As evidence of this bad use there is even a ruff rule RUF015 that gives out bad advice for using next and some people try to apply its unsafe fixer to codebases e.g. this SymPy PR.

xitop · March 22, 2025, 11:07am

I’m sorry for not noticing them. They have probably used different keywords.

It’s related to the difference between an iterable and an iterator, but you’re right. This is serious enough to kill the idea.

Monarch · March 22, 2025, 11:49am

I’m not too sure about builtins but I’ll definitely love to see it in itertools. I think it’s definitely one of those things that’s not worth adding a dependency (more-itertools) for but the naive approach of next(iter(x)) can be a footgun if you aren’t aware of StopIteration’s semantics.

gerardw · March 22, 2025, 12:32pm

I rarely use iter(), if at all.

Aren’t these equivalent?

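from collections.abc import Generator

def func() -> Generator[int, None, None]:
    yield from (1, 2, 3)  # placeholder body; the original post left it out

first = next(iter(func()))
print(first)

for first in func():
    break
print(first)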

Dutcho · March 22, 2025, 9:26pm

In case of an empty iterator/generator, the thrown exception differs: StopIteration vs. NameError
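A sketch of that difference, using a made-up empty generator and distinct variable names so that neither variant sees a value left over from the other:

```python
def empty():
    # Hypothetical generator that yields nothing.
    yield from ()

# Variant 1: next(iter(...)) raises StopIteration.
try:
    first_a = next(iter(empty()))
except StopIteration:
    outcome_next = "StopIteration"

# Variant 2: the loop body never runs, so first_b is never bound.
try:
    for first_b in empty():
        break
    print(first_b)
except NameError:   # inside a function this is UnboundLocalError, a subclass
    outcome_loop = "NameError"

print(outcome_next, outcome_loop)
```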

xitop · March 23, 2025, 7:18am

Thanks for all the feedback. Let me briefly summarize:

Main objections were:

usage frequency is not high enough to get into the builtins
next(iter(x)) updates x under some circumstances (when x is already an iterator)

Support was based on:

StopIteration could be raised. This exception is special.

For completeness, why I wasn’t satisfied with next(iter(x)):

a temporary iterator must be created (and destroyed) each time. In some cases it seems unnecessary.

Taking all these points into account, I see this possible solution:

Let’s add optimized set.getone() and dict.getone() and let’s continue to use next(iter()) for everything else.

Of course, there are many missing details, but it ticks several boxes:

it covers a lot of next(iter()) use-cases
it looks like an obvious way to get one item/element
it could be implemented faster (no iterator needed)
no change to built-ins
no StopIteration

oscarbenjamin · March 23, 2025, 12:13pm

Do the existing set.pop and dict.popitem methods do what you want here?

>>> {1,2}.pop()
1
>>> {1:2}.popitem()
(1, 2)

Those aren’t equivalent because they mutate the set/dict. But looking through the examples in the PR that I linked, in most cases the intention is just to get the single element from a set of size 1 and then discard the set. We don’t want to mutate the set, but we also don’t care about mutating it.

In fact most of those cases could just be written as

[a] = b

but there is no equivalent of this unpacking that can be used inline in an expression. This is actually more_itertools’ one function rather than first with the difference being that it also raises an exception if the iterable is not of length 1.
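A sketch of how the unpacking could be wrapped so that it is usable inside an expression. This is an illustration of the idea only, not more_itertools' actual implementation of one.

```python
def one(iterable):
    """Return the only item; ValueError if there are zero or several items."""
    [item] = iterable   # unpacking enforces "exactly one"
    return item


print(one({5}) + 1)   # 6, usable inline in an expression

for bad in ([], [1, 2]):
    try:
        one(bad)
    except ValueError as exc:
        print("ValueError:", exc)
```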

More generally if you want to get the first item from an iterable then it makes sense to have a function like first that works for any iterable and uses the general iterator protocol. This is what people want when they use next(iter(obj)). In my experience 99% of the time if someone suggests using next then it would be strictly better to use first with the important difference being just which exception is raised on an empty iterable.

The only situation in which it is potentially correct to raise StopIteration is in the __next__ method of an iterator:

class map:

    def __init__(self, func, iterable):
        self.func = func
        self.iterator = iter(iterable)

    def __iter__(self):
        # An iterator returns itself from __iter__
        return self

    def __next__(self):
        # If the underlying iterator is exhausted
        # then we want to propagate StopIteration
        return self.func(next(self.iterator))

If you find yourself using next in anything that is not the body of a __next__ method then it would be better to use first instead because raising StopIteration is always wrong. Correct use of next in these other contexts needs to either catch the exception or pass a default value so that next will catch the exception.
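The two safe patterns from the last sentence, as a short sketch. The helper names are made up for the example.

```python
def first_or_none(iterable):
    # Passing a default: next() handles exhaustion itself and returns None.
    return next(iter(iterable), None)

def first_or_raise(iterable):
    # Catching the exception and converting it to one that cannot be
    # mistaken for "end of iteration" by an enclosing map() or similar.
    try:
        return next(iter(iterable))
    except StopIteration:
        raise ValueError("empty iterable") from None


print(first_or_none([]))        # None
print(first_or_raise([7, 8]))   # 7
```

Note that the default variant cannot distinguish an empty iterable from one whose first item is None; a private sentinel object can be used when that matters.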

xitop · March 23, 2025, 1:58pm

Sometimes I don’t want to modify existing data. And if I don’t care about the rest of the set/dict, modifying it is wasted work. I had the impression that Python developers have been trying really hard to improve execution speed in recent years.

By far most of the use cases I saw involve dicts. They are ordered, and if this feature is actively used in the code, chances are the very first item has an important meaning. Some programs need the first key, some the first value. Maybe dict.getfirstitem() would be a more descriptive name for a method doing what they need.
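For reference, this is how the first key, value and item of a dict can be obtained today with the existing idiom. The dict contents are invented for the example, and dict.getfirstitem() remains a hypothetical name.

```python
servers = {"primary": "10.0.0.1", "backup": "10.0.0.2"}   # made-up data

first_key = next(iter(servers))             # "primary"
first_value = next(iter(servers.values()))  # "10.0.0.1"
first_item = next(iter(servers.items()))    # ("primary", "10.0.0.1")

print(first_key, first_value, first_item)
assert len(servers) == 2   # the dict is not modified
```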

Other use cases are infrequent:

I think I saw next(iter(xset)) in some graph-related algorithms, can’t find it now.
logging a set of error messages as “some_error_message and N others”
code that I don’t understand. For example asyncio/locks.py contains:

        # note: all comments added
        # self._waiters is either None or a deque
        if not self._waiters:
            return
        # after the test above it cannot be None, it cannot be empty,
        # so why not just "fut = self._waiters[0]" ?
        try:
            fut = next(iter(self._waiters))
        except StopIteration:
            return

gst · March 23, 2025, 3:07pm

For this asyncio/locks.py example: isn’t it there for thread safety? The _waiters attribute could be modified (emptied/flushed) by some other thread after the truthiness check but before the next(iter()) call.

oscarbenjamin · March 23, 2025, 3:24pm

They are but optimising this particular operation is unlikely to bring meaningful improvement to the speed of real programs so:

I doubt that anyone who is profiling and benchmarking anything would identify this as being a worthwhile target for optimisation.
Performance is a weak motivation for adding any of the methods or functions that have been suggested.

xitop · March 23, 2025, 5:55pm

There is also that issue with StopIteration you wrote about yesterday. It got many likes, more than my posts. Are those two points together still weak?

For completeness, a benchmark of something comparable:

$ python -m timeit -s "lst=[1,2,3,4]" "first=next(iter(lst))"
# 62 ns
$ python -m timeit -s "lst=[1,2,3,4]" "first=lst[0]"
# 19 ns

Yes, saving 40 ns or so on a statement rarely executed is really “nothing”. OTOH the overhead is +220%.

Rosuav · March 23, 2025, 7:32pm

Xitop:

For completness, a benchmark of something comparable:
$ python -m timeit -s "lst=[1,2,3,4]" "first=next(iter(lst))"
# 62 ns
$ python -m timeit -s "lst=[1,2,3,4]" "first=lst[0]"
# 19 ns
Yes, saving 40 ns or so on a statement rarely executed is really “nothing”. OTOH the overhead is +220%.

Not actually very comparable. On a microbenchmark like this, the fact that one of your examples is looking up three globals and the other is looking up one global is highly relevant.
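One possible way to narrow that gap, offered as an assumption and not as a definitive corrected benchmark: bind next and iter to short names in the setup code (nxt and itr are made-up names), so the timed statement no longer looks up the built-ins by name on every run. No timings are quoted here because they depend on the machine and the Python version.

```python
import timeit

# Setup shared by all three measurements; nxt/itr are pre-bound aliases.
SETUP = "lst = [1, 2, 3, 4]; nxt = next; itr = iter"
N = 100_000

t_builtin = timeit.timeit("first = next(iter(lst))", setup=SETUP, number=N)
t_prebound = timeit.timeit("first = nxt(itr(lst))", setup=SETUP, number=N)
t_index = timeit.timeit("first = lst[0]", setup=SETUP, number=N)

for label, total in [
    ("next(iter(lst))", t_builtin),
    ("nxt(itr(lst)), pre-bound", t_prebound),
    ("lst[0]", t_index),
]:
    # Convert total seconds for N runs into nanoseconds per run.
    print(f"{label:26s} {total / N * 1e9:8.1f} ns per run")
```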

xitop · March 23, 2025, 7:56pm

Might be. I write applications. I cannot read bytecode and I did not study Python internals. Could somebody more experienced please be so kind and post a corrected benchmark?

I was comparing direct access to the first element vs. next(iter()). I cannot do that for sets and dicts, so I decided to use a sequence as the closest match.

oscarbenjamin · March 23, 2025, 8:13pm

If looking up the globals is the main cost of the operation then it is reasonable to measure that cost in the microbenchmark. The important performance question is: when would any real task be dominated by the cost of getting the first item from an iterable?

Almost by definition you won’t have a tight loop in which you need to get the first item from an iterable many times because an iterable only has one first item. I can come up with contrived situations where you have many sets and need to get the “first” item of each in a tight loop but I find it hard to imagine a real situation where the cost of this operation is a bottleneck. In a real situation you would at minimum also have the cost of creating the many iterables or of doing something with the many first items from the iterables.

There are much better things to focus on if you want to make real Python programs faster.

Personally I am convinced by the StopIteration issue alone. Above it is suggested that wanting an alternative to next(iter(...)) is “too niche for builtins”. I would rather say that the situations where it is reasonable to use next rather than first are niche and the vast majority of uses of next in the wild (grep.app) would be better served by something like first. The situation is that we already have the wrong function as a builtin and people use it even though it’s wrong because it is a builtin.