# Optionally include pair of last and first element of iterator in `itertools.pairwise`

Proposal:

I propose to add an optional argument to the `pairwise` function in the `itertools` module. When this argument is set to `True`, the iterator returned by `pairwise` should include a pair of the last and the first element of the input iterator (if the input iterator is finite).
I also suggest `circular` as the name for this argument.

Example:

``````>>> list(pairwise(['A', 'B', 'C', 'D'], circular = True))
[('A', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'A')]
``````

Why?

I found this extension of `pairwise` to be useful when working with geometrical shapes, specifically polygons. When `corners` is the (ordered) list of corners of a polygon, `pairwise(corners, circular = True)` can be used to iterate over the edges of that polygon.

There are likely some other cases in which this can be useful, for example creating secret santa pairs or something of that kind, or more generally when working with circular linked list data structures.

Reference implementation:

``````from itertools import tee, chain

def pairwise(iterable, circular = False):
a, b = tee(iterable)
first = next(b, None)
return zip(a, chain(b, [first] if circular else []))
``````

Edge cases:

``````>>> list(pairwise(['A'], circular = True))
[('A', 'A')]
``````
``````>>> list(pairwise([], circular = True))
[]
``````
3 Likes

cc: @rhettinger

As an alternative, there could be a generic `circulator()`[1] itertool that traverses any iterable, repeating the first element at the end. The proposed functionality is then `pairwise(circulator(...))`.

1. name inspired by CGAL â†©ď¸Ž

Sounds like this is easily implemented in user code. You can also suggest it as an enhancement of `more-itertools`.

1 Like

Specfically you can do this with `islice`, `pairwise` and `cycle`:

``````>>> list(islice(pairwise(cycle(['a', 'b', 'c', 'd', 'e'])), 5))
[('a', 'b'), ('b', 'c'), ('c', 'd'), ('d', 'e'), ('e', 'a')]
``````

Only if you know the length. And thatâ€™s very wasteful memory-wise, storing all elements instead of just the first.

1 Like

It doesnâ€™t store anything extra, it cycles. But yes you need to know the length. If it doesnâ€™t have a length thereâ€™s no way to circularize.

I suppose if you have an arbitrary iterator that you know will end at some point, you could save memory by never storing the whole thing. But that is pretty niche? And not one of the examples in the op.

``````def circular(iterator):
iterator = iter(iterator)
first = next(iterator)
yield first
yield from iterator
yield first
``````
``````>>> from itertools import pairwise
>>> list(pairwise(circular("abcde")))
[('a', 'b'), ('b', 'c'), ('c', 'd'), ('d', 'e'), ('e', 'a')]
``````

(To be clear, this was just a response to the â€śIf it doesnâ€™t have a length, thereâ€™s no way to circularize.â€ť though reading through again, I suppose the exercise was not to write a new function.)

2 Likes

Yes, this is very easily implemented in user code, see the reference implementation in my original post.
I just think, that it would be a natural, generally useful and non-obtrusive extension of a function, which is already in the standard library.

Regarding `more-itertools`:
The `pairwise` function was added to the standard library in Python 3.10. `more-itertools` had a `pairwise` function before that, but starting with Python 3.10 it just re-exports the function from the standard library. I would not like to make a change to only the `more-itertools` version of the function. Also, if a change is made to `pairwise` in the standard library, i would suggest to make the same change to `pairwise` in `more-itertools`, for use in legacy Python versions.

Furthermore, i would also suggest similar extensions to the functions `triplewise` and `sliding_window` in `more-itertools`

It is not niche at all. It actually is one of the main reasons to even use an iterator instead of just a list or tuple. The most well known example is probably the new behavior of the `range` function.

But storing is how cycle works. When you call itertools.cycle on something, it makes a list to store the items in, appends them all as it finds them, and then repeatedly returns values from that list. For this specific use-case, itâ€™s overkill to save them all just to re-yield the first.

1 Like

I wasnâ€™t referring to the concept of iterators as niche, I was referring to the concept of â€ścircular pairwise iteratorâ€ť as niche.

In the various examples above, the input was a sequence. Does `cycle` save a new copy in that case? Reading the source, it looks like it does, which is kind of a shameâ€¦

If `iterator` is `itertools.repeat(...)` this isnâ€™t gonna circularize I should have said that â€śthe length must existâ€ť or something like that.

1 Like

Yes. The itertools functions/objects/tools/whatever you call them never assume that something is a sequence. Even if it IS a sequence, it would be extremely surprising if mutating after the first pass could sometimes affect it and sometimes not - imagine if `itertools.cycle(x)` behaved differently from the way `itertools.cycle(iter(x))` did (since the latter isnâ€™t receiving a sequence). Much cleaner to treat everything the same way and just iterate over it once.

1 Like

Thatâ€™ll crash if there are no elements. I might do one of these:

``````def circular(iterator):
iterator = iter(iterator)
for first in iterator:
yield first
yield from iterator
yield first
``````
``````def circular(iterator):
iterator = iter(iterator)
first = [*islice(iterator, 1)]
return chain(first, iterator, first)
``````

For `windowed`, this has already been suggested.

1 Like

Iâ€™d be inclined to take the simpler and clearer route then: if thereâ€™s no first element, yield nothing at all.

``````def circular(iterator):
iterator = iter(iterator)
try: first = next(iterator)
except StopIteration: return
yield first
yield from iterator
yield first
``````

Iâ€™d say thatâ€™s what every correct implementation does (unless raising some error is decided to be the correct thing to do, but I donâ€™t think so)

Agreed, but what I mean is, the simplest expression of that isnâ€™t â€śiterate over the iterator, and for every element in it, yield everything from the iteratorâ€ť. Now, thatâ€™s definitely equivalent for a well-behaved iterator, but I wouldnâ€™t call it a simple way to write that. What I provided definitely expresses that concept.

And there ARE broken iterators out there. Files, for example. Hereâ€™s a fully-worked example that behaves very oddly with the for-loop version:

``````import tempfile

def circular(iterator):
iterator = iter(iterator)
for first in iterator:
yield first
yield from iterator
yield first

with tempfile.NamedTemporaryFile(delete_on_close=False) as temp:
print(temp.name)
temp.write(b"First!\n")
temp.close()
with open(temp.name) as f:
for line in circular(f):
print(line.rstrip())
new = input("Enter line to append, or blank to skip: ")
if new:
with open(temp.name, "a") as f: print(new, file=f)
print("-- All done --")
``````

Various combinations of input here will result in surprising behaviour. For example, if you skip the first prompt, you get back â€śFirst!â€ť again (implying that weâ€™re now done - that was the circularization), but if you respond with text to the followup prompt, youâ€™ll get that line back twice more.

Sure, you can dismiss files as â€śbrokenâ€ť, but itâ€™s not like this is some pathological example that I had to craft from scratch.

1 Like

Thatâ€™s a well-known concept

(Since nobody was willing/able to explain how the above is â€śoffensive, abusive, to be hateful conduct or a violation of our community guidelinesâ€ť and I disagree and a moderator had left it alone after their edit, Iâ€™m assuming itâ€™s a mistake and the only edit Iâ€™ll do to unhide it is to add this note.)