I do like the idea of making the following addition to the operator module (there’s no need to make operator.index slower by adding an optional argument to it when a separate function will serve the purpose even more effectively):
def sequence_index(a, length):
i = a.__index__()
if i < 0:
i += length
if not (0 <= i < length):
raise IndexError(f"index {a!r} out of range")
return i
Beyond that, the challenge lies in making Sequence (and MutableSequence) subclasses meaningfully simpler to implement without causing other problems (such as introducing new ways to implement them incorrectly).
One clear weakness in the status quo is that the Sequence and __getitem__ documentation are both currently lacking is a description of how to correctly implement subscripting support in a way that mimics the behaviour of standard library sequences.
The __getitem__ documentation just says:
For sequence types, the accepted keys should be integers. Optionally, they may support slice objects as well. Negative index support is also optional.
Even Oscar’s sample code above contains an error, as it always returns a tuple for slices, when the type returned should be the same type as the instance being sliced (I’ve also inverted the sense of the type check, so only the one-liner simple case gets indented):
def _sequence_index(a, length):
i = a.__index__()
if i < 0:
i += length
if not (0 <= i < length):
raise IndexError(f"index {a!r} out of range")
return i
def __getitem__(self, key):
length = len(self)
if not isinstance(key, slice):
return self._get_sequence_item(_sequence_index(key, length))
# Return same type as this instance for slices
cls = type(self)
return cls(self._get_sequence_item(i) for i in range(*key.indices(length)))
At the very least, we should be providing that template code in the docs somewhere (probably as part of the Sequence ABC docs, with a specific mention in the __getitem__ docs).
Similarly for __setitem__ and __delitem__, as getting those right can be even fiddlier:
def __setitem__(self, key, value):
length = len(self)
if not isinstance(key, slice):
self._set_sequence_item(_sequence_index(key, length), value)
return
# This is based on the way that slice assignment to Python lists works
# (but accepts all Sequence instances without copying, not just list and tuple)
# It's complicated enough that it would need testing to show it is correct as written
start, stop, step = key.indices(length) # Check for a valid slice before anything else
value_needs_conversion = value is self or not isinstance(value, Sequence)
value_as_seq = list(value) if value_needs_conversion else value
slice_indices = range(start, stop, step)
slice_length = len(slice_indices)
value_length = len(value_as_seq)
if step != 1:
# Non-contiguous and reversed slices can't change the sequence length
if slice_length != value_length:
f"attempt to assign iterable of size {value_length} to extended slice of size {slice_length}"
raise ValueError(msg)
for i, item in zip(slice_indices, value_as_seq):
self._set_sequence_item(i, item)
return
# Contiguous forward slices can change the sequence length by adding or removing items
# Existing entries are replaced up to the common length, then removed if the existing
# slice is longer than the assigned value, or added if the assigned value has more entries
if value_length == 0:
self.clear()
return
for i, item in zip(slice_indices, value_as_seq):
self._set_sequence_item(i, item)
if slice_length > value_length:
del_start = start + value_length
del_stop = start + slice_length
for i in reversed(range(del_start, del_stop)):
self._del_sequence_item(i)
elif slice_length < value_length:
for offset in range(slice_length, value_length):
self.insert(start+offset, value_as_seq[offset])
def __delitem__(self, key):
length = len(self)
if not isinstance(key, slice):
self._del_sequence_item(_sequence_index(key, length))
return
# Reverse iteration to avoid incorrect indexing as entries are removed
for i in reversed(range(*key.indices(length)):
self._del_sequence_item(i)
return
If something were to be added to collections.abc, then my suggestion would be to leave Sequence and MutableSequence alone, and instead find another way to add implementation support. Many custom container implementations are thin wrappers around an existing container type, so delegating the index and slice interpretation functionality to the wrapped container may make more sense than reimplementing it in terms of acces to the individual entries, so I don’t think we want to lose the existing ABC warnings about the required abstract methods.
The first option that occurred to me is to add SequenceMixin and MutableSequenceMixin classes that provide __getitem__, __setitem__, and __delitem__ support along the above lines.
The other option is to instead provide standalone helper functions that accept an already bound method for the individual item access, along the following lines:
def sequence_getitem(seq, key, get_single_item=None):
if get_single_item is None:
get_single_item = seq._get_single_item
length = len(seq)
if not isinstance(key, slice):
return get_single_item(_sequence_index(key, length))
# Return same type as this instance for slices
cls = type(self)
return cls(get_single_item(i) for i in range(*key.indices(length)))
# Use like:
class MySequence(Sequence):
def _get_sequence_item(self, key):
...
def __getitem__(self, key):
return sequence_getitem(self, key, self._get_sequence_item)
The main benefit of the latter approach is that it’s more flexible when it comes to integrating it into existing code, since it doesn’t require adding a new mixin class to the inheritance heirarchy.