CPython doesn’t iterate in hash order; it just happens to produce integers from -5 to 256 in order. Integers outside that range don’t work.
>>> a = [*range(1, 50), *range(61, 101)]
>>> a
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100]
Although, if we want the benefits of range instead of loading everything into memory, I think another possibility with a broader use case (for fun, not at all advocating for it) could be adding an optional filter parameter to the range constructor.
>>> a = range(1, 101, filter=lambda x: x < 50 and x > 60)
I may have changed my mind on this one
a = range(60)
b = range(20, 33)
def skip(a, b):
    return [i for i in a if i not in b]
If skip returns a list, it would still iterate the same way.
I don’t see me using it, but it’s not a bad idea.
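If the point is to avoid loading everything into memory, the same idea can be written as a generator. This is only a minimal sketch; the name `skip_lazy` is made up for illustration. Because `b` is a `range`, each `i not in b` test is an arithmetic check rather than a scan.

```python
def skip_lazy(a, b):
    """Yield the items of a that are not in b, one at a time."""
    for i in a:
        if i not in b:
            yield i


a = range(60)
b = range(20, 33)
result = list(skip_lazy(a, b))
print(result[18:23])  # [18, 19, 33, 34, 35], the values around the gap
```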
It doesn’t always even produce small ints in order:
>>> s = set(range(20))
>>> for i in range(0, 20, 2): s.remove(i)
...
>>> for i in range(0, 20, 2): s.add(i)
...
>>> s
{1, 3, 5, 7, 9, 6, 11, 4, 13, 0, 15, 8, 17, 2, 19, 12, 14, 16, 18, 10}
You can’t assume anything about the order a set will iterate its elements.
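If a set is used to do the exclusion anyway, sorting the result makes the order explicit instead of accidental. A minimal sketch, using the 1 to 100 example from earlier in the thread:

```python
# Build the wanted values with set arithmetic, then impose the order explicitly.
wanted = set(range(1, 101)) - set(range(50, 61))
ordered = sorted(wanted)  # does not depend on how the set happens to iterate
```

This still materialises every value, so it gives up the memory benefit of range.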
I think you meant x < 50 or x > 60, but if that were to be your strategy, you would be better off just using filter(lambda x: x < 50 or x > 60, range(1, 101)). Since this triggers a function call for every item, itertools.chain(range(1, 50), range(61, 101)) would work just fine. Alternatively, you could have a class that implements __add__ to wrap itertools.chain to concatenate iterables with +.
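For comparison, here is a minimal sketch of those three options side by side. `Chained` is a made-up name for the wrapper class, not something that exists in itertools.

```python
from itertools import chain


class Chained:
    """Hypothetical wrapper that lets iterables be concatenated with +."""

    def __init__(self, *iterables):
        self.iterables = iterables

    def __add__(self, other):
        # Return a new wrapper that iterates self first, then other.
        return Chained(*self.iterables, other)

    def __iter__(self):
        return chain.from_iterable(self.iterables)


# Option 1: filter, which calls the lambda once per item.
filtered = list(filter(lambda x: x < 50 or x > 60, range(1, 101)))

# Option 2: chain the two ranges on either side of the gap.
chained = list(chain(range(1, 50), range(61, 101)))

# Option 3: the same chaining, spelled with +.
added = list(Chained(range(1, 50)) + range(61, 101))
```

All three produce the same 89 integers; only the first one runs Python code per element.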
I think the use case is too specific. Someone may want more than one range of numbers with more than one range of exemptions. What about negative steps? Steps other than +/- 1? What about iterating over the result in reverse? Or giving the result its own start, stop and step? What about __contains__?
(There’s probably a way to do all that by calculations on sub-range arguments and use of next on range iterators/reversed iterators, i.e. without creating a set of all possible integers and then iterating through that in order, but I haven’t done it.)
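Two of those questions can be answered without expanding anything, at least as a rough sketch. The helper names `contains` and `iter_reversed` are made up here. Membership leans on the fact that `x in range(...)` is computed arithmetically for any step, and reverse iteration assumes the kept ranges are non-overlapping and listed in ascending order.

```python
from itertools import chain


def contains(value, base, skips):
    """True if value is in base and in none of the skip ranges."""
    return value in base and not any(value in s for s in skips)


def iter_reversed(ranges):
    """Iterate a list of ascending, non-overlapping ranges from the far end."""
    return chain.from_iterable(reversed(r) for r in reversed(ranges))


base = range(0, 100, 5)
skips = [range(20, 60, 10)]  # excludes 20, 30, 40 and 50

print(contains(25, base, skips))  # True: in base, not skipped
print(contains(30, base, skips))  # False: skipped
print(contains(26, base, skips))  # False: not a multiple of 5

kept = [range(1, 50), range(61, 101)]
print(list(iter_reversed(kept))[:3])  # [100, 99, 98]
```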
In my cursory testing, it seems that one of the fastest ways to accomplish this is by chaining the ranges in the gaps. Here’s a rough skip_range class that builds on @Rosuav's MultiRange idea.
from __future__ import annotations
from collections.abc import Iterator
from itertools import chain
class skip_range:
    def __init__(self, base: range, *skips: range) -> None:
        if base.step != 1 or any(s.step != 1 for s in skips):
            raise ValueError('`skip_range` ranges must have a step of 1')
        self.base = base
        self.skips = list(skips)
        self.ranges = tuple(self._build_ranges(*skips))
    def _build_ranges(self, *skips: range):
        start = self.base.start
        stop = self.base.stop
        ranges = list[range]()
        for skip in sorted(skips, key=lambda r: r.start):
            rng = range(min(skip.start, start), min(skip.start, stop))
            if rng:
                ranges.append(rng)
            start = max(skip.stop, rng.stop, start)
        if start < stop:
            ranges.append(range(start, stop))
        return ranges
    def __repr__(self) -> str:
        return f'{self.ranges}'
    def __iter__(self) -> Iterator[int]:
        return chain(*self.ranges)
    def __len__(self) -> int:
        return sum(len(r) for r in self.ranges)
    def __sub__(self, rng: range) -> skip_range:
        return skip_range(self.base, *(self.skips + [rng]))
Here’s some rough performance testing using IPython’s %timeit magic:
List Comprehension
>>> skip = [range(50, 100), range(200, 300), range(900, 1000)]
>>> %timeit [i for i in range(1000) if not any(i in s for s in skip)]
343 μs ± 1.39 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
set
>>> nums = ... # set of skip ranges
>>> %timeit [i for i in range(1000) if i not in nums]
30.6 μs ± 117 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
skip_range
>>> s = skip_range(range(1000), *skip)
>>> %timeit list(s)
7.73 μs ± 20.1 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
These are not perfect tests, but it definitely seems that pre-computing the ranges and then chaining them is faster than the other options.
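For anyone without IPython, a comparison along the same lines can be run with the standard timeit module. This is a sketch under assumptions: the set is built with `chain.from_iterable`, which is only one plausible way to do it, and the gap ranges are written out by hand rather than computed. The numbers will differ by machine.

```python
from itertools import chain
from timeit import timeit

base = range(1000)
skip = [range(50, 100), range(200, 300), range(900, 1000)]

# Assumption: one way to build the set of skipped values.
nums = set(chain.from_iterable(skip))

# Gaps between the skip ranges, precomputed by hand for this example.
gaps = (range(0, 50), range(100, 200), range(300, 900))


def by_membership():
    # Test every value against every skip range.
    return [i for i in base if not any(i in s for s in skip)]


def by_set():
    # Test every value against a prebuilt set.
    return [i for i in base if i not in nums]


def by_chain():
    # No per-item test: just walk the precomputed gaps.
    return list(chain(*gaps))


runs = 200
for fn in (by_membership, by_set, by_chain):
    seconds = timeit(fn, number=runs)
    print(f"{fn.__name__}: {seconds / runs * 1e6:.1f} μs per call")
```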
If you’re skipping part of a range then usually there’s a reason why. The if cond: continue section provides a great place to leave a comment.
CODEPOINT_MIN = 0x0000
CODEPOINT_SUP = 0x11_0000
CODEPOINT_RANGE = range(CODEPOINT_MIN, CODEPOINT_SUP)
SURROGATE_MIN = 0xD800
SURROGATE_SUP = 0xE000
SURROGATE_RANGE = range(SURROGATE_MIN, SURROGATE_SUP)
for codepoint in CODEPOINT_RANGE:
    c = chr(codepoint)
    if codepoint in SURROGATE_RANGE:
        # Surrogate range not encodable
        continue
    c_utf_8_bytes = c.encode()
    assert c_utf_8_bytes[0] < 0xF5
Yeah
You could also just leave the comments at the start of the for loop, e.g.:
# Exclude (skip) non-encodable characters.
for c in range(CODEPOINT_MIN, CODEPOINT_SUP).skip(SURROGATE_RANGE):
    c = chr(c)
    c_utf_8_bytes = c.encode()
    assert c_utf_8_bytes[0] < 0xF5
Still, this would not require a builtin. Perhaps an addition could be made to the itertools module?
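As things stand, range has no skip method, but the code point example can already be written with itertools.chain. A minimal sketch, assuming it is acceptable to spell out the two halves around the surrogates by hand; the `checked` counter is only there to show how many code points were visited.

```python
from itertools import chain

CODEPOINT_RANGE = range(0x0000, 0x11_0000)
SURROGATE_RANGE = range(0xD800, 0xE000)

# Exclude (skip) the non-encodable surrogates by chaining the two halves.
encodable = chain(
    range(CODEPOINT_RANGE.start, SURROGATE_RANGE.start),
    range(SURROGATE_RANGE.stop, CODEPOINT_RANGE.stop),
)

checked = 0
for codepoint in encodable:
    c_utf_8_bytes = chr(codepoint).encode()
    assert c_utf_8_bytes[0] < 0xF5
    checked += 1
```

The comment on the chain call carries the reason for the gap, much as the comment above the continue statement does in the loop version.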
Indeed there is. I coded it and wrote a blog entry on my solution. It takes many ranges defining the ints that the sparse_range could generate, plus many ranges of ints to be excluded. When you call the instance with its top-level start, stop, and step, it picks from the original inclusion and exclusion ranges, in order, without expanding any of them, so memory is conserved.