PEP 204 - Range Literals: Getting closure

Melendowski · March 22, 2023, 1:49am

Maybe my theory of complex numbers is rusty but reading the mgrid/ogrid docs, I don’t see how the use of complex numbers (specifically their magnitude) makes the way those two things operate intuitive at all.

Is it supposed to be some shorthand for differentiating inclusive vs exclusive ends?

If range literals are accepted, why not change the grammar to be more like an actual interval from the math definition, using [] and ()?

CAM-Gerlach · March 22, 2023, 3:12am

As others have mentioned, @jbo , if a resulting PEP tries to be all things to all people and incorporate too many “crazy” ideas, it is almost certain to be rejected, so I suggest trying to keep things simple and focused. At the same time, the subset of functionality you do include has to offer a compelling-enough set of real-world advantages for users to be worth meeting the high bar of new syntax in the language.

It’s a difficult balance to strike, and it may simply be that there is no point along this optimization curve where both criteria meet the necessary thresholds for this overall feature. However, IMO, your best shot is to first gather the main ideas people are interested in and then incrementally pare them down to only the subset that has the most widespread interest and the fewest practical challenges and concerns, and try to see if consensus solidifies on that. If it does, then you have a shot at finding a sponsor and proposing a draft PEP.

If people are still going in different directions, you can exercise some discretion on your end, pare down the ideas to just the most viable and impactful, and then write a pre-PEP and see if that gains consensus. If people disagree with you, that will at least motivate them to write a competing pre-PEP, and if there’s enough interest, at least one can be formally proposed as a PEP.

Don’t forget about walking uphill to school in both directions

malemburg · March 22, 2023, 10:14am

To me, the only slightly more elegant solution instead of writing:

for i in range(10):
    print(i)

would be to use math style notation, e.g.

for i in 0,...,9:
    print(i)

That is: use the Ellipsis as indicator of an integer iterator, with both ends included in the iterator range.

Note that the above part after the “in” already is valid Python. It’s the tuple (0, Ellipsis, 9), so there isn’t much to add in terms of syntax. The compiler would just have to detect the notation and convert it to a range() iterator.
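To make the idea concrete, here is a minimal sketch of that conversion done at runtime. The helper name ellipsis_range is hypothetical; the suggestion above is that the compiler would perform the rewrite, not a function.

```python
def ellipsis_range(spec):
    """Turn a (start, ..., stop) tuple into an inclusive range.

    Hypothetical helper: it only imitates, at runtime, the rewrite the
    compiler would have to perform.
    """
    if len(spec) != 3 or spec[1] is not Ellipsis:
        raise ValueError("expected a tuple of the form (start, ..., stop)")
    start, _, stop = spec
    # Both ends are included, so the exclusive stop of range() is stop + 1.
    return range(start, stop + 1)


for i in ellipsis_range((0, ..., 9)):
    print(i)  # prints 0 through 9, one per line
```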

However, the above is not really a big readability saver and performance-wise, these for i in range() style loops don’t come up a lot in production code (they do in testing code, but performance is not that relevant in tests), so for me, the added language complexity doesn’t pay off.

Rosuav · March 22, 2023, 10:18am

The fact that it’s already valid Python must surely be a strong point against it, though?

malemburg · March 22, 2023, 11:02am

You can also put it that way, yes

>>> for i in 0,...,9:
...     print(i)
...
0
Ellipsis
9

Changing the compiler would introduce a backwards incompatible new interpretation. Albeit only in a very very rare case.

steven.daprano · March 22, 2023, 11:08am

The ellipsis … in maths is not a specific mathematical symbol or notation, it just carries its everyday usage of indicating “the pattern continues” or “values have been left out to save space”.

The usual notations I am aware of for indicating a range of values are variants of:

Inequality notation 0 <= i <= 9
Interval notation such as [0, 9] or [0, 10).

I suppose we could include an explicit set {0, 1, … 9}

(All assume that i is implicitly or explicitly an integer and not a real.)

If we left the commas out, that would match the common usage for closed intervals, e.g. 0...9. That might also work for character: 'a'...'z'.
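For comparison, this is roughly what those closed intervals cost to spell in today's Python. Both helper names (closed_range, char_range) are hypothetical, and the character version assumes ordering by Unicode code point.

```python
def closed_range(start, stop):
    """Integers i with start <= i <= stop (hypothetical helper)."""
    return range(start, stop + 1)


def char_range(first, last):
    """Characters from first to last inclusive, by code point (hypothetical helper)."""
    return (chr(code) for code in range(ord(first), ord(last) + 1))


# [0, 9] and [0, 10) describe the same integers.
assert list(closed_range(0, 9)) == list(range(0, 10))

letters = "".join(char_range("a", "z"))
print(letters)  # abcdefghijklmnopqrstuvwxyz
```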

Maybe ellipsis could do double-duty as an operator and a singleton object? I don’t think that would be ambiguous, but it would allow the nasty looking ... ... ... to mean “the range from Ellipsis to Ellipsis”, which presumably would be a runtime error, but syntactically legal.

Yeah, I kinda feel that as cute as this ellipsis notation is, its main use is to be taught to beginners

ntessore · March 22, 2023, 11:19am

To me, the beauty of the [::] range literal from PEP 204 was that you are not introducing totally new semantics if you imagine that a naked [] is a bit like slicing an infinite field of integers. Admittedly, the picture doesn’t really work for negative indices, but I still think there’s not much of a learning curve here.

steven.daprano · March 22, 2023, 11:58am

Slice objects are not range objects, and vice versa. Using the same syntax for both would be a mistake.

Despite the seeming similarities, there are many differences between the two, see for example:

As of Python 3.12, slice objects are hashable and can be used as dict keys. Range objects are also hashable, so if we use the same syntax for both, a subscript like obj[2:30:3] is ambiguous: is it a slice object, or a range object? For backwards compatibility, it would have to be a slice object, but that would mean that the same syntax makes different things depending on whether it is inside or outside of a subscript:

a = 2:20:3  # Outside of a subscript, it is a range object.
obj[2:20:3]  # Inside a subscript, it is a slice object.

and that’s going to be annoying and confusing. Consider somebody helpfully refactoring code like this:

a = start:stop:step
result = obj[a]

into the one liner result = obj[start:stop:step] and then finding that the behaviour changes. Or vice versa.
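The difference is easy to see with today's objects. The following sketch uses only the existing slice() and range() built-ins, with the same numbers as the example above, and a list of length 10 chosen arbitrarily for illustration.

```python
s = slice(2, 20, 3)
r = range(2, 20, 3)

# A range is a sequence of integers in its own right.
print(list(r))  # [2, 5, 8, 11, 14, 17]

# A slice has no values until it is applied to a sequence of known length.
print(s.indices(10))  # (2, 10, 3)
print(list(range(10))[s])  # [2, 5, 8]

# Slices are not iterable, so they cannot stand in for a range in a for loop.
try:
    iter(s)
except TypeError:
    slice_is_iterable = False
else:
    slice_is_iterable = True
print(slice_is_iterable)  # False

# Using a range as a subscript is an error for built-in sequences.
try:
    list(range(10))[r]
except TypeError:
    range_as_index_ok = False
else:
    range_as_index_ok = True
print(range_as_index_ok)  # False
```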

steven.daprano · March 22, 2023, 12:04pm

You would first have to teach people what an infinite field of integers is.

At least you didn’t mention monads

ntessore · March 22, 2023, 5:59pm

I notice that opinions on here seem to skew a little towards — at a guess — systems programming and web. Scientific programming in 2023 is basically Python ifs and fors on top of low-level array operations. These are codes in production and they are littered with ranges.

2pi360 · March 22, 2023, 7:29pm

I would add that in scientific programming another widely used language is R, and in R ranges like 1:100 are a basic feature.

jbo · March 22, 2023, 7:36pm

Comparing range literals of other languages:

Python

range(1, 10, 2)

Ruby (dots are arguably reversed and step is a method)

(1..9).step(2)
(1...10).step(2)

Rust (step is a method)

(1..10).step_by(2)

Perl (no step support)

(1..9)

Haskell (step is the second element minus the first: 3 - 1 = 2)

[1,3..10]

Scala (words can make it longer than just calling Range)

1 to 9 by 2
1 until 10 by 2

Julia and Matlab (step in the middle)

1:2:9

R (no step support)

1:9

Swift (explicit dots, but no step support)

1..<10
1...9
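Several of the forms above include the end point, which Python's range() does not. A minimal sketch of an inclusive variant, assuming a hypothetical helper named inclusive_range:

```python
def inclusive_range(start, stop, step=1):
    """Like range(), but stop is included when the step lands on it.

    Hypothetical helper, roughly what Ruby's (1..9).step(2) or
    Scala's `1 to 9 by 2` express.
    """
    if step == 0:
        raise ValueError("step must not be zero")
    # Move the exclusive bound one past stop, in the direction of travel.
    return range(start, stop + (1 if step > 0 else -1), step)


print(list(range(1, 10, 2)))            # [1, 3, 5, 7, 9]
print(list(inclusive_range(1, 9, 2)))   # [1, 3, 5, 7, 9]
print(list(inclusive_range(1, 10, 2)))  # [1, 3, 5, 7, 9]
print(list(inclusive_range(9, 1, -2)))  # [9, 7, 5, 3, 1]
```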

Some options for consideration:

Using dots seems complicated because of Ellipsis, but I could be wrong (e.g. this is already valid Python ....__class__)

start:stop:step    # Used in Python slices and several languages
start..stop..step  # Used in many other languages, which lack the step part
start->stop->step  # Uses an existing token. Sense of direction
start=>stop=>step  # Similar to existing token, might confuse >= <=
start:>stop:step   # Sense of direction, except for step
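On the point about dots: the sketch below shows two expressions that are already valid in current Python, which is part of why a dot-based range token would need care. The variable names are only for illustration.

```python
# `...` is the Ellipsis literal, so a fourth dot starts an attribute access.
ellipsis_type = ....__class__
print(ellipsis_type is type(Ellipsis))  # True

# Similarly, `1.` is a float literal, so `1..real` is attribute access on 1.0.
real_part = 1..real
print(real_part)  # 1.0
```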

steven.daprano · March 22, 2023, 8:46pm

Can you give some concrete examples of the sort of scientific programming you are talking about?

I’m not a big numpy user, but I understand that in numpy programming we avoid explicitly iterating over the indices of our array, and allow the library to perform that iteration at C speed.

steven.daprano · March 22, 2023, 8:53pm

This is a good review of languages with the feature, and it should go into the new PEP. Thank you.

ntessore · March 22, 2023, 8:58pm

Sure.

for i in range(maxiter):
    candidate = compute_with_fast_inner_loop(current)
    if np.fabs(candidate - desired) < tolerance:
        break
    current = propose_with_fast_inner_loop(current, candidate, desired)

The entire game is writing such high-level loops in Python. We don’t want to “hardcode” the control structure of our code at low level, only the bits that crunch numbers.

And nowadays with JIT compilers from numba, jax, etc. we actually often do write the inner loops in Python. Ranges everywhere.
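For readers who want to run the pattern, here is a self-contained toy version. The compute and propose functions are hypothetical stand-ins for the fast inner loops (here a Newton iteration for the square root of 2), not anything from a real code base.

```python
import math


def compute(x):
    # Stand-in for an expensive vectorised computation.
    return x * x


def propose(current, candidate, desired):
    # Newton step for x*x = desired; a stand-in for a real update rule.
    return current - (candidate - desired) / (2 * current)


desired = 2.0
tolerance = 1e-9
maxiter = 50
current = 1.0

for i in range(maxiter):
    candidate = compute(current)
    if math.fabs(candidate - desired) < tolerance:
        break
    current = propose(current, candidate, desired)
else:
    # Only reached when the loop runs out of iterations without a break.
    raise RuntimeError("did not converge within maxiter iterations")

print(i, current)
```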

jbo · March 22, 2023, 9:08pm

Regarding your arguments about slice syntax being confusing with the proposed range literals, I agree it is not perfect but I only proposed things that would be consistent with existing slices and could in fact return a slice object which would have more methods, like __iter__ and __len__ and would call __range__ on the datatype used by start/stop elements whenever accessed this way.

Some comments noted that slices and ranges are conceptually different things, although I can argue that the list class and the range class both operate with slices and interpret them differently. Lists wrap around negative values because that is considered useful (they could just raise ValueError instead), while range objects work in the integer space, where negative indexes exist.
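One way to read that distinction with current built-ins is sketched below; the variable names are only for illustration.

```python
items = list(range(10))
numbers = range(10)

# Both accept a slice, and both treat a negative bound as "count from the end".
print(items[-3:])    # [7, 8, 9]
print(numbers[-3:])  # range(7, 10)

# The difference is in the values: a range can contain negative integers,
# whereas a negative list index is only ever a position.
print(list(range(-3, 3)))  # [-3, -2, -1, 0, 1, 2]
print(-2 in range(-3, 3))  # True
print(items[-2])           # 8
```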

So maybe I don’t want to define range literals, but slice literals that can produce range objects as a byproduct.

— Edit: I meant to send the part below as a separated post. —

I decided to look into the std lib for examples (there are 5163 matches for “range”) and I see that most of the time the range objects are consumed directly as iterables, rarely if ever reused.

If there is interest in pursuing these things, I can post the real usage examples:

- range is often used together with len. A range literal could take advantage of that and interpret a collections.abc.Sized object that does not implement __range__ as its __len__ value.
- Ranges are sometimes used as sets/frozensets because their values do not repeat. They could implement the collections.abc.Set protocol, similar to what was done for dict_keys and dict_items objects.
- chr/ord operations are low-hanging fruit for using str/bytes ranges instead.
- Several classes could independently add range support for iteration:
  - datetime.date and datetime.datetime
  - ipaddress.IPv4Address and ipaddress.IPv6Address
  - enum.Enum subclasses
- Many functions that take 2 or 3 arguments could be simplified by taking range objects instead:
  - random.randrange, whose docs say it is roughly equivalent to choice(range(start, stop, step))
  - os.closerange, which takes a range of file descriptors
  - formatting of ranges with f-strings (e.g. some difflib helper functions)
- As proposed in this thread, we could combine ranges into a cartesian product, similar to itertools.product, using __matmul__ and replace nested for loops with a single one.
- To change the step of a range starting from zero, we need to explicitly add the zero that was previously unnecessary. This is cosmetic but makes the code cleaner, similar to omitting the leading zero in float numbers.
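A few of these patterns, written with what the standard library offers today. This is only a sketch: date_range is a hypothetical helper (datetime.date has no range support), and the sample values are arbitrary.

```python
import datetime
import itertools
import random

# range(len(...)) is the common pairing mentioned above.
words = ["a", "b", "c"]
print(list(range(len(words))))  # [0, 1, 2]

# Ranges already support membership tests, much like a set of integers.
print(6 in range(0, 10, 2))  # True

# random.randrange draws from the same values as choice(range(...)).
value = random.randrange(1, 10, 2)
print(value in range(1, 10, 2))  # True

# itertools.product collapses nested for loops over ranges into one loop.
pairs = list(itertools.product(range(2), range(3)))
print(pairs)  # [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]


def date_range(start, stop, step=datetime.timedelta(days=1)):
    """Yield dates from start up to, but not including, stop.

    Hypothetical helper; assumes a positive step.
    """
    current = start
    while current < stop:
        yield current
        current += step


days = list(date_range(datetime.date(2023, 3, 22), datetime.date(2023, 3, 25)))
print([d.isoformat() for d in days])  # ['2023-03-22', '2023-03-23', '2023-03-24']
```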

MRAB · March 23, 2023, 12:03am

I really don’t like the bare colons of 2:20:3. It’s especially bad if it’s part of a for loop, as in for i in 2:20:3:. I’d much prefer them to be in parentheses (2:20:3) or brackets [2:20:3].

As for ranges vs slices, perhaps we could use parentheses for one and brackets for the other, e.g. [2:20:3] for a range and (2:20:3) for a slice. The parentheses could be omitted when used in subscripts.

2pi360 · March 23, 2023, 12:58am

Thanks for the overview. It would be good to see how these languages implement array indexing for comparison. I only know that R uses the same notation for ranges and indexing: 3:10 means all numbers from 3 to 10, and my_vector[3:10] gives the elements of the vector from the third to the tenth. No confusion encountered ever.
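For readers who do not know R, a rough Python translation of those two expressions follows. The contents of my_vector are made up for illustration; the point is the shift from R's 1-based, inclusive convention to Python's 0-based, half-open one.

```python
my_vector = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120]

# R's 3:10 includes both ends, so the Python range needs stop + 1.
r_style_range = list(range(3, 10 + 1))
print(r_style_range)  # [3, 4, 5, 6, 7, 8, 9, 10]

# R's my_vector[3:10] is 1-based and inclusive: third through tenth element.
# In Python's 0-based, half-open slicing that is my_vector[2:10].
third_to_tenth = my_vector[2:10]
print(third_to_tenth)  # [30, 40, 50, 60, 70, 80, 90, 100]
```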

Rosuav · March 23, 2023, 1:15am

People say the same thing about commas, too. And the same solution will be available. If the repr includes surrounding parentheses, would that make it easier?

RoadrunnerWMC · March 23, 2023, 1:41am

After reading the debate here about other languages’ inclusive/exclusive range syntax, I’d like to highlight Rust’s, which nobody’s mentioned yet (at least the inclusive version) but I think is pretty intuitive:

1..10     // exclusive
1..=10    // inclusive

Maybe this should be added to the summary post above?