Range(0, 10, 0.5) with float step

I have a list generated by:

[x * 0.5 for x in range(0, 21)]

This gives me float values from 0.0 to 10.0 with a step of 0.5:

0.0, 0.5, 1.0, 1.5, 2.0, …, 10.0

I want to improve the granularity by expressing the sequence in a fractional step format, conceptually like:
range(0, 10.0, 0.5)

But since Python’s built-in range() does not accept float steps, this exact syntax does not work.

Could you please suggest the cleanest and most Pythonic way to rewrite this loop so that it looks closer to the fractional step idea?

def anyrange(start, end=None, step=1.0):
    if end is None:
        end = start
        start = 0.0

    if step == 0:
        raise ValueError("step cannot be 0")

    # Recompute each value as start + i*step rather than accumulating
    # repeated additions, to limit floating-point drift.
    i = 0
    current = start

    if step > 0:
        while current < end:
            yield current
            i += 1
            current = start + i * step
    else:
        while current > end:
            yield current
            i += 1
            current = start + i * step
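
For example, a quick check of the generator above:

>>> list(anyrange(0.0, 10.0, 0.5))[:5]
[0.0, 0.5, 1.0, 1.5, 2.0]
>>> list(anyrange(10.0, 0, -2.5))
[10.0, 7.5, 5.0, 2.5]
>>> list(anyrange(3))
[0.0, 1.0, 2.0]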

I would like to request a feature for newer Python versions:

It would be great if the built-in range() function could accept floating-point numbers, for example:

range(0, 10.0, 0.5)

Currently, range() only works with integers. Allowing floats would make it much more versatile and convenient for numerical tasks.

For reference, NumPy already provides this functionality using numpy.arange():

import numpy as np

arr = np.arange(0, 10, 0.5)
print(arr)
# Output: [0.  0.5  1.  1.5  2. ... 9.5]

Adding similar support to the built-in range() would make it easier for users to work with floating-point sequences without relying on external libraries.

You can do the same with this:

for i in range(0, 21):
    print(f'{i / 2}', end=' ')

It just takes getting a little more familiar with Python’s capabilities, is all.

As the numpy docs say, even there arange() is discouraged for non-integer steps: “floating point surprises” make it error-prone.

They recommend numpy.linspace() instead.

Example:

>>> import numpy
>>> numpy.arange(0.0, 1.0, 1/3) # does not include 1.0
array([0.        , 0.33333333, 0.66666667])
>>> numpy.arange(0.0, 1.0, 1/49) # does include 1.0
array([0.        , 0.02040816, 0.04081633, 0.06122449, 0.08163265,
       0.10204082, 0.12244898, 0.14285714, 0.16326531, 0.18367347,
       0.20408163, 0.2244898 , 0.24489796, 0.26530612, 0.28571429,
       ...
       0.71428571, 0.73469388, 0.75510204, 0.7755102 , 0.79591837,
       0.81632653, 0.83673469, 0.85714286, 0.87755102, 0.89795918,
       0.91836735, 0.93877551, 0.95918367, 0.97959184, 1.        ])

Although it’s not actually 1.0 - it was rounded up to 1.0 for display. Its actual value:

>>> 1/49*49
0.9999999999999999

Which is again rounded for display - its actual actual :wink: value:

>>> import decimal
>>> decimal.Decimal(1/49*49)
Decimal('0.99999999999999988897769753748434595763683319091796875')
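
And for the sequence in the original post, the linspace equivalent is straightforward (num counts points, not steps, so 21 points gives the 0.5 spacing; a quick sketch):

>>> import numpy
>>> numpy.linspace(0.0, 10.0, 21).tolist()
[0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10.0]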

One solution is to allow real steps instead, so one can specify a fraction as a step as needed.

I know numeric_range requires a third-party library, but if a special case can be made for any dependency, it can be made for more-itertools:

>>> from more_itertools import numeric_range
>>> list(numeric_range(0, 10.0, 0.5))
[0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5]

Fractions, Decimals, and even datetimes & timedeltas are also supported by it.
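
For example (a small sketch; with exact types the values stay exact):

>>> from fractions import Fraction
>>> from decimal import Decimal
>>> list(numeric_range(Fraction(0), Fraction(1), Fraction(1, 3)))
[Fraction(0, 1), Fraction(1, 3), Fraction(2, 3)]
>>> list(numeric_range(Decimal("0.0"), Decimal("2.0"), Decimal("0.5")))
[Decimal('0.0'), Decimal('0.5'), Decimal('1.0'), Decimal('1.5')]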


Generate the numerators of 0/2, 1/2, 2/2, 3/2, etc., then divide each numerator by 2. This reduces the kind of error that can accumulate when adding a series of floating-point values.

[x/2 for x in range(0, 10*2 + 1)]

In general, [x/d for x in range(0, 10*d + 1)] for a step of 1/d from 0 to 10.
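
A quick illustration of the accumulation this avoids (0.5 itself is exactly representable, so the drift only shows with a step like 0.1):

total = 0.0
for _ in range(10):
    total += 0.1       # repeated addition lets rounding error accumulate
print(total)           # 0.9999999999999999
print(10 / 10)         # 1.0: computing each value from an int avoids the drift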


Right. I do wish for datetime/timedelta support sometimes. Conceptually, range can support any type that supports addition and comparisons. We may get questions from newcomers asking why range(0, 49, 1/49) isn’t behaving as expected, but that isn’t really any different from, or worse than, the numerous newbie questions we already get about floating-point operations today.
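
A rough sketch of that idea (generic_range is just an illustrative name, reusing the multiply-instead-of-accumulate trick from the first post):

from datetime import datetime, timedelta

def generic_range(start, stop, step):
    zero = step - step                      # a zero of the step's own type
    ascending = step > zero
    i, current = 0, start
    while (current < stop) if ascending else (current > stop):
        yield current
        i += 1
        current = start + i * step

list(generic_range(datetime(2000, 1, 1), datetime(2000, 1, 2), timedelta(hours=6)))
# four datetimes, six hours apart: 00:00, 06:00, 12:00 and 18:00 on 2000-01-01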


Although this is prone to some of the same numeric surprises as numpy.arange():

>>> from more_itertools import numeric_range
>>> x = list(numeric_range(0.0, 1.0, 1/49))
>>> len(x)  # not 49!
50
>>> x[-3:]
[0.9591836734693877, 0.9795918367346939, 0.9999999999999999]

While the endpoint (1.0) is exactly representable, there is no integer i such that the correctly rounded result of (1/49)*i is 1. 49 happens to be the smallest int for which that’s true, so it’s not common. Hence “surprise!” :wink:.
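
A quick spot check of that:

>>> any(i * (1 / 49) == 1.0 for i in range(1000))
False
>>> 49 * (1 / 49)
0.9999999999999999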


numeric_range doesn’t quite support datetime:

from datetime import datetime, timedelta, UTC
from pprint import pprint
import zoneinfo
from more_itertools import numeric_range

AMS = zoneinfo.ZoneInfo("Europe/Amsterdam")

pprint([
    t.astimezone(UTC)
    for t in numeric_range(
        datetime(2000, 3, 26, 1, tzinfo=AMS),
        datetime(2000, 3, 26, 4, tzinfo=AMS),
        timedelta(minutes=30),
    )
])
print()
pprint([
    t.astimezone(UTC)
    for t in numeric_range(
        datetime(2000, 10, 29, 1, tzinfo=AMS),
        datetime(2000, 10, 29, 4, tzinfo=AMS),
        timedelta(minutes=30),
    )
])

outputs:

[datetime.datetime(2000, 3, 26, 0, 0, tzinfo=datetime.timezone.utc),
 datetime.datetime(2000, 3, 26, 0, 30, tzinfo=datetime.timezone.utc),
 datetime.datetime(2000, 3, 26, 1, 0, tzinfo=datetime.timezone.utc),
 datetime.datetime(2000, 3, 26, 1, 30, tzinfo=datetime.timezone.utc),
 datetime.datetime(2000, 3, 26, 1, 0, tzinfo=datetime.timezone.utc),
 datetime.datetime(2000, 3, 26, 1, 30, tzinfo=datetime.timezone.utc)]

[datetime.datetime(2000, 10, 28, 23, 0, tzinfo=datetime.timezone.utc),
 datetime.datetime(2000, 10, 28, 23, 30, tzinfo=datetime.timezone.utc),
 datetime.datetime(2000, 10, 29, 0, 0, tzinfo=datetime.timezone.utc),
 datetime.datetime(2000, 10, 29, 0, 30, tzinfo=datetime.timezone.utc),
 datetime.datetime(2000, 10, 29, 2, 0, tzinfo=datetime.timezone.utc),
 datetime.datetime(2000, 10, 29, 2, 30, tzinfo=datetime.timezone.utc)]

Though I mostly blame datetime for this. Why would you even mess up addition of small time-deltas to timezone aware datetimes? :upside_down_face:


Not sure what you’re asking, but “what time is it in thirty minutes” is a fundamentally hard question when DST is switching past you :slight_smile:


Aren’t standard floating point precision issues just to be expected when doing any similar math with floats?

It’s not really that surprising to me anyway. If it’s an actual bug, I’m keen to get stuck into it though.

[Edit] As Tim hints, by showing the length is 50 not 49, this ‘floating point off by one’ boils down to:

distance = 1.0
step = 1 / 49
_zero = type(step)(0)    # 0.0
q, r = divmod(distance, step)    # e.g. r == 7.979727989493313e-17 
_len = int(q) + int(r != _zero)    # int(49.0) + 1 == 50, hence the extra element

OP asked for a range with a float step. numeric_range is exactly that, with all the associated issues one would expect. IMHO it’s worth a comment and a sentence in the docs warning about this. It’s not worth wrapping new Python users up in cotton wool, depriving them of the chance to learn how floats really work, over something like a range with a float step, the correctness of which is open to interpretation.

Far better to base it on int ranges, or to write your own generator to express exactly which range you want.

Really interesting - thanks.

On the surface, it looks to me as if the implementation of numeric_range could be made generic (over some Protocol defining the arithmetic operations it does use). It definitely doesn’t do anything specialised with datetimes or timedeltas in particular. So the user could pick dateutil, or any other library that supports + and - for time like objects, that behaves around daylight savings clock changes however they please, and numeric_range will probably work fine with it.
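
Something like this hypothetical sketch (the names are mine, not more_itertools’):

from typing import Any, Protocol

class RangeElement(Protocol):
    # Roughly the operations a numeric_range-style iterator needs
    # from its element type.
    def __add__(self, other: Any, /) -> "RangeElement": ...
    def __sub__(self, other: Any, /) -> Any: ...
    def __lt__(self, other: Any, /) -> bool: ...
    def __eq__(self, other: Any, /) -> bool: ...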

But changing, or ‘fixing’, datetime would have a huge blast radius. And even with the daylight saving time clock changes, your code example intentionally coerces to UTC. It behaves more intuitively if the two t.astimezone(UTC)s are replaced with t.astimezone(AMS) (though I don’t claim this is correct: is AMS time at a “clocks change” supposed to skip an hour entirely, or repeat an hour?):

[datetime.datetime(2000, 3, 26, 1, 0, tzinfo=zoneinfo.ZoneInfo(key='Europe/Amsterdam')),
 datetime.datetime(2000, 3, 26, 1, 30, tzinfo=zoneinfo.ZoneInfo(key='Europe/Amsterdam')),
 datetime.datetime(2000, 3, 26, 2, 0, tzinfo=zoneinfo.ZoneInfo(key='Europe/Amsterdam')),
 datetime.datetime(2000, 3, 26, 2, 30, tzinfo=zoneinfo.ZoneInfo(key='Europe/Amsterdam')),
 datetime.datetime(2000, 3, 26, 3, 0, tzinfo=zoneinfo.ZoneInfo(key='Europe/Amsterdam')),
 datetime.datetime(2000, 3, 26, 3, 30, tzinfo=zoneinfo.ZoneInfo(key='Europe/Amsterdam'))]

[datetime.datetime(2000, 10, 29, 1, 0, tzinfo=zoneinfo.ZoneInfo(key='Europe/Amsterdam')),
 datetime.datetime(2000, 10, 29, 1, 30, tzinfo=zoneinfo.ZoneInfo(key='Europe/Amsterdam')),
 datetime.datetime(2000, 10, 29, 2, 0, tzinfo=zoneinfo.ZoneInfo(key='Europe/Amsterdam')),
 datetime.datetime(2000, 10, 29, 2, 30, tzinfo=zoneinfo.ZoneInfo(key='Europe/Amsterdam')),
 datetime.datetime(2000, 10, 29, 3, 0, tzinfo=zoneinfo.ZoneInfo(key='Europe/Amsterdam')),
 datetime.datetime(2000, 10, 29, 3, 30, tzinfo=zoneinfo.ZoneInfo(key='Europe/Amsterdam'))]

(I had to install tzdata)


datetime.datetime(2000, 3, 26, 2, 30, tzinfo=zoneinfo.ZoneInfo(key='Europe/Amsterdam')) doesn’t really exist. The clock in the Amsterdam timezone skips straight from (2000, 3, 26, 1, 59) to (2000, 3, 26, 3, 0). The .astimezone(UTC) is there in my code to reveal that something ‘wrong’ is going on.

Because datetime.datetime(2000, 3, 26, 2, 30, tzinfo=zoneinfo.ZoneInfo(key='Europe/Amsterdam')) doesn’t exist, Python coerces it to the same instant as datetime.datetime(2000, 3, 26, 3, 30, tzinfo=zoneinfo.ZoneInfo(key='Europe/Amsterdam')) when you convert time zones. That seems better than raising an error. (Even if I think ideally it wouldn’t have been possible to create the nonexistent datetime.datetime(2000, 3, 26, 2, 30, tzinfo=zoneinfo.ZoneInfo(key='Europe/Amsterdam')) object in the first place.)
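
A quick check that both local times resolve to the same instant:

>>> from datetime import datetime, UTC
>>> from zoneinfo import ZoneInfo
>>> AMS = ZoneInfo("Europe/Amsterdam")
>>> datetime(2000, 3, 26, 2, 30, tzinfo=AMS).astimezone(UTC)
datetime.datetime(2000, 3, 26, 1, 30, tzinfo=datetime.timezone.utc)
>>> datetime(2000, 3, 26, 3, 30, tzinfo=AMS).astimezone(UTC)
datetime.datetime(2000, 3, 26, 1, 30, tzinfo=datetime.timezone.utc)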

I don’t think datetime.datetime can be ‘fixed’ at this point. As you imply, any changes to the existing classes would break a lot of libraries.
I do think the datetime module can be fixed, by including a class in there that can be characterised approximately as

from dataclasses import dataclass
from datetime import datetime, UTC
from typing import TYPE_CHECKING
import zoneinfo

@dataclass(slots=True)
class TZDatetime:
    utc_dt: datetime
    tz: zoneinfo.ZoneInfo

    def __post_init__(self):
        if TYPE_CHECKING:
            assert self.utc_dt.tzinfo in {None, UTC}

with the appropriate utility methods & properties attached.[1]

This would put a class into the standard library that can be used as an interface between different libraries for timezone-aware datetimes. (Currently everyone seems to use datetime.datetime for the interface which, while probably the correct choice, does blow up in my face from time to time.)


  1. I don’t mean to imply this would be trivial to implement. ↩︎


I already pointed to the numpy docs, where numpy.arange() has the same issues. They recommend against using arange() with a floating step, pointing instead to numpy.linspace(). The latter doesn’t have a step argument, but rather a num argument, giving the (integer) number of values to produce.

That’s best practice.

>>> import numpy
>>> x = numpy.linspace(0.0, 1.0, 49)
>>> len(x)
49
>>> x[-1]
np.float64(1.0)
>>> x = numpy.linspace(0.0, 1.0, 49, endpoint=False)
>>> len(x)
49
>>> x[-1]
np.float64(0.9795918367346939)

Are you already aware of the problems with floating-point arithmetic? If not, the Wikipedia article on floats is a pretty good place to start. Given those inaccuracies, floats as steps are kind of scary. I like your (x*0.5 for x in range(0, 21)) solution. If you want something easier to read, (x*0.1 for x in range(11)) might also work since shifting the decimal place left is pretty intuitive.

Sure. Linspace works great in Matlab (in which ‘everything’ is a matrix anyway). And OP did specify range(0, 10, 0.5).

But for large sequences, Python’s native range is a far more capable indexable iterable (with extra metadata) that requires far less RAM, and I would argue that, for the small price of managing standard floating-point issues yourself, it is much more powerful for the programmer.
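
For instance (CPython; the exact byte count is an implementation detail):

>>> import sys
>>> r = range(0, 1_000_000_000)
>>> sys.getsizeof(r)      # constant-size object, however long the range
48
>>> r[123_456_789]        # O(1) indexing, nothing is materialised
123456789
>>> 999_999_999 in r      # O(1) membership test for ints
True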


That seems like a bad idea. Printing those numbers gives me:

0.0 0.1 0.2 0.30000000000000004 0.4 0.5 0.6000000000000001 0.7000000000000001 0.8 0.9 1.0

With x/10 instead, I get:

0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0


The solution would need a way to convert an expression like 1/3 to a Decimal, to avoid the problems caused by float; currently Decimal("1/3") is invalid.
For more complex expressions, it is sometimes hard to express every number with a Decimal.
For more complex expression, sometimes it is hard to express all number with Decimal.