Adding support for a subset of ISO 8601 durations in the `datetime` module

I propose adding support for a subset of ISO 8601 durations to the datetime module, which already supports parsing and formatting ISO 8601 strings for datetime objects.

In a nutshell, using the standard: "P3Y6M4DT12H30M5S" represents a duration of “three years, six months, four days, twelve hours, thirty minutes, and five seconds”.

Previous discussions on the topic:

As explained in those prior discussions, the main problem with ISO 8601 durations is that they do not all “map cleanly” onto timedelta objects (which are essentially a fixed number of days/seconds):

  • "PT1S": timedelta(seconds=1)
  • "PT1M": timedelta(minutes=1)
  • "PT1H": timedelta(hours=1)
  • "P1D": timedelta(days=1)
  • "P1W": timedelta(weeks=1)
  • "P1M": not possible to convert to timedelta, as it is a “relative” concept (*)
  • "P1Y": not possible to convert to timedelta, as it is a “relative” concept (*)

(*) To expand on the “relative” nature of months/years: adding 1 month to different dates “adds” a different amount of time:

  • datetime(2026, 1, 1) + 1 month: datetime(2026, 2, 1): 31 days
  • datetime(2026, 2, 1) + 1 month: datetime(2026, 3, 1): 28 days

(*) To manage such relative durations, dateutil relativedelta works well.
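The two month lengths above can be checked with plain datetime subtraction. This is only a small illustration using the standard library; the variable names are mine:

```python
from datetime import datetime

# "1 month" starting on 1 January 2026 versus starting on 1 February 2026
january = datetime(2026, 2, 1) - datetime(2026, 1, 1)
february = datetime(2026, 3, 1) - datetime(2026, 2, 1)

print(january.days)   # 31
print(february.days)  # 28
```

The same "P1M" string therefore corresponds to two different timedelta values depending on the start date.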

As mentioned in the last message of bpo-42094, I suggest implementing only the part of the standard that can be safely represented as a timedelta, and deferring the potential inclusion of a class for relative durations to later/never.

Examples

Recently, pip added support for relative durations for --uploaded-prior-to and ships simple regex-based parsing of PnD. uv did the same for --exclude-newer, and also only supports days.

I personally have used my own parse_iso_duration variations in multiple projects.

Proposed API

I suggest mirroring datetime.fromisoformat and adding timedelta.fromisoformat:

>>> from datetime import timedelta
>>> timedelta.fromisoformat("P2DT10H")
datetime.timedelta(days=2, seconds=36000)

For unsupported durations, we just raise ValueError:

>>> timedelta.fromisoformat("P1M")  # raises ValueError
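To make the intended behaviour concrete, here is a rough sketch of the parsing as a standalone function. Everything in it is an assumption for illustration: `parse_iso_duration` and `_PATTERN` are hypothetical names, only integer values are accepted, and weeks are allowed next to other fields, which a real implementation might want to reject.

```python
import re
from datetime import timedelta

# Hypothetical pattern: weeks, days, hours, minutes, seconds only.
# Years and months are deliberately absent, so "P1Y" and "P1M" do not match.
_PATTERN = re.compile(
    r"P(?:(?P<weeks>\d+)W)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?"
)


def parse_iso_duration(text: str) -> timedelta:
    """Parse the timedelta-compatible subset of ISO 8601 durations (sketch)."""
    match = _PATTERN.fullmatch(text)
    # Reject non-matching input and a dangling "T" such as "P1DT"
    if match is None or text.endswith("T"):
        raise ValueError(f"unsupported ISO 8601 duration: {text!r}")
    parts = {
        name: int(value)
        for name, value in match.groupdict().items()
        if value is not None
    }
    # A bare "P" has no fields at all
    if not parts:
        raise ValueError(f"unsupported ISO 8601 duration: {text!r}")
    return timedelta(**parts)


print(repr(parse_iso_duration("P2DT10H")))  # datetime.timedelta(days=2, seconds=36000)
```

With this sketch, "P1M" and "P1Y" raise ValueError because the month and year designators are simply not part of the accepted grammar.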

We can also mirror datetime.isoformat and add timedelta.isoformat:

>>> td = timedelta(days=12, hours=4)
>>> td.isoformat()
"P12DT4H"

Process

My understanding of the dev guide is that such changes do not require a PEP, but a consensus here, and agreement from the datetime maintainers.

If there is interest in the proposal, I would be happy to work on an implementation.


Small correction, but uv supports a wider range of duration syntax: Resolution | uv

I believe they support whatever jiff supports: jiff::fmt::serde::duration - Rust

For pip we wanted to start with a very minimal syntax, as it looked like there were interesting edge cases, and we didn’t want to vendor or reimplement a full parser for one argument.


You are absolutely right.
I actually use "12 hours" when generating uv.lock in most cases so I should have spotted this :)

What I checked (before writing this post) was "P1M", which uv does not support: Duration uses ‘months’ which is not allowed (same with years).

For the “cooldown” use cases: I think supporting hours/days is enough, in that space:

  • pinact has --min-age in days (for pinning GitHub Actions)
  • prek autoupdate has --cooldown-days, also in days

Because upload-time from PEP 700 is mandated to be in UTC, days (and weeks) can be taken to be fixed-length time fields there, but months and years cannot.

However, if you can’t assume UTC then a day cannot be assumed to be 86400 seconds, and your proposed mapping of days to timedelta is logically incorrect.

Only hours, minutes, and seconds should cleanly translate to a Python timedelta, whereas days, weeks, months, and years are periods and should not be represented as fixed times.


Wikipedia says that even 1D is relative if a DST/summer time change occurs that day. Setting 1D = always 24H is probably the only way to convert to timedelta.

My understanding of the standard is that P1D is indeed “relative” (which only matters outside UTC).

Note that because of the datetime semantics, adding a timedelta object to a timezone-aware datetime does not add a fixed amount of time, but obeys the “clock logic” (more details in Paul Ganssle's article):

from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

tz = ZoneInfo("America/New_York")

before = datetime(2024, 3, 9, 12, 0, tzinfo=tz)
after = before + timedelta(days=1)  # DST happening!
# 2024-03-09T12:00:00-05:00: before
# 2024-03-10T12:00:00-04:00: after (clock + 24h)

# Even though we added "a day", the new datetime is only "23h later"
(after.timestamp() - before.timestamp()) / 3600  # 23.0

In practice, this means representing P1D as timedelta(days=1) would be incorrect in terms of “fixed duration logic”, but correct in terms of datetime arithmetic when used like this:

tz = ZoneInfo("America/New_York")
before = datetime(2024, 3, 9, 12, 0, tzinfo=tz)

DAY = timedelta.fromisoformat("P1D")
after = before + 1 * DAY
# matches the semantics of "1 day later"

One caveat to consider is that PT24H would be identical to P1D in terms of timedelta representation, but this time the datetime arithmetic would not be aligned with the standard.
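If you do want the fixed 24 elapsed hours that PT24H means in the standard, one workaround is to do the addition in UTC and convert back. A minimal sketch using only the standard library; the variable names are illustrative:

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

tz = ZoneInfo("America/New_York")
before = datetime(2024, 3, 9, 12, 0, tzinfo=tz)

# Wall-clock arithmetic: same clock time on the next day ("P1D" semantics),
# even though the timedelta was written as 24 hours
wall_clock = before + timedelta(hours=24)

# Elapsed-time arithmetic: add in UTC, then convert back ("PT24H" semantics)
elapsed = (before.astimezone(timezone.utc) + timedelta(hours=24)).astimezone(tz)

print(wall_clock.isoformat())  # 2024-03-10T12:00:00-04:00
print(elapsed.isoformat())     # 2024-03-10T13:00:00-04:00
```

The timedelta is the same in both cases; only the way it is applied decides which of the two meanings you get.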

Some days are longer than others. Do leap seconds cause an issue when calculating the length of a day?

As far as CPython is concerned, leap seconds do not exist :)
There was a leap second happening at 2016-12-31 23:59:60 (so it takes 1 extra second to go from 23:59:59 until 00:00:00), but:

  • you cannot represent it in Python datetime.datetime
  • computation ignores it

from datetime import datetime, timezone

# The leap second occurred exactly between these two moments
before_leap = datetime(2016, 12, 31, 23, 59, 59, tzinfo=timezone.utc)
after_leap = datetime(2017, 1, 1, 0, 0, 0, tzinfo=timezone.utc)

# Both ignore the leap second
(after_leap - before_leap).total_seconds() # 1.0
after_leap.timestamp() - before_leap.timestamp() # 1.0

# ValueError: second must be in 0..59, not 60
datetime(2016, 12, 31, 23, 59, 60, tzinfo=timezone.utc)

Computers nowadays mainly use the “smearing” technique to handle leap seconds, so their clocks always report seconds within the range 0 to 59.

Thinking about this problem, and how timedeltas currently work, I am -1 on this whole endeavour.

Python’s timedeltas can’t distinguish between absolute time units and relative time units, and they behave differently in different contexts, resulting in incorrect datetime arithmetic.

If we look at a library like whenever, which can correctly distinguish durations between the absolute time unit of hours and the relative time unit of days, we see that adding 24 hours and adding 1 day to a datetime can result in two different numbers:

>>> whenever.ZonedDateTime(2023, 3, 26, tz="Europe/Paris").add(hours=24)
ZonedDateTime("2023-03-27 01:00:00+02:00[Europe/Paris]")

>>> whenever.ZonedDateTime(2023, 3, 26, tz="Europe/Paris").add(days=1)
ZonedDateTime("2023-03-27 00:00:00+02:00[Europe/Paris]")

However, adding a Python timedelta cannot distinguish between 24 hours and 1 day, which can lead to incorrect datetime arithmetic if you don’t understand the design choices the Python datetime module makes:

# Seemingly one day and twenty-four hours are the same fixed unit of time
>>> one_day = datetime.timedelta(days=1)
>>> twentyfour_hours = datetime.timedelta(hours=24)
>>> one_day == twentyfour_hours
True

>>> paris = zoneinfo.ZoneInfo("Europe/Paris")
>>> start = datetime.datetime(2023, 3, 26, tzinfo=paris)

# Both give midnight the next day (which is an interesting choice Python makes)
>>> start + one_day
datetime.datetime(2023, 3, 27, 0, 0, tzinfo=zoneinfo.ZoneInfo(key='Europe/Paris'))
>>> start + twentyfour_hours
datetime.datetime(2023, 3, 27, 0, 0, tzinfo=zoneinfo.ZoneInfo(key='Europe/Paris'))


# But the UTC offset changes, so the real difference is 23 hours (82800 seconds):
>>> start.utcoffset()
datetime.timedelta(seconds=3600)
>>> (start + one_day).utcoffset()
datetime.timedelta(seconds=7200)

# And asking for the difference tells you it's represented as one day, which is a correct calendar unit:
>>> (start + one_day) - start
datetime.timedelta(days=1)

# But asking for the total seconds gives you 24 hours, which is wrong: it was 23 hours
>>> ((start + one_day) - start).total_seconds()
86400.0

So we see that Python’s timedelta here acts like P1D and adds a calendar unit, but in most other places we think of a timedelta as a fixed unit of time.

A real duration object would be able to distinguish between P1D and PT24H and give you the correct calendar unit difference, or the correct fixed time difference. Conflating a timedelta with a duration will lead to further confusion and inaccuracies.
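To illustrate that distinction, here is a minimal sketch of such an object built on the standard library. `Duration`, its `days` and `exact` fields, and `add_to` are hypothetical names, not an existing API, and the sketch ignores months, years, and nonexistent or ambiguous local times:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


@dataclass(frozen=True)
class Duration:
    """Hypothetical duration keeping calendar days apart from exact time."""

    days: int = 0                    # calendar part, e.g. "P1D"
    exact: timedelta = timedelta(0)  # elapsed part, e.g. "PT24H"

    def add_to(self, dt: datetime) -> datetime:
        if dt.tzinfo is None:
            raise ValueError("this sketch only handles aware datetimes")
        # Calendar part: wall-clock arithmetic (same clock time, later day)
        shifted = dt + timedelta(days=self.days)
        # Exact part: add elapsed time in UTC, then convert back
        return (shifted.astimezone(timezone.utc) + self.exact).astimezone(dt.tzinfo)


paris = ZoneInfo("Europe/Paris")
start = datetime(2023, 3, 26, tzinfo=paris)

print(Duration(days=1).add_to(start).isoformat())
# 2023-03-27T00:00:00+02:00
print(Duration(exact=timedelta(hours=24)).add_to(start).isoformat())
# 2023-03-27T01:00:00+02:00
```

On this example the two results match the whenever output shown earlier, which a single timedelta cannot do.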

It is clear that an independent timedelta object cannot be fully compatible with an ISO duration, which sometimes depends on its actual start time and timezone.

Nevertheless, I wish Python could create a timedelta object from an ISO8601 string even with those imperfections discussed here. After all, the docs of datetime mention several times that it deals with “idealized” time/date values.

Recently I had to write a similar parser myself. During that work I had to decide how permissive the parser should be, and I discovered that there is also an extension, ISO 8601-2, allowing more flexible time duration strings. Obviously we want all ISO-compliant input strings to be accepted. An interesting question is: if the input is a sane representation of a time period and the parser could handle it, but it is not fully ISO compliant, should it be accepted or rejected?


What would the result of timedelta.isoformat() look like for timedeltas greater than a month? Would it just output days (even if greater than 31), or a combination of months/years and days that could then not be round-tripped back through timedelta.fromisoformat()?

I believe limiting the output of timedelta.isoformat() to a number of days would make most sense if only days can be read.
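As a rough sketch of what days-only output could look like, here is a standalone formatter. `format_iso_duration` is a hypothetical name, and rejecting negative or sub-second values as well as emitting "PT0S" for zero are assumptions of this sketch, not part of the proposal:

```python
from datetime import timedelta


def format_iso_duration(td: timedelta) -> str:
    """Format a timedelta using only D, H, M and S designators (sketch)."""
    if td < timedelta(0) or td.microseconds:
        raise ValueError("negative or sub-second values are not handled here")
    hours, rest = divmod(td.seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    # Days are never folded into months or years, however many there are
    date_part = f"{td.days}D" if td.days else ""
    time_part = "".join(
        f"{value}{unit}"
        for value, unit in ((hours, "H"), (minutes, "M"), (seconds, "S"))
        if value
    )
    if not date_part and not time_part:
        return "PT0S"
    return "P" + date_part + ("T" + time_part if time_part else "")


print(format_iso_duration(timedelta(days=12, hours=4)))  # P12DT4H
print(format_iso_duration(timedelta(days=45)))           # P45D
```

Because the largest unit emitted is the day, every string produced this way stays within the subset that the proposed parser would accept.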

The problem around timedelta and ISO durations is that there are 2 concepts:

  • what they represent conceptually (and as defined in the standard)
  • how they interact with datetime objects using arithmetic

For example:

  • conceptually PT1H represents a fixed block of time (1 hour) and would map correctly to timedelta(hours=1), but for datetime arithmetic this can be wrong, as adding timedelta(hours=1) sometimes adds something else when a DST transition happens (*)
  • conceptually P1D represents a non-fixed amount of time, this makes for a poor mapping to timedelta(days=1), but for datetime arithmetic it happens to work as we would want it: same clock time, next day

(*) If you want an example:

>>> tz = ZoneInfo("America/New_York")
>>> before = datetime(2024, 3, 10, 2, 0, tzinfo=tz)
>>> (before + timedelta(hours=1)).timestamp() - before.timestamp()
0.0

So here are my (current) thoughts on this:

  • you cannot represent ISO durations as timedelta objects (even seconds) and have arithmetic work the way the standard means (fixed time for seconds/hours)
  • you can still represent PT1H, PT1S as timedelta for their concept but you should warn users in the docs about issues with arithmetic
  • only working with PT1H, PT1S would make the proposal weak (IMO), for example it would not even cover the parsing done by pip for --uploaded-prior-to
  • adding P1D would require even more warnings in the docs

So perhaps in the end it would be better to discuss the introduction of a new Duration object, akin to what is done in dateutil.relativedelta.
I was targeting “small but useful” with my proposal, but perhaps a “bigger but correct” approach is better in this case. I may also be biased, and perhaps most users will just not care about subtle DST transition errors.

Isn’t the problem around timedelta and ISO durations simply that they are different concepts?

The timedelta class is a fixed duration, essentially a number of (micro)seconds. Whereas an ISO duration is a relative time offset, which has more complex semantics that depend on the datetime it’s being applied to[1].

This fundamentally means that timedelta and “ISO Duration” are two different classes, with sufficiently different semantics that they simply cannot be converted losslessly.

I have no difficulty with someone creating an “ISO 8601 duration” class. I don’t even object to it being added to the stdlib. But I do object to it being presented as “just another form of datetime”, and for the same reasons I’d object to someone creating functions (or classmethods) that converted ISO 8601 duration strings to timedelta values.

It’s arguable that pip and uv’s choice to use a subset of ISO 8601 format for a timedelta duration in days was a mistake. We did consider using a custom format, or even having extra options that were dedicated to relative durations in days. But practicality won out over purity, and the PnD format was chosen. However, I do not consider that to be evidence that it’s reasonable to try to force the full ISO format into the timedelta model, where it simply doesn’t fit.


  1. I haven’t checked the ISO spec. If someone can show me the part of the spec that explains how to take an ISO duration string, and convert it into a fixed number of seconds offset, I’ll retract my comments here. But I get the impression that isn’t how ISO durations are defined.


Sorry, I missed this final point in your post.

Yes, I think this would be best. And it should be written as a 3rd party library on PyPI while the exact semantics and behaviour are established and battle-tested. Once that’s happened, it might make sense to propose the new class for inclusion in the stdlib.
