This is already opened as BPO 35829 but I wanted to ask about it over here for discussion.
Problem Statement
The function datetime.fromisoformat() parses a datetime in ISO-8601 format:
>>> datetime.fromisoformat('2019-08-28T14:34:25.518993+00:00')
datetime.datetime(2019, 8, 28, 14, 34, 25, 518993, tzinfo=datetime.timezone.utc)
The timezone offset in my example is +00:00, i.e. UTC. The ISO-8601 standard (for which fromisoformat() is presumably named) allows “Z” to be used instead of the zero offset, i.e. 2019-08-28T14:34:25.518993Z. However, fromisoformat() cannot parse this:
>>> datetime.fromisoformat('2019-08-28T14:34:25.518993Z')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: Invalid isoformat string: '2019-08-28T14:34:25.518993Z'
Paul Ganssle (@pganssle) is the maintainer of the dateutil library and has made numerous improvements to the standard library datetime as well. (Thanks, Paul!) Paul suggested that I should post here to brainstorm possible improvements in the API.
The dateutil library does include support for parsing the Z suffix:
>>> from dateutil import parser
>>> parser.isoparse('2019-08-28T14:34:25.518993Z')
datetime.datetime(2019, 8, 28, 14, 34, 25, 518993, tzinfo=tzutc())
This feels like a missing battery in the standard library, especially since other systems may produce dates that end in “Z”. One big example is JavaScript. (You can run this in your browser console right now!)
>> new Date().toISOString()
"2019-08-28T14:34:25.518Z"
If you have a web browser (or a node.js system) that sends you an ISO-8601 date in UTC, then you can’t parse it directly with datetime.fromisoformat().
The obvious workaround (that my colleagues and I have committed to muscle memory at this point) is datetime.fromisoformat(my_date.replace('Z', '+00:00')). This works, but it is verbose for something so common.
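For reference, here is that workaround wrapped in a small helper. This is only a sketch: the name parse_iso_z is made up for illustration, and it rewrites only a trailing Z so that nothing else in the string is touched.

```python
from datetime import datetime, timezone


def parse_iso_z(text: str) -> datetime:
    """Hypothetical helper: like datetime.fromisoformat(), but also accepts a trailing 'Z'."""
    # Rewrite only a trailing "Z"; a blanket str.replace() would touch any "Z" in the string.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)


dt = parse_iso_z("2019-08-28T14:34:25.518993Z")
print(dt)  # 2019-08-28 14:34:25.518993+00:00
```

Strings that already carry a numeric offset pass straight through to fromisoformat() unchanged.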
Rejected Idea
Paul doesn’t want to break the existing contract:
datetime.fromisoformat() is the inverse operation of datetime.isoformat(), which is to say that every valid input to datetime.fromisoformat() is a possible output of datetime.isoformat(), and every possible output of datetime.isoformat() is a valid input to datetime.fromisoformat().
Therefore, if fromisoformat() can parse the Z suffix, then isoformat() will need to emit the Z suffix instead of +00:00, which could create backwards compatibility issues. But then fromisoformat() wouldn’t be able to parse the +00:00 suffix anymore. Therefore, this idea cannot be accepted without breaking the contract.
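To make the contract concrete, here is a minimal round-trip check using only the two existing methods. The sample value is the one from my example above.

```python
from datetime import datetime, timezone

dt = datetime(2019, 8, 28, 14, 34, 25, 518993, tzinfo=timezone.utc)

# isoformat() writes the zero offset as "+00:00", never as "Z".
text = dt.isoformat()
print(text)  # 2019-08-28T14:34:25.518993+00:00

# fromisoformat() must accept exactly what isoformat() produced.
assert datetime.fromisoformat(text) == dt
```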
Proposed Idea
The name fromisoformat()
is a bit unfortunate because it doesn’t handle the full ISO-8601 spec. In fact, the spec is quite broad and covers issues that don’t matter in the datetime
class such as representing dates, times, and intervals. Furthermore, the ISO spec isn’t an open standard as far as I know. (it looks like I would need to pay money to ISO if I wanted a copy to read?)
However, there is a simplified standard that is open: RFC 3339. I suggest adding new methods datetime.rfcformat() and datetime.fromrfcformat() that implement this RFC. As a consequence, this would also allow us to parse dates ending in Z.
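To give a rough idea of what I mean, here is a sketch written as module-level functions rather than real datetime methods. Nothing here exists in the standard library: the function names simply mirror the proposal, and the regex is my own reading of the RFC 3339 date-time grammar, so treat the details as assumptions.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed reading of the RFC 3339 "date-time" rule: full date, "T", full time,
# optional fractional seconds, then "Z" or a numeric +HH:MM / -HH:MM offset.
_RFC3339 = re.compile(
    r"(\d{4})-(\d{2})-(\d{2})[Tt]"
    r"(\d{2}):(\d{2}):(\d{2})(?:\.(\d+))?"
    r"([Zz]|[+-]\d{2}:\d{2})",
    re.ASCII,
)


def fromrfcformat(text: str) -> datetime:
    """Hypothetical stand-in for the proposed datetime.fromrfcformat()."""
    m = _RFC3339.fullmatch(text)
    if m is None:
        raise ValueError(f"Invalid RFC 3339 string: {text!r}")
    year, month, day, hour, minute, second = (int(g) for g in m.group(1, 2, 3, 4, 5, 6))
    # datetime only stores microseconds: pad short fractions, truncate long ones.
    frac = m.group(7) or ""
    microsecond = int(frac[:6].ljust(6, "0")) if frac else 0
    offset = m.group(8)
    if offset in ("Z", "z"):
        tz = timezone.utc
    else:
        sign = -1 if offset[0] == "-" else 1
        tz = timezone(sign * timedelta(hours=int(offset[1:3]), minutes=int(offset[4:6])))
    # Out-of-range fields (and leap second 60) raise ValueError here.
    return datetime(year, month, day, hour, minute, second, microsecond, tzinfo=tz)


def rfcformat(dt: datetime) -> str:
    """Hypothetical stand-in for the proposed datetime.rfcformat()."""
    if dt.utcoffset() is None:
        raise ValueError("RFC 3339 requires an offset; got a naive datetime")
    text = dt.isoformat()
    # Emit the zero offset as "Z"; other offsets keep isoformat()'s +HH:MM form.
    return text[:-6] + "Z" if text.endswith("+00:00") else text


print(fromrfcformat("2019-08-28T14:34:25.518993Z"))
print(rfcformat(datetime(2019, 8, 28, 14, 34, 25, tzinfo=timezone.utc)))
```

Because these would be separate methods, fromisoformat() and isoformat() keep their existing round-trip contract untouched.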
Let me know your thoughts on this issue. Thanks!