PyPy and CPython difference in datetime.fromisoformat

I apologize if this is not the right place for this, but I noticed a difference between PyPy's and CPython's datetime: they raise different exceptions when bad input is passed to datetime.fromisoformat.

I couldn't see where to file an issue for PyPy, and I suspect that PyPy is using CPython's pure-Python version of datetime, so it may be a CPython code issue anyway.

This is tested with 3.9, as that's what I have easy access to and where I found the bug. Here you go:

PyPy:

Python 3.9.16 | packaged by conda-forge | (feeb267e, May 11 2023, 16:55:41)
[PyPy 7.3.11 with GCC Clang 15.0.7] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>> import datetime
>>>> datetime.datetime.fromisoformat("2023")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/chris.barker/miniconda3/envs/pypy/lib/pypy3.9/datetime.py", line 1781, in fromisoformat
    date_components = _parse_isoformat_date(dstr)
  File "/Users/chris.barker/miniconda3/envs/pypy/lib/pypy3.9/datetime.py", line 274, in _parse_isoformat_date
    if dtstr[4] != '-':
IndexError: string index out of range

CPython:

Python 3.9.16 | packaged by conda-forge | (main, Feb  1 2023, 21:42:20) 
[Clang 14.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import datetime
>>> datetime.datetime.fromisoformat("2023")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Invalid isoformat string: '2023'

If someone can point me to the problematic code, I could work on a PR.

In my case, I was catching the ValueError, so PyPy broke my code :frowning:
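One possible workaround for code that must run on both interpreters, as a minimal sketch: wrap the call and convert the stray IndexError into the ValueError that callers already handle. The name parse_iso is hypothetical and not part of the datetime module.

```python
import datetime


def parse_iso(text):
    """Parse an ISO-format string, raising ValueError on any bad input.

    parse_iso is a hypothetical helper name, not part of datetime.
    """
    try:
        return datetime.datetime.fromisoformat(text)
    except IndexError as exc:
        # The pure-Python datetime in 3.9 can leak IndexError for short
        # input; re-raise it as ValueError so callers see one exception type.
        raise ValueError(f"Invalid isoformat string: {text!r}") from exc
```

With this wrapper, parse_iso("2023") should raise ValueError whether the underlying implementation raises ValueError or IndexError.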

Thanks,

-CHB

The documentation doesn’t appear to specify what exception will be raised, so it might be difficult to make the case that this is a bug.

A web search for "pypy bug tracker" leads to "Issues · PyPy / pypy · GitLab", which seems to be what you are after.

The implementation has changed in 3.11; 3.9 and 3.10 are in security maintenance mode, so there’s no fixing this anyway.

Ironically, the IndexError is raised by the very check that decides whether to raise a ValueError: the input is too short for the index that check relies on.

Easy enough to check:

$ mv _datetime.cpython-39-x86_64-linux-gnu.so{,~}
$ python3.9 -c 'import datetime; datetime.datetime.fromisoformat("2023")'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python3.9/datetime.py", line 1762, in fromisoformat
    date_components = _parse_isoformat_date(dstr)
  File "/usr/lib/python3.9/datetime.py", line 268, in _parse_isoformat_date
    if dtstr[4] != '-':
IndexError: string index out of range

So, yep, it’s a bug inherited from CPython stdlib.

I can also reproduce it with CPython 3.10 (after moving away the .so), and I can confirm that CPython 3.11 works fine.

Given @zware's statement, I'd say either file a bug at PyPy's Heptapod, or let me know if I should do that (I have an account). I suppose they may just backport some fixes from 3.11.

@mgorny: If you could file the issue, that would be great. I don’t have an account on that system. (maybe at issue number two I’ll sign up :slight_smile: )

And thanks for confirming that it’s no longer an issue with 3.11 – good to know.

Sure, I filed it on the PyPy issue tracker as #3989: datetime.fromisoformat() throws "internal" IndexError on invalid string instead of ValueError. I'll try to remember to post here once it's resolved.

A much simpler route for PyPy to fix this particular error, rather than backporting 3.11 to 3.9 (which was a rather major refactoring along with some new features), would be to change https://foss.heptapod.net/pypy/pypy/-/blob/branch/py3.9/lib-python/3/datetime.py#L274 from if dtstr[4] != '-': to if dtstr[4:5] != '-': (and similarly for line 279).
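To illustrate why the slice helps, here is a small standalone sketch. It is not the stdlib code; both function names are hypothetical. Indexing past the end of a string raises IndexError, while slicing past the end returns an empty string. Note that the error message in the 3.9 code also uses dtstr[4], so it would presumably need the same slice treatment to avoid raising IndexError while building the message.

```python
def check_separator_by_index(dtstr):
    # Mirrors the shape of the 3.9 check: for a short string, the index
    # raises IndexError before the ValueError can be reached.
    if dtstr[4] != '-':
        raise ValueError('Invalid date separator: %s' % dtstr[4])


def check_separator_by_slice(dtstr):
    # Slicing never raises on a short string; it yields '' instead, so the
    # comparison fails and ValueError is raised as intended. The message
    # uses the slice too, since dtstr[4] would raise IndexError here.
    if dtstr[4:5] != '-':
        raise ValueError('Invalid date separator: %r' % dtstr[4:5])
```

For the input "2023", the first function raises IndexError and the second raises ValueError; both accept "2023-05-11".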

Fixed in PyPy via this diff, which you can apply locally:

 def _parse_isoformat_date(dtstr):
     # It is assumed that this function will only be called with a
     # string of length exactly 10, and (though this is not used) ASCII-only
+    if len(dtstr) < 10:
+        raise ValueError('isoformat expects a string of length 10')
     year = int(dtstr[0:4])
     if dtstr[4] != '-':
         raise ValueError('Invalid date separator: %s' % dtstr[4])

I am working on a new conda release of 3.9, so the fix should show up there in a week or so. Thanks for the clear reproducer.
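For anyone wanting to check whether a given install already has the fix, a quick probe along these lines may help. This is a sketch only; probe_fromisoformat is a hypothetical helper name, not part of datetime.

```python
import datetime


def probe_fromisoformat(text):
    """Return the name of the exception raised for text, or None if it parses.

    probe_fromisoformat is a hypothetical helper, not part of datetime.
    """
    try:
        datetime.datetime.fromisoformat(text)
    except (ValueError, IndexError) as exc:
        return type(exc).__name__
    return None


# Report which exception the running interpreter raises for the reproducer.
print(probe_fromisoformat("2023"))
```

Based on the tracebacks in this thread, an affected build should report IndexError and a fixed one ValueError.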

@mattip: thanks for the fast response!