Good work! One background concern, which may well be ignorable.
```
>>> numpy.array([1j], dtype="D")
array([0.+1.j])
>>> numpy.array([1j], dtype="Zd")
Traceback (most recent call last):
  File "<python-input-10>", line 1, in <module>
TypeError: data type 'Zd' not understood
```
That is, numpy already uses ‘D’ for this purpose, and does not accept ‘Zd’ in this context. But numpy arrays aren’t Python array.arrays, and for all I know heavy numpy users (which I’m not) may well not expect them to work alike in any way.
So if no “numpy people” chime in, by all means ignore this.
And it seems NumPy people are not willing to mitigate incompatibilities.
Give them time.
To be fair, the comment made was “After a first look at this: probably not.”
The CPython developers have had more time to struggle with the choices and implications…
I would not be surprised if, in the process of adding 'Ze' and 'Zbf16' (or similar), NumPy developers decide there’s no harm in numpy.array() accepting dtype='Zf' and dtype='Zd' as well. We shall see. Anyway, Tim’s example of numpy.array([1j], dtype="Zd") not being understood by NumPy can be changed in NumPy if “numpy people” wish it. As Petr pointed out in the SC issue, it’s always possible to accept both values (e.g., both 'Zd' and 'D'), but only one of them can be chosen when providing a format string. I’m very happy to see that everyone has converged on providing the PEP 3118 format strings. Becoming more liberal in what to accept in a user-facing API can happen as needed/desirable.
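To make the "accept both, emit one" point concrete, here is a minimal sketch. The helper name `normalize_complex_code` and the alias table are hypothetical, and picking the 'Z'-prefixed spelling as the canonical output is only an assumption for illustration; this is not NumPy's or CPython's actual code.

```python
# Hypothetical alias table: several accepted input spellings map to one
# canonical output spelling. Illustrative only, not any library's API.
_CANONICAL = {
    "Zf": "Zf", "F": "Zf",  # float complex
    "Zd": "Zd", "D": "Zd",  # double complex
}


def normalize_complex_code(code: str) -> str:
    """Return the single canonical spelling for an accepted type code."""
    try:
        return _CANONICAL[code]
    except KeyError:
        raise TypeError(f"data type {code!r} not understood") from None


print(normalize_complex_code("D"))   # Zd
print(normalize_complex_code("Zd"))  # Zd
```

Input stays liberal (either spelling works), while anything that has to report a format string back only ever produces one of them.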
I’m looking forward to @ngoldbaum’s PEP and to general documentation improvements!
I’m also genuinely impressed that people here argued for their point of view passionately and based on technical merits and that major changes/improvements were made despite the time pressure. In other places, unrelated to Python and which I won’t name, I’ve seen less effective and less professional levels of teamwork. And, perhaps worse, places where nobody cares.
Well, the main issue is the buffer protocol and there we would certainly add support to import/consume D if Python used it for array.array.
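For readers less familiar with why the buffer protocol is the main issue: that is where a consumer actually sees the producer's format string. A small stdlib-only sketch, using the long-standing 'd' typecode rather than a complex one (complex typecodes depend on the Python version, so I'm not assuming them here):

```python
import array

# An array.array exports its data through the buffer protocol; memoryview
# is the simplest stdlib consumer of that export.
a = array.array("d", [1.0, 2.0])
view = memoryview(a)

print(a.typecode)     # d
print(view.format)    # d
print(view.itemsize)  # 8
```

Whatever string shows up in `view.format` is what a library importing the buffer has to understand, independent of which spellings its own user-facing API accepts.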
So do we want to allow np.array([1], dtype="Zd")? Maybe, but I am not convinced yet; the only reason would be that Python uses it and users get really confused by it not working.
Otherwise, np.array([1], dtype="complex128"/np.complex128) is nicer anyway. We have a much more niche short-hand np.zeros(10, dtype="i,D") for structured dtypes that currently only supports the single character codes.
That doesn’t match struct module syntax, though. But sure, if people ask for it, I don’t mind just adding Zd as a supported alias (though I could also see an np.dtype.from_struct() or so to support the identical syntax).
For us it would be adding a 4th (or so) spelling, and transitioning users away from D seems not remotely enticing. So yeah, it is not easy to just say “sure, adding that is clearly a good API addition”.
Which suggests to me that we (CPython) would be best off leaving complex types out of array.array entirely until coordinated PEP/NEPs settle on a coherent vision. I don’t recall people asking for complex types in array.array; it was more a “purity” thing.
Which was news to me! I didn’t realize dtype supported all-but-self-evident names too. That’s what I’d use.
ctypes doesn’t use “type codes”; it uses classes, like ctypes.c_double_complex. It doesn’t accept “type codes” as input, although it exposes them via an instance’s _type_ attribute:
>>> import ctypes
>>> x = ctypes.c_double(3.14)
>>> x
c_double(3.14)
>>> x._type_
'd'
I don’t really care about multiple spellings for the same thing. For a dirt simple example, there are at least 6 ways to spell the ASCII character ‘a’ in a Python string literal:
‘a’
octal escape - ‘\141’
hex escape - ‘\x61’
“short” Unicode escape - ‘\u0061’
“long” Unicode escape - ‘\U00000061’
named Unicode escape - ‘\N{LATIN SMALL LETTER A}’
Such things shouldn’t proliferate “beyond reason”, but backward compatibility and interoperability with major adjacent software are reasons enough.
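A quick check that the six spellings listed above really are the same one-character string:

```python
# Six string literals that all denote the single character 'a'.
spellings = [
    "a",                         # plain literal
    "\141",                      # octal escape
    "\x61",                      # hex escape
    "\u0061",                    # "short" Unicode escape
    "\U00000061",                # "long" Unicode escape
    "\N{LATIN SMALL LETTER A}",  # named Unicode escape
]

assert all(s == "a" for s in spellings)
print(len(spellings), set(spellings))  # 6 {'a'}
```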
Well, alternatively we could add 'F'/'D' as aliases for struct/array. That means still using 'Zf'/'Zd' for array.typecode values and just accepting a different spelling as input. For the struct module, that means removing these types from the table (moving them to notes), but without deprecation.
Though, I would prefer to keep just one variant. Maybe we should wait here for user feedback first.
I hope that Python could better match NumPy conventions.
This might look too conservative now, but I still support this.
Recent changes are useful (at least for me), but much less so than complex types in ctypes. So removing the 3.15 changes (array/memoryview) and deprecating the struct changes from 3.14 seems to be a good, safe option. Maybe PEP authors could find a better way to deal with NumPy incompatibilities.
PS: Meanwhile, I opened the docs issue about the current implementation of PEP 3118 in CPython.