PyUnicode_FromFormat supports %x (lowercase hex) but not %X (uppercase hex). Can it be added?
I asked on core-mentorship why it isn’t included yet and @encukou answered:
I assume it just wasn’t necessary so far?
And I guess having only %x means the messages are more consistent. There
could be a note in the %X docs saying that %x is preferred.
@vstinner, who refactored PyUnicode_FromFormat into its current form, answered:
The PyUnicode_FromFormat() function doesn’t have to be feature
complete: it doesn’t replace a very generic libc printf() function. It
was written to ease writing C extensions for Python.
There is no %X format simply because nobody needed this format, or
developers found other ways to create such strings, without
PyUnicode_FromFormat().
Yup, guess it boils down to “if you need it for a feature, add it” :)
The reason to add it is that one feature, rather than “many cases where it would be useful” – if those cases don’t actually end up in CPython, %X would just be a maintenance burden.
Could you say more about your plans around Unicode code points?
Sure!
Unicode groups ranges of code points into named blocks. For example, the first block is “Basic Latin”, U+0000..U+007F. I’d like to add them to unicodedata. This was already proposed in 2014 in python/cpython issue #66802, “Add block info to unicodedata”, and it seems like it would be accepted. That issue is actually what motivated me to start contributing.
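
To make the idea concrete, here is a rough Python sketch of what a block lookup could look like. It is not the API proposed in the issue: the `block` and `describe` names and the tiny `BLOCKS` table are made up for illustration, and a real implementation would be generated from Unicode's Blocks.txt and live in C. It also shows why uppercase hex matters here, since code points are conventionally written as U+XXXX.

```python
from bisect import bisect_right

# Hypothetical excerpt of the block table: (first, last, name),
# sorted by first code point. The real data has several hundred entries.
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0100, 0x017F, "Latin Extended-A"),
    (0x0370, 0x03FF, "Greek and Coptic"),
]
_STARTS = [first for first, _, _ in BLOCKS]


def block(ch):
    """Return the block name for a one-character string (illustrative only)."""
    cp = ord(ch)
    # Find the last block whose first code point is <= cp.
    i = bisect_right(_STARTS, cp) - 1
    if i >= 0:
        first, last, name = BLOCKS[i]
        if cp <= last:
            return name
    # "No_Block" is the default Unicode uses for unassigned ranges;
    # here it also covers everything missing from the excerpt above.
    return "No_Block"


def describe(first, last, name):
    """Format a block the way Unicode writes ranges, with uppercase hex."""
    return f"{name} U+{first:04X}..U+{last:04X}"


print(block("a"))
print(describe(*BLOCKS[0]))
```

In C, the `describe` part is where a `%04X` in PyUnicode_FromFormat would come in handy.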