PyUnicode_FromFormat add `%X`

PyUnicode_FromFormat supports %x (lowercase hex) but not %X (uppercase hex). Can it be added?

I asked on core-mentorship why it isn’t included yet and @encukou answered:

I assume it just wasn’t necessary so far?
And I guess having only %x means the messages are more consistent. There
could be a note in the %X docs saying that %x is preferred.

@vstinner, who refactored PyUnicode_FromFormat into its current form, answered:

The PyUnicode_FromFormat() function doesn’t have to be feature
complete: it doesn’t replace a very generic libc printf() function. It
was written to ease writing C extensions for Python.

There is no %X format simply because nobody needed this format, or
developers found other ways to create such strings, without

So %X wasn’t intentionally omitted; it just hasn’t been needed so far. Since there are no objections to adding it, I opened a feature request (PyUnicode_FromFormat(): add %X format · Issue #95849 · python/cpython · GitHub) and will submit a patch to add it.
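For anyone who needs uppercase hex before %X lands, one of the “other ways” is to do the formatting with plain C and pass the result through %s, which PyUnicode_FromFormat() already supports. A minimal sketch, assuming a small stack buffer is acceptable; `format_code_point` is a hypothetical helper name, not anything in CPython:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of CPython): writes "U+XXXX" with
   uppercase hex digits into buf. Returns the number of characters
   written, or -1 if the buffer is too small. */
static int
format_code_point(char *buf, size_t size, unsigned int cp)
{
    assert(buf != NULL && size > 0);
    /* snprintf() understands %X even though PyUnicode_FromFormat()
       does not; %04X pads to at least four digits. */
    int n = snprintf(buf, size, "U+%04X", cp);
    if (n < 0 || (size_t)n >= size) {
        return -1;  /* encoding error or truncated output */
    }
    return n;
}

/* In an extension module the buffer can then be handed to %s:

       char buf[16];
       if (format_code_point(buf, sizeof(buf), 0x1F600) > 0) {
           return PyUnicode_FromFormat("code point %s", buf);
       }
*/
```

The extra buffer and the second formatting step are the cost of this workaround, which is what a native %X would remove.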


Yup, guess it boils down to “if you need it for a feature, add it” :‍)
The reason to add it is that one feature, rather than “many cases where it would be useful”: if those cases don’t actually end up in CPython, %X would just be a maintenance burden.

Could you say more about your plans around Unicode code points?

For the last few days I have been working on a large patch which adds many features to PyUnicode_FromFormat(), including %X (also %o, %lx, %jd, %#x, %-8s, %ls, %*.*s, etc.). I’m in no hurry because I want to merge your gh-95504: Fix sign placement in PyUnicode_FromFormat by philg314 · Pull Request #95505 · python/cpython · GitHub first. But keep in mind that %X is only a small (and trivial) part of it.

Sure! :)
Unicode has names for ranges of characters called blocks. For example, the first block is “Basic Latin”, U+0000..U+007F. I’d like to add them to unicodedata; this was already proposed in Add block info to unicodedata · Issue #66802 · python/cpython · GitHub in 2014, and it seems like it would be accepted. That issue is actually what motivated me to start contributing.
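To make the idea concrete, here is a rough sketch of what a block lookup could look like in C. It is not the design from the issue and not an existing unicodedata API: `UnicodeBlock`, `BLOCKS`, and `find_block` are hypothetical names, and the table holds only the first four blocks rather than the full list from the Unicode data files.

```c
#include <stddef.h>

/* Hypothetical types and names, for illustration only. */
typedef struct {
    unsigned int first;   /* first code point in the block */
    unsigned int last;    /* last code point in the block (inclusive) */
    const char *name;
} UnicodeBlock;

/* Only the first four blocks; the table must stay sorted by range. */
static const UnicodeBlock BLOCKS[] = {
    {0x0000, 0x007F, "Basic Latin"},
    {0x0080, 0x00FF, "Latin-1 Supplement"},
    {0x0100, 0x017F, "Latin Extended-A"},
    {0x0180, 0x024F, "Latin Extended-B"},
};

/* Binary search over the sorted, non-overlapping ranges.
   Returns NULL when cp is not covered by the table. */
static const UnicodeBlock *
find_block(unsigned int cp)
{
    size_t lo = 0;
    size_t hi = sizeof(BLOCKS) / sizeof(BLOCKS[0]);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (cp < BLOCKS[mid].first) {
            hi = mid;
        }
        else if (cp > BLOCKS[mid].last) {
            lo = mid + 1;
        }
        else {
            return &BLOCKS[mid];
        }
    }
    return NULL;
}
```

Block boundaries are conventionally written in uppercase hex (U+0000..U+007F), which is where %X would come in when formatting them.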

Sounds good! I won’t submit a patch then. The fix for negative numbers was merged today in gh-95504: Fix negative numbers in PyUnicode_FromFormat by encukou · Pull Request #95848 · python/cpython · GitHub.