PyUnicode_FromFormat add `%X`

philg · August 2, 2022, 4:10am

PyUnicode_FromFormat supports %x (lowercase hex) but not %X (uppercase hex). Can it be added?

I asked on core-mentorship why it isn’t included yet and @encukou answered:

I assume it just wasn’t necessary so far?
And I guess having only %x means the messages are more consistent. There
could be a note in the %X docs saying that %x is preferred.

philg · August 10, 2022, 9:11am

@vstinner, who refactored PyUnicode_FromFormat into its current form, answered:

The PyUnicode_FromFormat() function doesn’t have to be feature
complete: it doesn’t replace a very generic libc printf() function. It
was written to ease writing C extensions for Python.

There is no %X format simply because nobody needed this format, or
developers found other ways to create such strings, without
PyUnicode_FromFormat().

So %X wasn’t intentionally omitted; it just hasn’t been needed so far. Since there are no objections to adding it, I opened a feature request (PyUnicode_FromFormat(): add %X format · Issue #95849 · python/cpython · GitHub) and will submit a patch to add it.
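
Until %X is available, one of the "other ways" mentioned above is to format the number with the C library first and pass the result on as a string. A minimal sketch, assuming a hypothetical helper name `format_upper_hex` (it is not part of CPython; only `snprintf` is standard):

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not a CPython API): write `value` as uppercase
 * hexadecimal into `buf` using the C library's own %X conversion.
 * Returns 0 on success, -1 if the buffer is too small or on error. */
static int
format_upper_hex(char *buf, size_t size, unsigned int value)
{
    int n = snprintf(buf, size, "%X", value);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

In an extension module the buffer could then go through the existing %s format, for example `PyUnicode_FromFormat("0x%s", buf)`, at the cost of an extra temporary buffer.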

encukou · August 10, 2022, 9:37am

Yup, guess it boils down to “if you need it for a feature, add it” :)
The reason to add it is that one feature, rather than “many cases where it would be useful” – if those cases don’t actually end up in CPython, %X would just be a maintenance burden.

Could you say more about your plans around Unicode code points?

storchaka · August 10, 2022, 10:50am

For the last few days I have been working on a large patch which adds many features to PyUnicode_FromFormat(), including %X (also %o, %lx, %jd, %#x, %-8s, %ls, %*.*s, etc). I’m in no hurry because I want to merge your gh-95504: Fix sign placement in PyUnicode_FromFormat by philg314 · Pull Request #95505 · python/cpython · GitHub first. But keep in mind that %X is only a small (and trivial) part of it.

philg · August 10, 2022, 2:44pm

Sure!
Unicode has names for ranges of characters called blocks. For example, the first block is “Basic Latin”, U+0000..U+007F. I’d like to add them to unicodedata; this was already proposed in Add block info to unicodedata · Issue #66802 · python/cpython · GitHub in 2014, and it seems like it would be accepted. That issue is actually what motivated me to start contributing.
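
As a rough illustration of how a block lookup and an uppercase U+XXXX label might fit together, here is a sketch in plain C. The three-entry table, the struct, and the helper names `block_name` and `describe_code_point` are assumptions made for this example, not the API proposed for unicodedata; a real table would be generated from the Unicode Character Database.

```c
#include <stdio.h>
#include <stddef.h>

/* Illustrative excerpt only: the first three Unicode blocks. */
struct block {
    unsigned int first;
    unsigned int last;
    const char *name;
};

static const struct block BLOCKS[] = {
    {0x0000, 0x007F, "Basic Latin"},
    {0x0080, 0x00FF, "Latin-1 Supplement"},
    {0x0100, 0x017F, "Latin Extended-A"},
};

/* Binary search over the sorted, non-overlapping ranges above.
 * Returns "No_Block" for anything outside this excerpt. */
static const char *
block_name(unsigned int cp)
{
    size_t lo = 0, hi = sizeof BLOCKS / sizeof BLOCKS[0];
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (cp < BLOCKS[mid].first) {
            hi = mid;
        }
        else if (cp > BLOCKS[mid].last) {
            lo = mid + 1;
        }
        else {
            return BLOCKS[mid].name;
        }
    }
    return "No_Block";
}

/* Write "U+XXXX (block name)" into buf, using the C library's %X.
 * Returns 0 on success, -1 if the buffer is too small or on error. */
static int
describe_code_point(char *buf, size_t size, unsigned int cp)
{
    int n = snprintf(buf, size, "U+%04X (%s)", cp, block_name(cp));
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

With this excerpt, `describe_code_point(buf, sizeof buf, 0xE9)` fills `buf` with `U+00E9 (Latin-1 Supplement)`. The conventional uppercase U+XXXX spelling is the kind of output where %X is the natural conversion.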

Sounds good! I won’t submit a patch then. The fix for negative numbers was merged today in gh-95504: Fix negative numbers in PyUnicode_FromFormat by encukou · Pull Request #95848 · python/cpython · GitHub.
