I checked how DuckDB uses PyUnicode_4BYTE_DATA(). They use it to create a Unicode instance, not to read from one.
They could use PyUnicode_FromStringAndSize() instead, but they don't because it is slow.
Maybe we need to check whether PyUnicode_FromStringAndSize() is really slower than their code, and why. (UnicodeWriter? Checking for lone surrogates?)
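
To make the lone surrogate question concrete, here is a minimal sketch that calls PyUnicode_FromStringAndSize() from Python through ctypes. It assumes CPython, and the `decode` helper is a hypothetical name of mine, not anything from DuckDB. It only shows that the function decodes its input as strict UTF-8 and rejects an encoded lone surrogate. It does not measure speed, since the ctypes call overhead would swamp any real difference.

```python
import ctypes

# Bind the C API function via ctypes.pythonapi (CPython only).
from_string_and_size = ctypes.pythonapi.PyUnicode_FromStringAndSize
from_string_and_size.argtypes = [ctypes.c_char_p, ctypes.c_ssize_t]
from_string_and_size.restype = ctypes.py_object


def decode(data: bytes) -> str:
    """Hypothetical helper: build a str from UTF-8 bytes through the C API."""
    return from_string_and_size(data, len(data))


# Valid UTF-8 is decoded into a str object.
print(decode("héllo €".encode("utf-8")))

# U+D800 written in UTF-8 style is not valid UTF-8, so the strict
# decoder raises instead of producing a str with a lone surrogate.
try:
    decode(b"\xed\xa0\x80")
except UnicodeDecodeError as exc:
    print("rejected:", exc.reason)
```

So at least part of the cost is validating and decoding the input, work that code filling a PyUnicode_New() buffer directly can skip when it already knows the code points. Whether that accounts for the whole difference would still need a real benchmark in C.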