The docs for the struct module state that the conversion between C and Python values should be obvious given their types. This is not quite true for the "?" format, i.e. native _Bool.
I find the docs confusing (which may be due to references to the C99 standard that I don’t have access to). I’d like to make them clearer (and possibly adjust the implementation to match), but for that I need a solid understanding of the intent.
(This was previously discussed in bpo-39689, whose scope is only fixing tests that broke with clang 10.)
In one place the docs say:
'?' conversion code corresponds to the _Bool type defined by C99. If this type is not available, it is simulated using a
As far as I know, the C99 _Bool only has two valid values, 0 and 1. Something like char c=2; _Bool b=*(_Bool*)(&c); is undefined behavior. For struct, this could mean that b'\x02' is an incorrectly packed ? struct and anyone unpacking it should expect undefined behavior. This is what the current implementation does.
Until recently, all tested compilers used the same semantics as below in this case, but that is changing with Clang 10.
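To see what a given build actually does with such a byte, a small probe like the following can help. It is only a sketch: the variable names are mine, and the native result is printed rather than asserted, since that result is exactly what is in question. The standard-mode ("<?") line is there for comparison, on the assumption that standard mode always uses one byte.

```python
import struct

# Native mode ("?" with no prefix) maps to the platform's C _Bool,
# so ask struct how wide that is instead of assuming one byte.
native_size = struct.calcsize("?")

# A buffer of the right width whose first byte is 2, i.e. neither 0 nor 1.
odd_buffer = b"\x02" + b"\x00" * (native_size - 1)

# Standard mode is always one byte wide.
standard_result = struct.unpack("<?", b"\x02")[0]

# Native mode: the value depends on the implementation under discussion.
native_result = struct.unpack("?", odd_buffer)[0]

print("native size of '?':", native_size)
print("standard unpack of 0x02:", standard_result)
print("native unpack of 0x02:", native_result)
```

Running this on builds made with different compilers is one way to compare their behavior without reading the C source.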
Elsewhere the struct docs say:
'?' format character, the return value is either
False. When packing, the truth value of the argument object is used. Either 0 or 1 in the native or standard bool representation will be packed, and any non-zero value will be
This is spelled out quite clearly, but may be read as contradicting the above quote.
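The packing half of that quote is easy to check from Python. A minimal sketch, using standard mode ("<?") so that the width is one byte; the sample values and the Falsy class are just illustrations I made up:

```python
import struct


class Falsy:
    """Hypothetical object whose truth value is False."""

    def __bool__(self):
        return False


# Objects of assorted types; only their truth value should matter.
samples = [0, 1, 2, -1, "", "x", [], [0], None, Falsy()]

# Map each sample's repr-like label to the byte that gets packed.
packed = [(type(v).__name__, bool(v), struct.pack("<?", v)) for v in samples]

for type_name, truth, data in packed:
    print(f"{type_name:8} truth={truth!s:5} packed={data!r}")
```

Every packed byte here should be either b'\x00' or b'\x01', following the truth value of the argument, which matches the "truth value of the argument object is used" wording.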
It may also be non-trivial to implement correctly, as Stefan Krah mentions in a bpo-39689 comment:
You could determine sizeof(_Bool), use the matching unsigned type,
unpack as that, then cast to _Bool. But do you really want to force
that procedure on all array libraries that want to be PEP-3118
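For illustration, here is a rough Python-level analogue of the procedure described in that comment: find the size of the native bool, read the bytes as an unsigned integer of the same size, then reduce that to a truth value. This is not how CPython or any array library implements it. The function name and the size-to-format table are my own assumptions, and the "=" prefix is used so that the integer codes have their standard sizes.

```python
import struct

# Unsigned integer format codes by standard size (assumed to cover
# every plausible sizeof(_Bool)).
_UNSIGNED_BY_SIZE = {1: "B", 2: "H", 4: "I", 8: "Q"}


def unpack_bool_defensively(data: bytes) -> bool:
    """Read a native-width bool without interpreting the bytes as _Bool."""
    size = struct.calcsize("?")          # stand-in for sizeof(_Bool)
    code = _UNSIGNED_BY_SIZE[size]       # matching unsigned type
    (raw,) = struct.unpack("=" + code, data)  # unpack as that type
    return raw != 0                      # "cast" to bool: non-zero is True


size = struct.calcsize("?")
print(unpack_bool_defensively(b"\x00" * size))
print(unpack_bool_defensively(b"\x02" * size))
```

With this approach any non-zero bit pattern comes out as True, which corresponds to the "defined to be True" reading rather than the undefined-behavior one.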
So, I see three possibilities for unpacking a byte such as b'\x02' with the ? format:
- it triggers undefined behavior: in practice it gives True with some compilers and False with others
- it is an incorrectly packed array: CPython will helpfully always give True to avoid UB, but other libraries are free to do anything
- it is defined to be True
And two possibilities for struct.pack("?", x), which are equivalent in practice but I don’t know if C99 guarantees it:
So the questions are: What do the docs mean? What do implementers of PEP-3118-compatible libraries think they mean? And of course, what should be done?