I need to exchange data between my Python code and an API using a 21-byte struct made of bit fields, where one of the fields is 64 bits wide and starts in the middle of a byte. Is there any way to achieve this with ctypes.Structure? If not, I’d like to suggest adding support for it, since someone else may need it too. I have looked at the documentation and it says bit fields only work with c_int, which makes me think 64-bit fields are not supported, but I may be missing something.
Below I give more details on the behavior I need and on what I tried with ctypes.Structure. At the end, I include some Python code I wrote (which I guess only matches the layout of little-endian architectures) to talk to the API without ctypes.Structure.
The struct looks like this (in C++):
#pragma pack(1)
struct BitfieldsStruct {
    uint8_t a : 4;
    uint8_t b : 4;
    bool c : 1;
    uint8_t d : 4;
    uint64_t e : 64;
    uint64_t f : 64;
    uint8_t g : 8;
    uint8_t h : 8;
    uint8_t i : 8;
    uint8_t j : 3;
};
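To make the target layout explicit, here is a small sketch that adds up the declared widths and reports where each field would start. It assumes the fields are packed back to back with no padding bits at all, which is the behavior I need rather than something ctypes promises; FIELDS and bit_layout are names made up for this illustration.

```python
# Field widths copied from the C++ struct above, in declaration order.
FIELDS = [("a", 4), ("b", 4), ("c", 1), ("d", 4), ("e", 64),
          ("f", 64), ("g", 8), ("h", 8), ("i", 8), ("j", 3)]


def bit_layout(fields):
    """Return (layout, total_bits) for back-to-back packing.

    Each layout entry is (name, first_bit, byte_index, bit_in_byte).
    No padding is assumed anywhere (an assumption about the target
    layout, not a statement about what ctypes does).
    """
    layout = []
    position = 0
    for name, width in fields:
        layout.append((name, position, position // 8, position % 8))
        position += width
    return layout, position


layout, total_bits = bit_layout(FIELDS)
for name, first_bit, byte_index, bit_in_byte in layout:
    print(f"{name}: starts at bit {first_bit} (byte {byte_index}, bit {bit_in_byte})")
print("total:", total_bits, "bits =", total_bits // 8, "bytes")
```

With these widths, e starts at bit 13 (byte 1, bit 5) and the total is 168 bits, which is exactly 21 bytes.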
The C++ client communicates flawlessly with the API, but I also need a Python client, and I haven’t been able to build one using ctypes.Structure. I have tried setting _pack_ to 1 and also tried different combinations of field types (changing most of them to c_int, or only the problematic ones). My first attempt looks like this:
import ctypes

class BitfieldsStruct(ctypes.Structure):
    _pack_ = 1
    _fields_ = [
        ("a", ctypes.c_uint8, 4),
        ("b", ctypes.c_uint8, 4),
        ("c", ctypes.c_bool, 1),
        ("d", ctypes.c_uint8, 4),
        ("e", ctypes.c_uint64, 64),
        ("f", ctypes.c_uint64, 64),
        ("g", ctypes.c_uint8, 8),
        ("h", ctypes.c_uint8, 8),
        ("i", ctypes.c_uint8, 8),
        ("j", ctypes.c_uint8, 3),
    ]
I also tried changing to c_int all of the fields; only ‘c’; only ‘e’; ‘e’ and ‘f’; and ‘c’, ‘d’, ‘e’ and ‘f’, with no improvement. What I need is a 21-byte struct with no gaps (not even single-bit gaps). The closest I got was a 22-byte struct with a three-bit gap, with every bit after the gap shifted accordingly. Here is some C++ code to print the struct filled with ones:
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <string>

#pragma pack(1)
struct BitfieldsStruct {
    uint8_t a : 4;
    uint8_t b : 4;
    bool c : 1;
    uint8_t d : 4;
    uint64_t e : 64;
    uint64_t f : 64;
    uint8_t g : 8;
    uint8_t h : 8;
    uint8_t i : 8;
    uint8_t j : 3;
};

// Overlay the struct with a byte array so the raw encoding can be printed.
union GeneralBS {
    BitfieldsStruct decoded;
    std::byte encoded[sizeof(BitfieldsStruct)];
};

int main() {
    GeneralBS exp;
    // Set every bit of every field.
    exp.decoded.a = ~0;
    exp.decoded.b = ~0;
    exp.decoded.c = true;
    exp.decoded.d = ~0;
    exp.decoded.e = ~0ll;
    exp.decoded.f = ~0ll;
    exp.decoded.g = ~0;
    exp.decoded.h = ~0;
    exp.decoded.i = ~0;
    exp.decoded.j = ~0;
    std::string outp;
    std::cout << "size " << sizeof(BitfieldsStruct) << std::endl;
    for (std::size_t i = 0; i < sizeof(BitfieldsStruct); i++) {
        auto c = (unsigned int) exp.encoded[i];
        // Print the bits of each byte, most significant bit first.
        for (int j = 0; j < 8; j++) {
            outp += ((1u << (8 - j - 1)) & c) ? '1' : '0';
        }
        outp += ' ';
    }
    std::cout << outp << std::endl;
}
And this is its output:
size 21
11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111111 11111111
Now here is the closest I got in Python:
import ctypes

class BitfieldsStruct(ctypes.Structure):
    _pack_ = 1
    _fields_ = [
        ("a", ctypes.c_uint8, 4),
        ("b", ctypes.c_uint8, 4),
        ("c", ctypes.c_bool, 1),
        ("d", ctypes.c_uint8, 4),
        ("e", ctypes.c_uint64, 64),
        ("f", ctypes.c_uint64, 64),
        ("g", ctypes.c_uint8, 8),
        ("h", ctypes.c_uint8, 8),
        ("i", ctypes.c_uint8, 8),
        ("j", ctypes.c_uint8, 3),
    ]

if __name__ == '__main__':
    pack = BitfieldsStruct()
    pack.a = ~0
    pack.b = ~0
    pack.c = True
    pack.d = ~0
    pack.e = 0xFFFFFFFFFFFFFFFF
    pack.f = 0xFFFFFFFFFFFFFFFF
    pack.g = ~0
    pack.h = ~0
    pack.i = ~0
    pack.j = ~0
    res = bytes(pack)
    print("size", len(res))
    print(" ".join(bin(b) for b in res))
And its output:
size 22
0b11111111 0b11111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b111
The result has a 3-bit gap in the second byte.
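One way to see where ctypes placed each field is to read the field descriptors on the class. The sketch below relies on my understanding that, in CPython, the descriptor's size attribute for a bit field packs the bit width into the high 16 bits and the bit offset within the storage unit into the low 16 bits; describe is a helper name made up here. The layout it prints may differ between platforms and Python versions, so I am not listing an expected output.

```python
import ctypes


class BitfieldsStruct(ctypes.Structure):
    _pack_ = 1
    _fields_ = [
        ("a", ctypes.c_uint8, 4),
        ("b", ctypes.c_uint8, 4),
        ("c", ctypes.c_bool, 1),
        ("d", ctypes.c_uint8, 4),
        ("e", ctypes.c_uint64, 64),
        ("f", ctypes.c_uint64, 64),
        ("g", ctypes.c_uint8, 8),
        ("h", ctypes.c_uint8, 8),
        ("i", ctypes.c_uint8, 8),
        ("j", ctypes.c_uint8, 3),
    ]


def describe(struct_type):
    """Return (name, byte_offset, bit_offset, bit_width) for each bit field.

    Assumption: for bit fields, descriptor.size == (width << 16) | bit_offset.
    """
    rows = []
    for name, _ctype, width in struct_type._fields_:
        descriptor = getattr(struct_type, name)
        bit_offset = descriptor.size & 0xFFFF
        rows.append((name, descriptor.offset, bit_offset, width))
    return rows


rows = describe(BitfieldsStruct)
print("sizeof:", ctypes.sizeof(BitfieldsStruct))
for name, byte_offset, bit_offset, width in rows:
    print(f"{name}: byte {byte_offset}, bit {bit_offset}, width {width}")
```

If e is reported at a byte offset with bit offset 0, that would be consistent with the gap above: ctypes started e in a fresh storage unit instead of continuing in the bits left over after d.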
Here is the Python code I wrote as a workaround:
from typing import Dict, List, Tuple


class PackedBitfields:
    """Packs the fields in _fields_ back to back, least significant bit first."""

    _fields_: List[Tuple[str, int]]
    _bytemask = (1 << 8) - 1
    _field_sizes_: Dict[str, int]

    def __init__(self):
        # Keep the name -> width map per instance; a class-level dict would
        # be shared by every subclass.
        super().__setattr__("_field_sizes_", {})
        self.__setattr__("_fields_", self._fields_)

    def __setattr__(self, attrname, value) -> None:
        if attrname == "_fields_":
            # Record every field width, then zero-initialize the fields.
            for field in value:
                self._field_sizes_[field[0]] = field[1]
            for field in value:
                self.__setattr__(field[0], 0)
            super().__setattr__(attrname, value)
        elif attrname in self._field_sizes_:
            # Truncate the value to the declared width of the field.
            fs = self._field_sizes_[attrname]
            fieldmask = (1 << fs) - 1
            maskedval = value & fieldmask
            super().__setattr__(attrname, maskedval)
        else:
            super().__setattr__(attrname, value)

    def __bytes__(self) -> bytes:
        currbyte_rem = 8  # free bits left in the byte being assembled
        currbyte = 0
        res: List[int] = []
        for field in self._fields_:
            remval = getattr(self, field[0])
            field_remsize = field[1]
            while field_remsize > 0:
                if currbyte_rem <= field_remsize:
                    # The field fills the rest of the current byte: emit it.
                    curropmask = self._bytemask >> (8 - currbyte_rem)
                    maskedval = (remval & curropmask) << (8 - currbyte_rem)
                    currbyte |= maskedval
                    res.append(currbyte)
                    field_remsize -= currbyte_rem
                    remval >>= currbyte_rem
                    currbyte_rem = 8
                    currbyte = 0
                else:
                    # The rest of the field fits with room to spare.
                    curropmask = self._bytemask >> (8 - field_remsize)
                    maskedval = (remval & curropmask) << (8 - currbyte_rem)
                    currbyte |= maskedval
                    currbyte_rem -= field_remsize
                    field_remsize = 0
                    remval = 0
        if currbyte_rem < 8:
            # Flush a partially filled last byte.
            res.append(currbyte)
        return bytes(res)


class BFStruct(PackedBitfields):
    _fields_ = [
        ("a", 4),
        ("b", 4),
        ("c", 1),
        ("d", 4),
        ("e", 64),
        ("f", 64),
        ("g", 8),
        ("h", 8),
        ("i", 8),
        ("j", 3),
    ]
And its output (when plugged into the example code above in place of BitfieldsStruct):
size 21
0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111 0b11111111
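The same packing can also be sketched more compactly by accumulating all fields into one Python integer and converting it with int.to_bytes, which also makes decoding a reply straightforward. This is only a sketch under the same assumption as above (fields packed back to back, least significant bit first, bytes in little-endian order); FIELDS, pack_fields and unpack_fields are names invented for this example, and as far as I can tell it yields the same bytes as the class above.

```python
# Field widths in declaration order, copied from the struct.
FIELDS = [("a", 4), ("b", 4), ("c", 1), ("d", 4), ("e", 64),
          ("f", 64), ("g", 8), ("h", 8), ("i", 8), ("j", 3)]


def pack_fields(values, fields=FIELDS):
    """Pack a dict of field values into bytes, LSB first, with no padding."""
    combined = 0
    position = 0
    for name, width in fields:
        mask = (1 << width) - 1
        # Missing fields default to 0; values are truncated to the width.
        combined |= (int(values.get(name, 0)) & mask) << position
        position += width
    return combined.to_bytes((position + 7) // 8, "little")


def unpack_fields(data, fields=FIELDS):
    """Inverse of pack_fields: split the bytes back into a dict of ints."""
    combined = int.from_bytes(data, "little")
    values = {}
    position = 0
    for name, width in fields:
        values[name] = (combined >> position) & ((1 << width) - 1)
        position += width
    return values


all_ones = pack_fields({name: -1 for name, _ in FIELDS})
print("size", len(all_ones))
print(" ".join(bin(b) for b in all_ones))

sample = {"a": 3, "d": 9, "e": 0x0123456789ABCDEF, "j": 5}
decoded = unpack_fields(pack_fields(sample))
print(decoded)
```

Filling every field with ones gives 21 bytes of 0b11111111, and the round trip through unpack_fields returns the values that were packed.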
I’m using Ubuntu 22.04 (amd64), Python 3.8.18, and g++ 11.4.0.