I am trying to set up a callback function for custom file reading access with a C library mapped to Python using ctypes.
The callback basically receives position and number of bytes to read, and an empty string buffer (unsigned char *) that the data shall be written into.
My question is: How can I get the read data (which are Python bytes) into the buffer passed with the callback?
So far, I’ve got the following code:
    class _reader_class:
        def __init__(self, buffer):
            self.buffer = buffer

        def __call__(self, param, position, p_buf, size):
            self.buffer.seek(position)
            data = self.buffer.read(size)
            print(position, size, data)
            # TODO assign data to p_buf
            return 1
(A reader object is initialised using _reader_class(buffer), which is then wrapped with ctypes.CFUNCTYPE(...).)
I tried to use ctypes.memmove() as suggested in several forums:
I would need the C declaration of the callback and the ctypes.CFUNCTYPE(...) definition to be certain of anything here.
In general, there’s nothing wrong with using memmove() if the source address, destination address, and size are all correct. One problem I see is that self.buffer.read(size) may read fewer than size bytes. Also, since the type of data must be bytes, you don’t need create_string_buffer(). It’s an unnecessary copy of the data. The call should be ctypes.memmove(p_buf, data, len(data)).
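As a minimal sketch of that call inside the callback: the prototype below (named READ_CB here) is only an assumption, since the real C declaration has not been posted, and the Reader class and the return value of 1 simply mirror the code in the question. The last lines simulate the C library by calling the wrapped callback from Python with a ctypes array as the destination.

```python
import ctypes
import io

# Hypothetical prototype; the real one must match the C declaration.
READ_CB = ctypes.CFUNCTYPE(
    ctypes.c_int,                    # return value
    ctypes.c_void_p,                 # param
    ctypes.c_size_t,                 # position
    ctypes.POINTER(ctypes.c_ubyte),  # p_buf
    ctypes.c_size_t,                 # size
)


class Reader:
    def __init__(self, buffer):
        self.buffer = buffer

    def __call__(self, param, position, p_buf, size):
        self.buffer.seek(position)
        data = self.buffer.read(size)
        # Copy only what was actually read; read() may return fewer bytes.
        ctypes.memmove(p_buf, data, len(data))
        return 1  # the meaning of the return value depends on the library


reader = Reader(io.BytesIO(b"0123456789"))
callback = READ_CB(reader)  # keep a reference while C code may call it

# Simulate the C side: a 4-byte destination buffer, read from position 2.
dest = (ctypes.c_ubyte * 4)()
status = callback(None, 2, dest, 4)
```

Note that the bytes object is passed to memmove() directly, with no create_string_buffer() in between, and that the copy length is len(data) rather than size.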
That said, it’s inefficient to read the data as a bytes object just to copy it to the destination buffer. If self.buffer supports the readinto() method, you can avoid the copy by creating a ctypes array that references the destination buffer. For example, if p_buf is the address of the destination buffer:
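A minimal sketch of this approach, assuming p_buf arrives as a plain integer address (for example, because the parameter is declared as ctypes.c_void_p). The helper name read_into_destination and the stand-in buffer are illustrative only and are not part of the original code.

```python
import ctypes
import io


def read_into_destination(stream, position, p_buf, size):
    """Read `size` bytes at `position` straight into the memory at address `p_buf`."""
    # Map an array of `size` unsigned bytes onto the existing memory (no copy).
    dest = (ctypes.c_ubyte * size).from_address(p_buf)
    stream.seek(position)
    count = stream.readinto(dest)
    # Catch short reads early; this check is skipped when Python runs with -O.
    assert count == size, "short read"
    return count


source = io.BytesIO(b"abcdefghij")
target = ctypes.create_string_buffer(4)  # stands in for the C library's buffer
count = read_into_destination(source, 3, ctypes.addressof(target), 4)
```

The array created by from_address() does not own the memory, so it must not be used after the C library has released the buffer.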
That’s unexpected. Maybe there aren’t size bytes available from position to the end of self.buffer. But maybe it’s text data, and it zeroes the buffer beforehand to handle the result as a null-terminated string. Anyway, you’d see a system error in that case due to the added assertion, assuming you’re running a debug build (i.e. __debug__ is true) instead of a release build. (Python calls the latter an ‘optimized’ build. This mixed-up terminology is confusing. Optimization is separate from compiling the debug or release version of a program.)
I don’t see a reason for the segfault due to ctypes.memmove(p_buf, data, len(data)). Enabling the faulthandler module (i.e. -X faulthandler) might help narrow it down, but you’ll probably need to use a native debugger to diagnose the problem, such as gdb in Linux or WinDbg in Windows.
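If passing -X faulthandler on the command line is inconvenient, the same handler can be enabled from code at the top of the script. This is only a sketch of the standard-library call; it prints a Python traceback on a fatal signal but does not replace a native debugger.

```python
import faulthandler

# Dump the Python traceback of all threads to stderr on a fatal signal
# such as SIGSEGV, before the process dies.
faulthandler.enable(all_threads=True)
```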
For the case with ctypes.addressof(), I had stipulated that p_buf was the address of the buffer, but I hadn’t seen your ctypes definition yet. Since p_buf is an instance of ctypes.POINTER(ctypes.c_ubyte), the address of the buffer is ctypes.addressof(p_buf.contents). You mistakenly used the address of the pointer itself, ctypes.addressof(p_buf).
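The difference can be checked in isolation. In this sketch, data is a stand-in for the C library's buffer and p_buf is built with ctypes.cast() so that it has the same type the callback receives; both names are illustrative.

```python
import ctypes

data = (ctypes.c_ubyte * 4)(1, 2, 3, 4)  # stands in for the C buffer
p_buf = ctypes.cast(data, ctypes.POINTER(ctypes.c_ubyte))

buffer_address = ctypes.addressof(p_buf.contents)  # where the bytes live
pointer_address = ctypes.addressof(p_buf)          # where the pointer object itself lives

# An array mapped at the buffer address writes through to the real buffer.
view = (ctypes.c_ubyte * 4).from_address(buffer_address)
view[0] = 99
```

Mapping an array at pointer_address instead would overwrite the pointer object's own storage, which is the kind of memory corruption that can end in a segfault.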