The PR doesn’t quite explain what the buf argument is. That can change of course, but as I said, I find it rather hard to explain.
I’m not convinced that’s worth the performance hit for loops that write a byte at a time.
My preference is for passing the length explicitly when desired, not a pointer.
For example:
PyAPI_FUNC(PyObject*) PyBytesWriter_Finish(
PyBytesWriter *writer);
PyAPI_FUNC(PyObject*) PyBytesWriter_FinishWithSize(
PyBytesWriter *writer, Py_ssize_t size);
We’ve had similar APIs for years in Arrow C++ and they’ve served us well.
(note: in most cases you would let the PyBytesWriter
track the length by itself)
I understand that you want to create an API with great writing performance, but also agree with @steve.dower that we should not create a second approach to the writer concept in Python.
Here’s a version of your example using a new PyBytesWriter_WriteFillByte()
which mimics memset().
static PyObject *
byteswriter_center_example(Py_ssize_t spaces, char *str, Py_ssize_t str_size)
{
PyBytesWriter *writer = PyBytesWriter_Create(spaces * 2);
/* Write left spaces */
PyBytesWriter_WriteFillByte(writer, ' ', spaces);
/* Make room for str */
if (PyBytesWriter_Extend(writer, str_size)) {
goto error;
}
/* Write str */
PyBytesWriter_WriteBytes(writer, str, str_size);
/* Write right spaces */
PyBytesWriter_WriteFillByte(writer, ' ', spaces);
return PyBytesWriter_Finish(writer);
error:
PyBytesWriter_Discard(writer);
return NULL;
}
This doesn’t need a buf
variable at all and will still be fast.
It may also be useful to have an iterator style interface:
PyBytesWriter_WriteBytesFromIterator(writer, iter)
which takes an iterator and reads all data from the iterator into the bytes object.
A standard Python iterator would work, but may be too inefficient at the C level. Perhaps we could have a simple C level iterator API for such things or just a callback that returns a buffer and a size to copy, setting the buffer to NULL when its done (or using an integer return value for this).
My example was the most simple example possible, to make it short and easy to understand. Your proposed API seems to be too specific to be usable in the general case.
In C, passing a callback or an “iterator” is very limited. See the implementation of bytes[start:end:step]
:
source_buf = PyBytes_AS_STRING(self);
result = PyBytes_FromStringAndSize(NULL, slicelength);
if (result == NULL)
return NULL;
result_buf = PyBytes_AS_STRING(result);
for (cur = start, i = 0; i < slicelength;
cur += step, i++) {
result_buf[i] = source_buf[cur];
}
return result;
In C, there is no easy way to “pass an iterator” or “pass a generator” to replace this for
loop.
That’s why my API provides a buf
pointer that you can use to write directly into the buffer. It’s the most generic way to support any use case.
I’m not sure I follow. The API I was sketching closely follows that of the PyUnicodeWriter
in that you provide functions to write various things to the buffer, without actually exposing the buffer.
I am aware that C doesn’t have the concept of an iterator. I was actually talking about passing in a Python iterator and using this to fetch the data, or, as an alternative at the C level, a callback which gets called by the PyBytesWriter_WriteBytesFromIterator(writer, iter)
multiple times until it returns a completed state (could set the buffer to NULL or return e.g. integer 1 to indicate that it’s done). State of the C level callback would have to be managed by the caller.
Here’s your bytes example using the suggested alternative API without buf
:
source_buf = PyBytes_AS_STRING(self);
PyBytesWriter *writer = PyBytesWriter_Create(slicelength);
if (writer == NULL)
return NULL;
/* write self[start:start+slicelength] */
PyBytesWriter_WriteBytes(writer, source_buf + start, slicelength);
return PyBytesWriter_Finish(writer);
The use case of having step != 1
is fairly limited, but if you want to capture that as well, we could add e.g. PyBytesWriter_WriteBytesFromSlice(writer, src, start, stop, step)
.
Why do you want to replace it, though? The equivalent code with PyBytesWriter
would not be shorter, and it would not be faster either.
To make managing state easier for the callback, it’s probably better to add a state pointer to the call:
PyBytesWriter_WriteBytesFromIterator(writer, c_callback, c_state)
The writer would then call c_callback(&buffer, &size, c_state)
until buffer
gets set to NULL. Internally, it would monitor the available space in the writer and extend this as necessary before copying the next chunk.
Such an API would make it easy to e.g. read from databases, files or streams directly into a writer, avoiding boilerplate code.
OTOH, the already sketched API could also be used directly by the caller, but then extending the writer buffer and error handling would have to be done for each such use case.
Anyway, this is just a side idea.
The main point I wanted to make is that you don’t need to expose the internal buffer in order to have a well performing API - after all, the whole point of these writers is to hide away the implementation details and possible side effects (e.g. the bytes implementation may decide to move the internal buffer somewhere else or resize it between calls).
This would be Victor’s API without the *buf
parameter and adding the new slice and fill byte padding API:
typedef struct PyBytesWriter PyBytesWriter;
PyAPI_FUNC(void*) PyBytesWriter_Create(
PyBytesWriter **writer,
Py_ssize_t alloc);
PyAPI_FUNC(Py_ssize_t) PyBytesWriter_GetRemaining(
PyBytesWriter *writer);
PyAPI_FUNC(void*) PyBytesWriter_Extend(
PyBytesWriter *writer,
Py_ssize_t extend);
PyAPI_FUNC(void) PyBytesWriter_Discard(
PyBytesWriter *writer);
PyAPI_FUNC(PyObject*) PyBytesWriter_Finish(
PyBytesWriter *writer);
PyAPI_FUNC(int) PyBytesWriter_WriteBytes(
PyBytesWriter *writer,
const void *bytes,
Py_ssize_t size);
PyAPI_FUNC(int) PyBytesWriter_WriteFillByte(
PyBytesWriter *writer,
const void *byte,
Py_ssize_t size);
PyAPI_FUNC(int) PyBytesWriter_WriteBytesFromSlice(
PyBytesWriter *writer,
const void *src,
Py_ssize_t start,
Py_ssize_t stop,
Py_ssize_t step);
PyAPI_FUNC(int) PyBytesWriter_Format(
PyBytesWriter *writer,
const char *format,
...);
For cases where these are not enough, we could even add special APIs to get access to the current internal char *buf
, but with all the strings attached, e.g.
- a resize may render the pointer invalid
- we’d have to document which of the other APIs may render the pointer invalid (probably most to stay flexible)
- buffer overflows are easily possible
- parts of the buffer can remain uninitialized
/* Return a pointer to the current position of writer, permitting the
caller to write size bytes at that position. The writer position will
advance by size bytes after this call. Calls to other writer APIs
will render the returned pointer invalid. If the caller fails to
write new data to the claimed buffer area, the contents will remain
undefined. Use at your own risk. */
PyAPI_FUNC(void*) PyBytesWriter_WriteBuffer(
PyBytesWriter *writer,
Py_ssize_t size);
I haven’t received any replies to the proposed alternative API and Victor has closed his PR.
I still think that should have a writer for bytes as well, so would like to get some feedback on the proposal. Thanks.
I didn’t give up on the idea. But it takes a lot of time to design and implement such API. I may give a new try later.
Ok, here is a counterproposal API.
API
typedef struct PyBytesWriter PyBytesWriter;
PyAPI_FUNC(PyBytesWriter *) PyBytesWriter_Create(
Py_ssize_t prealloc);
PyAPI_FUNC(PyObject*) PyBytesWriter_Finish(
PyBytesWriter *writer);
PyAPI_FUNC(void) PyBytesWriter_Discard(
PyBytesWriter *writer);
// High level methods
PyAPI_FUNC(int) PyBytesWriter_WriteBytes(
PyBytesWriter *writer,
const void *bytes,
Py_ssize_t size);
PyAPI_FUNC(void*) PyBytesWriter_Format(
PyBytesWriter *writer,
const char *format,
...);
// Alloc/Extend API using void*
PyAPI_FUNC(void*) PyBytesWriter_Alloc(
PyBytesWriter *writer,
Py_ssize_t alloc);
PyAPI_FUNC(void*) PyBytesWriter_Extend(
PyBytesWriter *writer,
void *buf,
Py_ssize_t extend);
PyAPI_FUNC(int) PyBytesWriter_Truncate(
PyBytesWriter *writer,
void *buf);
Changes: Now only Alloc/Extend/Truncate()
functions use void*
; Finish()
has a single writer parameter.
PyBytesWriter_Extend()
extends bytes allocated by PyBytesWriter_Alloc()
and returns an updated buffer to the current position. It takes a buffer (void*
) argument which is used as the current position.
PyBytesWriter_Truncate()
sets the writer size from a buffer (void*
). It avoids having to store the buffer start which doesn’t work with PyBytesWriter_Extend()
: if you store PyBytesWriter_Alloc()
result as the start, PyBytesWriter_Extend()
can move the start (call realloc()
) making the previous start a dangling pointer (!). So well, use PyBytesWriter_Truncate()
, it just works as expected (see examples below).
Truncate()
example
PyBytesWriter_Truncate()
example, creating the string "Hello World"
. Allocate too many bytes, and then truncate the string:
PyObject*
truncate_example(void)
{
PyBytesWriter *writer = PyBytesWriter_Create(0);
if (writer == NULL) {
return NULL;
}
// Allocate too many bytes
char *buf = PyBytesWriter_Alloc(writer, 20);
if (buf == NULL) {
goto error;
}
// Write some bytes
memcpy(buf, "Hello World", strlen("Hello World"));
buf += strlen("Hello World");
// Truncate to len("Hello World") bytes
if (PyBytesWriter_Truncate(writer, buf) < 0) {
goto error;
}
return PyBytesWriter_Finish(writer);
error:
PyBytesWriter_Discard(writer);
return NULL;
}
Extend()
example
PyBytesWriter_Extend()
example, creating the “Hello World” string. PyBytesWriter_Extend()
adds bytes to PyBytesWriter_Alloc()
: it takes a void*
buffer.
PyObject*
extend_example(void)
{
PyBytesWriter *writer = PyBytesWriter_Create(0);
if (writer == NULL) {
return NULL;
}
// Allocate 10 bytes
char *buf = PyBytesWriter_Alloc(writer, 10);
if (buf == NULL) {
goto error;
}
// Write some bytes
memcpy(buf, "Hello ", strlen("Hello "));
buf += strlen("Hello ");
// Allocate 10 more bytes
buf = PyBytesWriter_Extend(writer, buf, 10);
if (buf == NULL) {
goto error;
}
// Write more bytes
memcpy(buf, "World", strlen("World"));
buf += strlen("World");
// Truncate to len("Hello World") bytes
if (PyBytesWriter_Truncate(writer, buf) < 0) {
goto error;
}
return PyBytesWriter_Finish(writer);
error:
PyBytesWriter_Discard(writer);
return NULL;
}
Alloc()
example
When the output length is known in advance, the code can be simpler. Example creating the string "abc"
:
PyObject*
alloc_example(void)
{
PyBytesWriter *writer = PyBytesWriter_Create(3);
if (writer == NULL) {
return NULL;
}
char *buf = PyBytesWriter_Alloc(writer, 3);
if (buf == NULL) {
PyBytesWriter_Discard(writer);
return NULL;
}
memcpy(buf, "abc", 3);
return PyBytesWriter_Finish(writer);
}
It looks as confusing and weird as the previous proposal. I don’t understand why you insist so much on passing pointers instead of sizes.
What happens if PyBytesWriter_Finish
is unable to create the bytes
instance?
Presumably it would return NULL
, but would the PyBytesWriter
instance still exist, or would it always be discarded?
In the Extend()
example, PyBytesWriter_Alloc
returns a pointer to the buffer.
You can write data into the buffer, advancing the pointer as you go.
You pass the final pointer into PyBytesWriter_Truncate
, telling it “this is how far I’ve got in the buffer”.
You don’t have to keep track yourself how much you’ve written or remember where you started in order to calculate that.
If it’s important “not to keep track” of those things, then PyBytesWriter
can expose API functions to query them. They can be implemented as inline
functions if you’re worried about performance.
PyByteArray_Resize
doesn’t return a pointer to the newly-allocated buffer. Nobody is complaining that they have to “keep track” of the length and pointer, because you can just call the corresponding functions/macros for that (such as PyByteArray_AsString
).
The alternative to Truncate() is to have a SetSize() function. For example, replace:
if (PyBytesWriter_Truncate(writer, ascii_data) < 0) {
goto error;
}
with:
if (PyBytesWriter_SetSize(writer, ascii_data - ascii_start) < 0) {
goto error;
}
In the simple Alloc() case, ascii_start can be created like that:
unsigned char *ascii_start = PyBytesWriter_Alloc(writer, out_len);
if (ascii_start == NULL) {
goto error;
}
unsigned char *ascii_data = ascii_start;
The problem is more the more complex Extend() case. Extend() doesn’t return the start pointer, but the current position. It means that a new function should be added. Something like:
char *start = PyBytesWriter_Start(writer);
if (PyBytesWriter_SetSize(writer, ascii_data - start) < 0) {
goto error;
}
All code creating bytes objects in the Python code base use a pointer to write to the current position. PyBytesWriter API is designed around this pointer.
For me, it seems more convenient, easy and less error-prone to have a Truncate() function which takes a pointer to the current position.
In practice, code using the PyBytesWriter API is splitted into sub-functions. It’s convenient to only have to pass a pointer to track the current position, without keeping track of the written size.
So you would prefer a PyBytesWriter_Start() + PyBytesWriter_SetSize() combination? Well, I don’t like that
Note: I would prefer to hide implementation details to be able to maybe add the API to the stable ABI later. So I would prefer to not use inline
functions.
On error, PyBytesWriter_Finish()
returns NULL
. In any case, you must not use the writer after calling PyBytesWriter_Finish()
: it’s always discarded. Documentation:
.. c:function:: PyObject* PyBytesWriter_Finish(PyBytesWriter *writer, void *buf)
Finish a :c:type:`PyBytesWriter` created by :c:func:`PyBytesWriter_Create`.
On success, return a Python :class:`bytes` object.
On error, set an exception and return ``NULL``.
The writer instance is invalid after the call.
No, you just need a FinishWithSize function.
Well, that’s your API choice.
Thanks for the updated proposal, but I still don’t understand why you insist on exposing the underlying buffer.
Just like for the Unicode writer, it should be enough to provide fast APIs for writing to the internal buffer and we can add more if needed. There should be no need to expose the internal buffer directly.
That way, we can make changes to the internals as needed in the future, e.g. a bytes implementation which applies extra locking for the buffer, to be able to write to it from multiple threads (or even interpreters).
The PyBytesWriter_WriteBuffer()
I have in my proposal really is “use at your own risk” and I would rather not have it at all.
You might say: In some cases, I may need to revert a write and go back to a previous position of the writer. We could add special APIs to cover that use case as well, e.g. PyBytesWriter_Tell()
and PyBytesWriter_Seek(new_position)
(or perhaps just a dedicated PyBytesWriter_MoveBack(number_of_bytes)
).
I quite like having access to the underlying buffer, but I tend to wrap up a lot of OS APIs that require a call to get the buffer size, which I then allocate and pass back into a subsequent call to have it be filled.
I’ve actually taken to doing these with C++ strings (or bytearray
if I’m in Cython) because it’s so easy and reliable compared to handling the allocation manually. But since they then usually get moved into a bytes object, being able to use this API to allocate an in-place buffer for me would be convenient as well as efficient.
I believe the point of this API is to optimise the creation case, when you know that no references exist outside of the local function/context/whatever. Just like was the case for the Unicode writer. If that’s not explicit, I think we should say that it’s explicit and just scope ourselves out of multithreaded writers for this API.