See also proposed PyObject_AsObjectArray() function
See also my issue proposing to add a PyObject_AsObjectArray() function, which uses the proposed PyResource API to “close” the array (technically, it’s more of a “view” on the array).
The implementation shows how the PyResource API can be used to untrack/track an object in the GC if it was tracked, or to use a different “close callback” if it was already untracked.
The point here is not to discuss whether it’s good to untrack/track an object in the GC; it’s just to show that such an API gives more freedom in what can be done when “creating” and “closing” a resource.
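To make this concrete, here is a minimal sketch (not the code from the issue) of how such a function could pick its close callback based on the GC state. It assumes the two-field PyResource struct from the proposal; list_as_object_array, close_and_track and close_untracked are hypothetical names:

```c
#define PY_SSIZE_T_CLEAN
#include <Python.h>

// PyResource as proposed: a pointer plus a callback that releases it.
typedef struct {
    void (*close_func)(void *data);
    void *data;
} PyResource;

// Close callback when the list was GC-tracked: re-track it, then
// release the reference held by the "view".
static void
close_and_track(void *data)
{
    PyObject *obj = (PyObject *)data;
    PyObject_GC_Track(obj);
    Py_DECREF(obj);
}

// Close callback when the list was already untracked: just drop the ref.
static void
close_untracked(void *data)
{
    Py_DECREF((PyObject *)data);
}

// Hypothetical helper in the spirit of PyObject_AsObjectArray():
// expose a list's internal PyObject* array as a "view".
static int
list_as_object_array(PyObject *list, PyObject ***array, PyResource *res)
{
    if (!PyList_Check(list)) {
        PyErr_SetString(PyExc_TypeError, "expected a list");
        return -1;
    }
    Py_INCREF(list);
    *array = ((PyListObject *)list)->ob_item;
    res->data = list;
    if (PyObject_GC_IsTracked(list)) {
        PyObject_GC_UnTrack(list);       // keep the GC away while exposed
        res->close_func = close_and_track;
    }
    else {
        res->close_func = close_untracked;  // different close callback
    }
    return 0;
}
```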
I’m worried that it’s too much like Py_buffer. AFAICS, the difference in use cases is that for PyResource the data format is known, and doesn’t have to be described.
Is that enough to justify a simpler struct (which will presumably be faster, since it doesn’t have to initialize as much data)?
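For comparison, this is Py_buffer as defined in CPython’s headers; a PyResource along the lines sketched above has only two fields to fill in:

```c
// Py_buffer as defined in CPython: eleven fields, most of which must be
// initialized by the exporter even when the consumer only wants `buf`.
typedef struct {
    void *buf;
    PyObject *obj;        /* owned reference */
    Py_ssize_t len;
    Py_ssize_t itemsize;
    int readonly;
    int ndim;
    char *format;
    Py_ssize_t *shape;
    Py_ssize_t *strides;
    Py_ssize_t *suboffsets;
    void *internal;
} Py_buffer;
```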
One change that might be worth it is to add a separate field for the argument to close_func, so that it doesn’t have to be the same as data.
Why? I imagine this will usually be implemented in one of two ways (see the sketch after this list):
1. Getting data from objects that happen to have an immutable copy of it in the correct format:
   - retrieving the PyResource increfs the exporting object
   - data points to the buffer
   - close_func is Py_DecRef
   - close_arg is the object
2. For other objects:
   - retrieving the PyResource allocates a buffer and fills it
   - data is that new buffer
   - close_func is the corresponding free
   - close_arg is data
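A hedged sketch of both patterns, assuming a hypothetical PyResourceEx variant with the extra close_arg field (utf8_view and copied_view are made-up names):

```c
#define PY_SSIZE_T_CLEAN
#include <Python.h>
#include <string.h>

// Hypothetical variant of PyResource with a separate close_arg field.
typedef struct {
    void *data;                     // pointer handed to the consumer
    void (*close_func)(void *arg);  // called when the resource is closed
    void *close_arg;                // argument passed to close_func
} PyResourceEx;

// Thin wrapper so Py_DecRef fits the callback signature cleanly.
static void
decref_close(void *arg)
{
    Py_DECREF((PyObject *)arg);
}

// Pattern 1: a str already owns an immutable UTF-8 copy of its data,
// so the "view" just increfs the exporting object.
static int
utf8_view(PyObject *str, const char **utf8, PyResourceEx *res)
{
    const char *s = PyUnicode_AsUTF8(str);  // cached inside the object
    if (s == NULL) {
        return -1;
    }
    Py_INCREF(str);
    *utf8 = s;
    res->data = (void *)s;
    res->close_func = decref_close;
    res->close_arg = str;          // close_arg is the object, not data
    return 0;
}

// Pattern 2: no copy in the right format exists; allocate and fill one.
static int
copied_view(const char *src, size_t size, const char **out, PyResourceEx *res)
{
    char *buf = PyMem_Malloc(size);
    if (buf == NULL) {
        PyErr_NoMemory();
        return -1;
    }
    memcpy(buf, src, size);
    *out = buf;
    res->data = buf;
    res->close_func = PyMem_Free;  // the corresponding free
    res->close_arg = buf;          // here close_arg == data
    return 0;
}
```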
Another change would be adding a field for the length. It seems that in most uses the *Res functions must be paired with a corresponding “get length” call so the buffer can be used safely. And in case the exporting object doesn’t have the data in the correct format, an API to get the length will be tricky to implement. (And for strings, a separate length API is a footgun: you need the number of UTF-8 bytes rather than the str length…)
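To illustrate the string footgun with the existing C API (a hedged example, not the proposed API):

```c
#include <Python.h>
#include <assert.h>

// The code point count of a str is not its UTF-8 byte count, so a
// separate "get length" call is easy to misuse as a buffer length.
static int
length_footgun(void)
{
    PyObject *s = PyUnicode_FromString("h\xc3\xa9llo");  // "héllo"
    if (s == NULL) {
        return -1;
    }
    Py_ssize_t codepoints = PyUnicode_GetLength(s);      // 5 code points
    Py_ssize_t nbytes = 0;
    const char *utf8 = PyUnicode_AsUTF8AndSize(s, &nbytes);  // 6 bytes
    if (utf8 == NULL) {
        Py_DECREF(s);
        return -1;
    }
    // Using `codepoints` as the buffer length would drop the last byte;
    // a len field filled when the resource is acquired avoids the trap.
    assert(codepoints == 5 && nbytes == 6);
    Py_DECREF(s);
    return 0;
}
```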
Those additions would push PyResource closer toward the complexity of Py_buffer. Will we still want to maintain a parallel API for the simpler case?
A shortcoming in this design is that in some uses the data will be copied twice (see the sketch after this list). This will happen if:
- retrieving the PyResource must allocate a buffer, and
- the consumer needs it in a different buffer (for example, a bytes object)
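A hedged sketch of the double copy; get_data_res is a made-up getter, and PyResource_Close() is the close function described in the proposal:

```c
// Hypothetical consumer: obj's data must end up in a bytes object.
static PyObject *
as_bytes(PyObject *obj)
{
    PyResource res;
    const char *data;
    Py_ssize_t size;
    // Copy #1: the getter has to allocate and fill a buffer because
    // obj does not store the data in the needed format.
    if (get_data_res(obj, &data, &size, &res) < 0) {
        return NULL;
    }
    // Copy #2: the consumer copies it again into a bytes object.
    PyObject *bytes = PyBytes_FromStringAndSize(data, size);
    PyResource_Close(&res);  // frees the intermediate buffer
    return bytes;
}
```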
But most of the uses of PyResource involve small data, like function names, so maybe it’s not worth it to overengineer for this case. Types with bigger data can always provide ad-hoc “copy into” functions.
I agree. If I come back to this topic, I will consider investigating the use of Py_buffer to expose an array of Python objects (PyObject**). Apparently, some projects already use Py_buffer for that!
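For reference, a minimal sketch of consuming such a buffer; it assumes the exporter uses format "O" with one PyObject* per item (sum_of_hashes is a made-up name):

```c
#include <Python.h>
#include <string.h>

// Read an array of PyObject* exposed through the buffer protocol.
static int
sum_of_hashes(PyObject *exporter, Py_hash_t *result)
{
    Py_buffer view;
    if (PyObject_GetBuffer(exporter, &view, PyBUF_FORMAT | PyBUF_ND) < 0) {
        return -1;
    }
    // Verify the exporter really provides object pointers.
    if (view.format == NULL || strcmp(view.format, "O") != 0
        || view.itemsize != (Py_ssize_t)sizeof(PyObject *)) {
        PyBuffer_Release(&view);
        PyErr_SetString(PyExc_TypeError, "expected a buffer of objects ('O')");
        return -1;
    }
    PyObject **items = (PyObject **)view.buf;
    Py_hash_t acc = 0;
    for (Py_ssize_t i = 0; i < view.len / view.itemsize; i++) {
        Py_hash_t h = PyObject_Hash(items[i]);
        if (h == -1 && PyErr_Occurred()) {
            PyBuffer_Release(&view);
            return -1;
        }
        acc += h;
    }
    *result = acc;
    PyBuffer_Release(&view);
    return 0;
}
```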