PEP 688: Making the buffer protocol accessible in Python

I just gave a PyCon lightning talk on PEP 688, presenting two options for supporting typing for the buffer protocol:

  1. Adding a new __buffer__(flags) dunder method

This would work similarly to PyPy’s __buffer__ method. We’d map this dunder to the bf_getbuffer slot in Python; Python objects would implement it by returning a memoryview.

Open question: What about the bf_releasebuffer slot?

  2. Adding an __isbuffer__ = True attribute to buffer objects

This is simpler and avoids having to deal with more of the complexities of the buffer protocol. However, this behavior would be unlike any other dunder, and it may be confusing to users if they set the field on a Python class and the class doesn’t actually become a buffer.


I like option 1 best, but I’d like to make sure it works well with the C buffer protocol.

To Serhiy’s point, I think the documentation is often a bit vague about terms like “buffer” or “sequence”. (Is a “sequence” a collections.abc.Sequence, or just a class that accepts ints in __getitem__, or something in between?) I would like to restrict the term buffer to “supports the buffer protocol”, and use more precise terms for the other possibilities.
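The ambiguity around “sequence” can be seen in a small sketch. The class name Squares is hypothetical; it only accepts ints in __getitem__, so iteration works through the old index-based fallback, yet it is not a collections.abc.Sequence:

```python
from collections.abc import Sequence


class Squares:
    """Hypothetical class that only accepts ints in __getitem__."""

    def __getitem__(self, index: int) -> int:
        if not 0 <= index < 5:
            raise IndexError(index)
        return index * index


squares = Squares()

# Iteration falls back to calling __getitem__ with 0, 1, 2, ... until IndexError.
as_list = list(squares)

# The class neither inherits from nor registers with the Sequence ABC.
is_abc_sequence = isinstance(squares, Sequence)
```

So an object can behave like a sequence in some contexts without satisfying the ABC, which is the kind of gap the terminology should make explicit.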

I’ve written typesheds for MicroPython (hlovatt/PyBoardTypeshed on GitHub: interface stubs, a.k.a. `pyi` files and type hints, for MicroPython) and this would be a great help because it will allow custom buffer types. I currently use:

AnyReadableBuf: Final = TypeVar("AnyReadableBuf", bytearray, array, memoryview, bytes)

AnyWritableBuf: Final = TypeVar("AnyWritableBuf", bytearray, array, memoryview)

This brings me to a second point: distinguishing between read-only and read-write buffers is common in MicroPython and would be a valuable addition.

I would also suggest the names AnyReadableBuf and AnyWritableBuf, to be consistent with AnyStr.
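For context, the read-only versus writable distinction is already visible at runtime through memoryview.readonly, even though the type system cannot express it. A minimal sketch, where is_writable is a hypothetical helper name:

```python
from array import array


def is_writable(obj) -> bool:
    """Return True if obj exports a writable buffer (hypothetical helper)."""
    # The context manager releases the view as soon as the check is done.
    with memoryview(obj) as view:
        return not view.readonly


bytes_writable = is_writable(b"abc")                   # bytes is immutable
bytearray_writable = is_writable(bytearray(b"abc"))    # bytearray is mutable
array_writable = is_writable(array("B", [1, 2, 3]))    # array is mutable
```

A memoryview itself can be either, depending on the object it wraps, which is part of why a static distinction is hard to express.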

Thanks for the feedback! I’ll continue to try to think of ways to support writability in an elegant way.

(Also, a constrained TypeVar doesn’t make sense for this use case. We can talk about this further in the python/typing Discussions on GitHub if you like.)
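One possible reading of that remark, offered as an assumption rather than as the poster’s actual proposal: a constrained TypeVar is meant to tie an argument type to a return type or another argument, whereas a parameter that simply accepts any of several types is usually written as a union. A sketch with the hypothetical alias ReadableBuf and helper byte_length:

```python
from array import array
from typing import Union

# Hypothetical alias: a plain union instead of a constrained TypeVar.
ReadableBuf = Union[bytes, bytearray, array, memoryview]


def byte_length(buf: ReadableBuf) -> int:
    """Return the number of bytes exposed by the buffer."""
    with memoryview(buf) as view:
        return view.nbytes
```

This still cannot describe custom buffer types, which is the gap the proposal in this thread is meant to close.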

Sadly, I missed your lightning talk.

Is this instead of the Buffer type you’re proposing in PEP 688, or in addition?

Could you show a complete example?

We can talk about this further in the python/typing Discussions on GitHub if you like.

Is there an existing topic, or are you proposing to start one? If there is a better solution than a constrained type, I’m all for it.

Please open a new topic.

This would replace current PEP 688’s types.Buffer. I’ll write out some complete explanations.

Option 1: __buffer__ special method

This will allow implementing buffer types in Python too, so it’s also a significant non-typing change.

  • Buffer types implemented in C automatically get a __buffer__ method exposed in Python. It takes a flags: int argument and returns a memoryview wrapping the Py_buffer object returned by the underlying slot.
  • flags is the same as in C, an OR of various fields documented in the Buffer Protocol section of the Python 3.11.0a7 documentation. For convenience, perhaps we should expose those flags in the stdlib somewhere (a types.BufferFlags enum?).
  • Types implemented in Python that define a __buffer__ method automatically get it mapped to the bf_getbuffer slot. They will then be usable as buffers (e.g., they can be passed
  • Not sure yet how this affects the bf_releasebuffer slot.
  • To check for buffers in typeshed or elsewhere, we can now simply define a Protocol with def __buffer__(self, flags: int) -> memoryview: ....
  • For convenience, we can add a typing.SupportsBuffer protocol defining this method. (Or it can go into collections.abc?)
  • For backporting, we can add typing_extensions.Buffer, and we can lie in typeshed that the __buffer__ method existed before 3.12.
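To make the flags idea from the list above concrete, here is a rough sketch of what an enum of buffer flags could look like. The class is purely illustrative: the name BufferFlags and its location are only floated as a question in this thread, only a small subset of flags is shown, and the numeric values are assumed to mirror CPython’s PyBUF_* macros.

```python
from enum import IntFlag


class BufferFlags(IntFlag):
    """Illustrative subset; values assumed to mirror CPython's PyBUF_* macros."""

    SIMPLE = 0x0
    WRITABLE = 0x1
    FORMAT = 0x4
    ND = 0x8


# A consumer ORs together the features it needs, as it would in C.
requested = BufferFlags.WRITABLE | BufferFlags.FORMAT

# An exporter can then test individual bits of the request.
wants_writable = bool(requested & BufferFlags.WRITABLE)
wants_shape = bool(requested & BufferFlags.ND)
```

Using IntFlag keeps the values interchangeable with the plain int that flags: int expects.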

Some code samples:

# typeshed builtins.pyi
class bytes:
    def __buffer__(self, flags: int) -> memoryview: ...

# typeshed typing.pyi
class SupportsBuffer(Protocol):
    def __buffer__(self, flags: int) -> memoryview: ...

# user code
from typing import SupportsBuffer

def need_buffer(bf: SupportsBuffer):
    memoryview(bf)

class MyBuffer:
    def __buffer__(self, flags: int) -> memoryview:
        return memoryview(b"hi")

need_buffer(MyBuffer())  # works

Option 2: __isbuffer__ = True attribute

This will allow checking for buffer types through a protocol, but not defining them in Python.

  • Buffer types implemented in C automatically expose an attribute __isbuffer__ = True.
  • If a Python type sets this attribute, nothing happens, except that it’s now lying about being a buffer.
  • As with Option 1, we can use Protocols to check for buffers, and add a typing.SupportsBuffer protocol for convenience.
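The “lying about being a buffer” case can be sketched as follows. FakeBuffer and can_export_buffer are hypothetical names, and __isbuffer__ is only the attribute proposed here, not something the interpreter looks at:

```python
class FakeBuffer:
    """Hypothetical Python class that merely sets the proposed attribute."""

    __isbuffer__ = True


def can_export_buffer(obj) -> bool:
    """Check whether obj actually supports the buffer protocol."""
    try:
        memoryview(obj).release()
    except TypeError:
        return False
    return True


fake = FakeBuffer()
claims_buffer = getattr(fake, "__isbuffer__", False)  # what a protocol check would see
really_buffer = can_export_buffer(fake)               # what memoryview() reports
```

A structural check on the attribute would accept FakeBuffer, while any real consumer of the buffer protocol would reject it.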

How useful would it be for a Python object to define a __buffer__() method except in something like a Mock?

It seems that calling b.__buffer__() does the same thing as memoryview(b) except for something with the flags. If we only cared about the runtime behavior we could just add the flags to the memoryview() constructor? It seems that the relationship between __buffer__ and memoryview is similar to that between __len__ and len().
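The analogy can be sketched like this. The Box class is hypothetical, and the final step only runs on interpreters that expose a __buffer__ method as described in this thread (an assumption about the runtime, with flags=0 taken to mean a simple request):

```python
class Box:
    """Hypothetical class implementing only __len__."""

    def __len__(self) -> int:
        return 3


length = len(Box())        # the builtin delegates to Box.__len__

data = bytearray(b"hi")
view = memoryview(data)    # the builtin asks the object for its buffer
first_byte = view[0]

# Guarded: the dunder is only present where the proposal is implemented.
dunder_view = data.__buffer__(0) if hasattr(data, "__buffer__") else None
```

Where the dunder exists, both routes are expected to expose the same bytes; the flags argument is the only extra input.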

The bf_releasebuffer slot is called by memoryview(b).release() and after that the memory view is no longer usable (most operations just give errors).
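The observable effect of releasing a view can be sketched with a bytearray, which refuses to resize while a view on it is outstanding:

```python
data = bytearray(b"abc")
view = memoryview(data)

try:
    data.extend(b"d")      # resizing is refused while the buffer is exported
    resized_while_exported = True
except BufferError:
    resized_while_exported = False

view.release()             # hands the buffer back to the exporter
data.extend(b"d")          # now allowed

try:
    view[0]                # a released view rejects most operations
    usable_after_release = True
except ValueError:
    usable_after_release = False
```

This is the bookkeeping that a Python-level counterpart to bf_releasebuffer would have to account for.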

I like type checks using the presence of a method (i.e., __buffer__) better than checks for a data field (__isbuffer__).

I presume the part of PEP 688 about replacing arg: bytes with arg: Buffer is also invalidated? It just doesn’t mean the same thing. If we want to do something about arg: bytes implying arg: bytes|bytearray we should think harder IMO. I could easily be convinced that memoryview doesn’t belong in that union though.

It is important that bf_releasebuffer is set to NULL in bytes and to non-NULL in mutable types. Some C code only accepts types with this slot set to NULL.

Thanks, that’s an important detail that I forgot. Is it important enough to distinguish between Buffer and MutableBuffer?