Add zero-copy conversion of `bytearray` to `bytes` by providing `__bytes__()`

Hello, I just found the commit "gh-139871: Add `bytearray.take_bytes([n])` to efficiently extract `by…" (python/cpython@732224e) and saw that so many of the changes I had been making on my own were already being implemented! This work is so exciting!
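For anyone skimming, here is a minimal sketch of how calling code might use that method while still running on interpreters that lack it. The helper name `take_all` is made up for illustration, and the fallback branch is an ordinary copy, not zero-copy; I am assuming `take_bytes()` with no argument returns everything and leaves the bytearray empty, as the commit describes.

```python
def take_all(buf: bytearray) -> bytes:
    """Move the whole contents of `buf` into an immutable bytes object.

    `take_all` is a hypothetical helper, not part of any library.
    """
    take = getattr(buf, "take_bytes", None)
    if take is not None:
        # Interpreters that ship bytearray.take_bytes() (per gh-139871).
        return take()
    # Fallback for older interpreters: this path copies the data.
    data = bytes(buf)
    buf.clear()
    return data


buf = bytearray(b"GET / HTTP/1.1\r\n")
line = take_all(buf)
```

After the call, `line` holds the bytes and `buf` is empty on either path, so the caller can keep reusing `buf` as a write buffer.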

I am not sure of the best venue to mention this, but I have been working for much of last year on path strings and VFS (filesystem) operations in std::{fs,path} in the Rust stdlib. In particular, the getdents() libc call provides much greater atomicity guarantees and has been standardized by POSIX as of 2024 (see "Add musl and glibc bindings for getdents{,64}", Pull Request #4522 in rust-lang/libc); it was immediately supported in musl libc.

This thread investigates the goal of zero-copy I/O, which I’m just delighted to see. There are a few avenues I’ve been investigating along these lines:

  1. First of all, the getdents() libc call introduces a very peculiar set of alignment and lifetime constraints: https://github.com/rust-lang/rust/issues/43467#issuecomment-3741642799

    • I tried to describe how the buffer provided to getdents() has very specific alignment requirements, while neither the size nor the field layout of each entry in the buffer is known in advance. I also noted that the OS writing to your memory through the syscall introduces a type of ABI.
    • CPython will have a somewhat easier time with this, using ref counting to explicitly mark those lifetimes.
    • My current attempt at this in the Rust stdlib is the getdents-fs-read_dir branch (rust-lang:main...cosmicexplorer:getdents-fs-read_dir in rust-lang/rust).
      • This does not quite work yet, but it demonstrates one way to handle these complex lifetimes.
  2. On the subject of buffers: readlink{,at}() is sometimes done with an allocating loop, as in the Rust stdlib.

    • If you examine the spec, you can do it without any new allocations: Mutilated in any way.
    • This logic is plugged into an allocating loop later in the file with ops::ControlFlow.
  3. SIMD operation checks in the configure script for byte scanning: the byte-set-splitting branch (python:main...cosmicexplorer:byte-set-splitting in python/cpython).
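To make item 1 a bit more concrete, here is a rough Python sketch of walking the variable-length records in a buffer that getdents64() has filled. It assumes the Linux `struct linux_dirent64` layout (`d_ino`, `d_off`, `d_reclen`, `d_type`, then a NUL-terminated name); other platforms lay this out differently. The names `DirEntry` and `iter_dirents` are made up for illustration, and the syscall itself is not shown because the Python stdlib has no wrapper for it.

```python
import struct
from typing import Iterator, NamedTuple

# Assumed Linux `struct linux_dirent64` header:
# d_ino (u64), d_off (i64), d_reclen (u16), d_type (u8), then d_name.
_HEADER = struct.Struct("=QqHB")


class DirEntry(NamedTuple):  # hypothetical record type for this sketch
    ino: int
    type: int
    name: bytes


def iter_dirents(buf: memoryview, nread: int) -> Iterator[DirEntry]:
    """Yield entries from the first `nread` bytes of `buf`."""
    pos = 0
    while pos < nread:
        ino, _off, reclen, dtype = _HEADER.unpack_from(buf, pos)
        # Each record states its own length; the reader cannot know it
        # ahead of time, so validate it before trusting it.
        if reclen < _HEADER.size + 1 or pos + reclen > nread:
            raise ValueError("corrupt or truncated directory record")
        raw = buf[pos + _HEADER.size : pos + reclen]  # view slice, no copy
        name = bytes(raw).split(b"\0", 1)[0]  # copy only the name bytes
        yield DirEntry(ino, dtype, name)
        pos += reclen
```

Because this is a generator over a memoryview, the buffer stays alive (via ref counting) for as long as the iteration does, which is the lifetime point from the second bullet of item 1.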

I am absolutely not yet an expert on bits and bytes and SIMD, but I know I can make finding SIMD instructions in our configure script extremely robust. I am actually looking to do a PhD thesis on parsing and text search some day and would love to help contribute to this kind of work in any way I can.

Once again: I was overjoyed to see someone else working on this and identifying how generally useful it can be for perf. I have two specific cases (url quoting and reading directory entries) where I think we can improve perf a huge amount. I also think URL parsing can be improved in this and other ways.
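As a rough illustration of the URL quoting case, here is a pure-Python sketch of the byte-set scan idea: check whether any byte falls outside a "safe" set, and only build a new string when one does. This is not SIMD and not the code from my branch; `quote_bytes` is a made-up name, and I am assuming the safe set is the RFC 3986 unreserved characters.

```python
import string

# Assumption: only RFC 3986 "unreserved" characters stay unquoted.
_SAFE = (string.ascii_letters + string.digits + "-._~").encode("ascii")

# 256-entry lookup table: each byte value maps to its quoted form.
_TABLE = [chr(b) if b in _SAFE else f"%{b:02X}" for b in range(256)]


def quote_bytes(data: bytes) -> str:
    """Percent-encode every byte of `data` outside the safe set."""
    # Byte-set scan: delete all safe bytes; if nothing is left,
    # there is nothing to quote and we can decode the input directly.
    if not data.translate(None, _SAFE):
        return data.decode("ascii")
    return "".join([_TABLE[b] for b in data])
```

The scan in the fast path is exactly the kind of loop a SIMD byte-set search could speed up at the C level.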

Please let me know if any of that would be useful to investigate further! I have *not* yet prototyped using getdents() over readdir() in CPython (but getdents is just perfect for Python coroutines!!!).

Thanks!!
