Memoryview.index() 860x slower than bytes.index()

oscarbenjamin · November 20, 2025, 9:53pm

Sometimes profile results can be surprising, but looking at the code, the orders-of-magnitude timing difference is hardly surprising. It turns every byte, one by one, into a heap-allocated Python object just to compare it with another Python object and then deallocate it. If you wanted to make this faster, you would convert the input to a C type at the start and then do all the comparisons with C types. That would not be difficult; it just needs more code to handle the different possible input types and element sizes.
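To make the contrast concrete, here is a rough pure-Python sketch. It is not the CPython implementation: `index_elementwise` and `index_via_bytes` are hypothetical names I made up, the first is only an analogue of the one-object-per-element loop, and the second is a workaround that assumes a 1-D view with format `'B'` and pays for one copy via `tobytes()`.

```python
def index_elementwise(view, value):
    """Analogue of the slow path: every element is turned into a Python
    object and compared with the target object, one at a time."""
    for i, item in enumerate(view):
        if item == value:
            return i
    raise ValueError("value not found")


def index_via_bytes(view, value):
    """Workaround sketch (assumption: 1-D view of unsigned bytes).

    Copies the view's contents into a bytes object and lets bytes.index()
    do the search over raw bytes in C.
    """
    if view.ndim != 1 or view.format != "B":
        raise TypeError("sketch only handles 1-D views with format 'B'")
    # bytes.index() accepts an int in range(256) as the value to find.
    return view.tobytes().index(value)


data = bytes(range(256)) * 4
view = memoryview(data)

# Both strategies agree on the result; only the cost per element differs.
assert index_elementwise(view, 200) == index_via_bytes(view, 200) == 200
```

Timing the two with `timeit` on a large buffer should show the same kind of gap, although the exact ratio will depend on the Python version and the data.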

It is acknowledged in the PR that it is “probably a bit slower”, which seems a bit understated, but the choice is understandable if speed was not a priority.