How much will Cython benefit from optimizations in upcoming Python releases?

Hi, I’m using Cython to develop a library that does a lot of conversions to and from Python objects (i.e. I receive a list as input, create a typed memoryview from it, do some calculations, and convert the result back to a list). Sometimes I get a good speed-up even with the conversion overhead, but other times I get “only” something like 1.5x (relative to 3.10). My understanding of Cython and Python internals is quite limited, but from what I understand:

  • in 3.11, the Python devs specialized the bytecode, and these specializations should continue in 3.12;
  • Cython doesn’t use the bytecode, but talks directly to the Python/C API, so it shouldn’t benefit from bytecode optimizations.

I don’t know if it is possible to make a prediction about this, but I’d like to ask: will the Python/C API be targeted for optimization in upcoming releases? (Would the JIT compiler that may be introduced also improve the Python/C API?) This is relevant to me because the conversions cost a lot in my case, so if they end up slower in future releases than using Python directly, maybe I should change my optimization plan. Can something be said on this topic?

My guess is that the only answer we can give is “you will have to try it and see what happens”.
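If it helps with the "try it and see" part, here is a minimal sketch for timing the list conversions on their own, so the same script can be run under each interpreter version. It uses the standard-library `array.array` as a rough stand-in for the conversion step (an assumption: it does not measure Cython's own list-to-memoryview coercion), and `time_conversion` is a hypothetical helper name.

```python
import timeit
from array import array


def time_conversion(n=100_000, number=10, repeat=5):
    """Return rough per-call timings (seconds) for list <-> array conversions.

    Hypothetical helper: array.array is used here only as a proxy for the
    conversion cost, not as Cython's actual conversion path.
    """
    values = [float(i) for i in range(n)]
    buf = array("d", values)

    # Take the minimum over several repeats to reduce noise.
    to_array = min(timeit.repeat(lambda: array("d", values),
                                 number=number, repeat=repeat))
    to_list = min(timeit.repeat(lambda: buf.tolist(),
                                number=number, repeat=repeat))
    return {
        "list_to_array": to_array / number,
        "array_to_list": to_list / number,
    }


result = time_conversion()
print(result)
```

Comparing these numbers with the time spent in the actual calculation should show whether the conversions or the computation dominate on a given Python version.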


The Cython devs might have an idea – but I think you have the basics right. I suspect that the optimizations being developed will have little to no impact on Cython code.

Frankly, if you are only getting a 1.5x or so speed-up, then either Cython is not the best tool for the job, or you are not using it optimally.

You really should give numpy a try – Cython is developed very much with numpy in mind (that’s why it was forked from Pyrex). The really cool thing about numpy arrays is that they are both a nifty ND array object for Python and a lightweight wrapper around a C (or Fortran) array. This lets Cython work with numpy arrays with very little overhead from converting to/from Python objects.

If there’s really a good reason not to use numpy, try using an array.array instead of a list – it also gives very good performance with Cython.
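To illustrate the array.array idea, here is a minimal pure-Python sketch. The function names `to_buffer` and `scale_in_place` are hypothetical, and the loop is only a stand-in for what a Cython function taking a `double[:]` typed memoryview would do; the point is that the data lives in one contiguous C buffer that is viewed without copying.

```python
from array import array


def to_buffer(values):
    """Copy a list of floats once into a contiguous buffer of C doubles."""
    return array("d", values)


def scale_in_place(buf, factor):
    """Stand-in for a typed-memoryview kernel (hypothetical example).

    Works directly on the buffer's memory, so no intermediate list is built.
    """
    view = memoryview(buf)  # zero-copy view over the array's memory
    for i in range(len(view)):
        view[i] = view[i] * factor
    return buf


data = to_buffer([1.0, 2.0, 3.0])
scale_in_place(data, 2.0)
print(data.tolist())  # [2.0, 4.0, 6.0]
```

If callers can pass and receive the `array` itself, the list conversion happens at most once at the boundary instead of on every call.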

HTH


I gave my perspective on this on the cython-users mailing list (https://groups.google.com/g/cython-users/c/TC9Ktl0Ryf4) and suggested asking here because people here will know better what optimisations are planned. But I think everyone agrees that the basics are right.

I also agreed with the bit about “not converting lists” :wink:
