Nineteendo
(Nice Zombies)
September 17, 2024, 8:34pm
21
Peter Bierma:
However, I do see a separate cause for the slowdown: your version isn’t exactly the same as the built-in version. You’re not using the private API, but _json is, so you lose any internal optimization benefits that CPython might be getting (such as PGO, as previously mentioned). Similarly, private APIs generally don’t have as many checks as the public API, so they are slightly faster.
I’m only seeing a big difference for the decoder, for which only these private functions were swapped:
#define _Py_EnterRecursiveCall Py_EnterRecursiveCall
#define _Py_LeaveRecursiveCall Py_LeaveRecursiveCall
The thing is, they’re only called once per list of 65,536 booleans in the benchmark.
I could try without them, but I kind of doubt we’ll see a big improvement:
| decode | json | jsonc | unit (μs) |
|---|---|---|---|
| List of 65,536 booleans | 1.00 | 1.45 | 1147.94 |
| List of 4,096 strings | 1.00 | 0.61 | 1686.16 |
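For anyone who wants to reproduce numbers of this shape, here is a rough sketch of how such a relative timing could be taken with `timeit`. It is not the benchmark script used for the table: the helper names (`make_payloads`, `best_time_us`, `compare`) and the element values are assumptions, and only the stdlib decoder is registered, so the extension's own `loads` would have to be added to `decoders` by hand.

```python
import json
import timeit


def make_payloads():
    """Build JSON documents shaped like the two rows of the table.

    The element values are assumptions; the thread only gives the counts.
    """
    return {
        "List of 65,536 booleans": json.dumps([True, False] * 32_768),
        "List of 4,096 strings": json.dumps(["example string"] * 4_096),
    }


def best_time_us(decode, payload, repeat=5, number=20):
    """Return the fastest observed per-call time in microseconds."""
    timer = timeit.Timer(lambda: decode(payload))
    return min(timer.repeat(repeat=repeat, number=number)) / number * 1e6


def compare(decoders, payloads):
    """Time every decoder on every payload, relative to the first decoder."""
    rows = {}
    for label, payload in payloads.items():
        times = {name: best_time_us(fn, payload) for name, fn in decoders.items()}
        baseline = next(iter(times.values()))
        rows[label] = {name: t / baseline for name, t in times.items()}
    return rows


if __name__ == "__main__":
    # Only the stdlib decoder is registered; add the extension's loads here.
    decoders = {"json": json.loads}
    for label, ratios in compare(decoders, make_payloads()).items():
        print(label, {name: round(r, 2) for name, r in ratios.items()})
```

Taking the minimum over several repeats is a common way to reduce scheduling noise, but ratios can still shift between machines and builds.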
barry-scott
(Barry Scott)
September 18, 2024, 1:21pm
22
Peter Bierma:
Trying to go through a profiler solely meant for C code seems like the wrong approach here. You’ll get all sorts of noise from the eval loop (namely, it will appear that 99% of the program is stuck in _PyEval_EvalFrameDefault), which is going to make it very difficult to find where the slowdown actually is.
Sometimes that “noise” is where the clue to a problem is.
Comparing the two runs would show that different Python APIs are being used.
But it looks like you found an important difference.
aisk
(AN Long)
September 22, 2024, 4:16pm
23
Another choice is py-spy, which can profile Python code and C extension code, and works on macOS with just a simple pip install command: GitHub - benfred/py-spy: Sampling profiler for Python programs
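As a sketch of what could be fed to it: a sampling profiler needs the process to stay busy long enough to collect samples, so a small driver that decodes the same document in a loop is usually enough. The file name `workload.py`, the function names, and the iteration count below are all assumptions; the stdlib decoder stands in for whichever decoder is under test.

```python
import json


def build_document(n=65_536):
    """Return a JSON array of n booleans, like the first benchmark row."""
    return json.dumps([True, False] * (n // 2))


def run(iterations=2_000, decode=json.loads):
    """Decode the same document repeatedly so the sampler sees the decoder."""
    document = build_document()
    total = 0
    for _ in range(iterations):
        total += len(decode(document))
    return total


if __name__ == "__main__":
    run()
```

Saved as `workload.py`, this could then be recorded with `py-spy record -o profile.svg -- python workload.py`, which is the invocation pattern shown in the py-spy README.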
P403n1x87
(Gabriele N. Tornetta)
September 29, 2024, 8:55pm
24