A slightly modified benchmark may give a clearer picture:
#!/usr/bin/env python
import timeit
import sys
from urllib.parse import quote, unquote

UNQUOTED_STRING = '=' * 10000
QUOTED_STRING = '%3D' * 10000


def measure_quote():
    return quote(UNQUOTED_STRING)


def measure_unquote():
    return unquote(QUOTED_STRING)


if __name__ == '__main__':
    number = int(sys.argv[1])
    repeat = int(sys.argv[2])

    # measure the time it takes to execute quote
    quote_times = timeit.repeat(measure_quote, number=number, repeat=repeat)
    best_quote_time_avg = min(quote_times) / number

    # measure the time it takes to execute unquote
    unquote_times = timeit.repeat(measure_unquote, number=number, repeat=repeat)
    best_unquote_time_avg = min(unquote_times) / number

    print(f'quote, {number} loops, best of {repeat}: {best_quote_time_avg * 1000:.3f} msec per loop, {len(QUOTED_STRING) / 1024 / 1024 / best_quote_time_avg:.2f} MB/s')
    print(f'unquote, {number} loops, best of {repeat}: {best_unquote_time_avg * 1000:.3f} msec per loop, {len(UNQUOTED_STRING) / 1024 / 1024 / best_unquote_time_avg:.2f} MB/s')
On my machine (AMD Ryzen 7 5825U), this gives:
$> python3.12 unparse_benchmark.py 1000 10
quote, 1000 loops, best of 10: 0.263 msec per loop, 108.59 MB/s
unquote, 1000 loops, best of 10: 1.068 msec per loop, 8.93 MB/s
I think this is roughly what @Rosuav was looking for – while it doesn’t directly give a ratio of unquote time vs. full request time, it helps inform it. I’d posit that, for the most part, URL requests take much longer than a millisecond to be satisfied, and at the same time they don’t contain such degenerate inputs, so maintaining an optimised, C-written version of this function might not be worth it.
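To put the "degenerate inputs" point in perspective, here is a minimal sketch (the query string is a made-up example, not from the discussion above) timing unquote() on a realistic, mostly-unescaped query string rather than 10,000 consecutive escapes:

```python
# Sketch: time unquote() on a realistic query string. The string below is a
# hypothetical example; real-world query strings contain few percent-escapes,
# so per-call cost is microseconds, dwarfed by a multi-millisecond request.
import timeit
from urllib.parse import unquote

realistic = 'q=python%20urllib&page=2&sort=relevance'

# Best average time per call across 5 runs of 10,000 calls each.
per_call = min(timeit.repeat(lambda: unquote(realistic),
                             number=10000, repeat=5)) / 10000
print(f'{per_call * 1e6:.2f} usec per call')
```

On inputs like this, the unquote step is a vanishingly small fraction of the total request latency.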
@nggit, you mentioned in your opening message that you are worried about speed. Be mindful of these results: unquote here runs at under 10 MB/s, while JSON encoding/decoding often runs at hundreds of MB/s, if not GB/s, depending on the library you use. If you managed to optimise urllib.parse to be competitive with PHP’s implementation, you’d probably reach the order-of-magnitude performance you get today with JSON; until then you’ll be paying a hefty price.
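The gap can be checked directly with a small companion sketch (the JSON payload below is a made-up document chosen only to be comparable in size to the quoted string, not anything from the original benchmark):

```python
# Sketch: compare unquote() throughput against json.loads() on payloads of
# broadly similar size. JSON_STRING is a hypothetical document; absolute
# numbers vary by machine, but json.loads is typically far faster per byte.
import json
import timeit
from urllib.parse import unquote

QUOTED_STRING = '%3D' * 10000
JSON_STRING = json.dumps(['=' * 100] * 100)  # similarly sized JSON payload

unquote_time = min(timeit.repeat(lambda: unquote(QUOTED_STRING),
                                 number=1000, repeat=5)) / 1000
json_time = min(timeit.repeat(lambda: json.loads(JSON_STRING),
                              number=1000, repeat=5)) / 1000

print(f'unquote:    {len(QUOTED_STRING) / 1024 / 1024 / unquote_time:.2f} MB/s')
print(f'json.loads: {len(JSON_STRING) / 1024 / 1024 / json_time:.2f} MB/s')
```

Note that json.loads is backed by a C accelerator in CPython, which is exactly the kind of investment being weighed here for urllib.parse.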