Oddity in handling params in `requests.get`

My team and I noticed an oddity in the performance of a web query against a site and I was wondering if this is something with how requests.get works.

We had been doing:

query = {'product': product}
resp = requests.get(url, params=query)

but we noticed that when we changed it to be:

url = f'{url}?{"&".join(f"{k}={urllib.parse.quote(v)}" for k, v in query.items())}'
resp = requests.get(url)

we were getting faster responses from the server.

My question is - is there something about how requests processes params in a get that would impact this?

I’d hope that it was the same HTTP request. A web search says that requests has debug logging to show what it’s doing.
On this page, search for “debug” and it shows how to turn on the debug logging in requests: Developer Interface — Requests 2.32.3 documentation
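For reference, here is a minimal sketch of the kind of setup that page describes, using only the standard library. The helper name `enable_http_debug` is made up for this example, and it assumes requests sits on top of urllib3 and `http.client`, which is how it is normally installed.

```python
import http.client
import logging


def enable_http_debug():
    """Turn on wire-level and urllib3 debug output (hypothetical helper)."""
    # http.client prints the request line and headers it sends when debuglevel > 0.
    http.client.HTTPConnection.debuglevel = 1

    # Route DEBUG-level log records to the console.
    logging.basicConfig(level=logging.DEBUG)

    # urllib3 is the connection library underneath requests; its logger
    # reports the connections and the request lines it issues.
    urllib3_log = logging.getLogger("urllib3")
    urllib3_log.setLevel(logging.DEBUG)
    urllib3_log.propagate = True
```

Call `enable_http_debug()` once, then run both variants of the `requests.get` call and compare the request lines that get printed. Separately, `resp.request.url` on the returned response shows the final URL that requests actually sent.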

We’re sending the same parameters to the same URL. In one case we handle the query encoding ourselves; in the other, requests does. AFAICT that’s the only difference, but I’m curious as to why it would impact the performance.

It cannot change the performance if what you state is true.

I’d assume that there is a subtle difference and that to see that difference you need to see the debug logs from requests.

If it’s your server you are querying then you could also check its logs.

Not our server. A vendor that we’re scraping security advisories from.

In that case use the requests debug log to find the difference.

How big a difference are we talking about? What are the numbers?


A few seconds. It’s enough for us to change our code to do the self-encoding of the query.

A few seconds per call? :anguished: How long does the plain URL version take?


The vendor doesn’t know how to write an API. It doesn’t paginate and can return thousands of results. We made this change to lower the number of timeouts (and we use a 60-second timeout).

My first thought would be that the server is caching the response. If the server is caching based on the exact URL, manually constructing the URL might be triggering cache hits, whereas using requests.get with params could be modifying the query string in a way that prevents the cache from being utilized effectively.
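To illustrate how the two query strings could differ, here is a rough sketch using only `urllib.parse`. It assumes that requests encodes `params` with `urlencode` (that is my understanding of its internals, not something confirmed in this thread), and the product name is a made-up value chosen because it contains a space and a slash.

```python
from urllib.parse import quote, urlencode


def manual_query(query):
    # Mirrors the hand-built version from the question: only the values are
    # quoted, and quote() leaves "/" alone and turns spaces into %20.
    return "&".join(f"{k}={quote(v)}" for k, v in query.items())


def urlencode_query(query):
    # urlencode() uses quote_plus by default: spaces become "+" and "/"
    # becomes %2F. Assumed here to match what requests does with params.
    return urlencode(query, doseq=True)


query = {"product": "Acme Router/X 2.0"}  # hypothetical product name
print(manual_query(query))     # product=Acme%20Router/X%202.0
print(urlencode_query(query))  # product=Acme+Router%2FX+2.0
```

For a value with no spaces or special characters, both functions return the same string, so whether this matters depends on what the product names look like.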


It also depends on how they tested performance. If you first test with requests and then test with the manual one right after, but the URL winds up being the same in either case, then it will seem faster because of the cache. You’d need to do the tests in both orders with different params to make sure it’s not just “whichever one you do first is slower”.
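Something along these lines would do it. This is only a sketch: `compare`, `fetch_a` and `fetch_b` are names invented for the example, and the two fetchers stand in for the `params=` version and the hand-built URL version of the request.

```python
import time


def time_call(fetch, value):
    """Return how long fetch(value) takes, in seconds."""
    start = time.perf_counter()
    fetch(value)
    return time.perf_counter() - start


def compare(fetch_a, fetch_b, values):
    """Time both fetchers on each value, alternating which one goes first."""
    timings = {"a": [], "b": []}
    for i, value in enumerate(values):
        order = [("a", fetch_a), ("b", fetch_b)]
        if i % 2:
            # Swap the order on every other value so that neither variant
            # always gets the (possibly cache-warmed) second slot.
            order.reverse()
        for name, fetch in order:
            timings[name].append(time_call(fetch, value))
    return timings
```

Pass it two small functions that each take a product name and make the request one way or the other, plus a list of different product names. If the second call in each pair is consistently faster no matter which variant it is, that points at caching rather than at requests.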