Add an http.fetch() API

My idea is to add an http.fetch() function to the standard library (maybe also a fetch_async()).

This would offer some syntax sugar around urllib.request, very much in the vein of requests.

The reason to create a new function in the standard library is standardisation: it gives a single, well-known point where I can change the HTTP engine (in particular to customise authentication mechanisms, TLS, and proxies) by monkey-patching it. If I want to use urllib3, I can do so in sitecustomize, or I could install a package that does this via the .pth mechanism.
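
For concreteness, here is a minimal sketch of that sitecustomize trick, assuming a hypothetical http.fetch with a requests-ish signature (none of these names exist today):

```python
# sitecustomize.py -- a sketch only; http.fetch is assumed, not real.
# Swap the process-wide HTTP engine for urllib3, with the pool carrying
# the proxy/TLS configuration that buried library code can't reach.
import http

import urllib3

_pool = urllib3.PoolManager()  # could equally be a ProxyManager, custom CA, etc.

def _fetch(url, *, method="GET", headers=None, body=None):
    # Adapting urllib3's response object to whatever http.fetch would be
    # specified to return is glossed over here.
    return _pool.request(method, url, headers=headers, body=body)

http.fetch = _fetch
```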

The problematic behaviour is that applications and libraries that want to make HTTP requests generally do the simplest thing: they depend on requests and use it directly, deep within their code, putting it out of reach of customisation.

Better libraries offer flexibility by letting the user provide a class, perhaps subclassing some kind of Transport, with a default implementation backed by requests. Or they acknowledge that requests is a hard dependency but let you provide the requests.Session object that carries the configuration for making HTTP requests.
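
As a concrete illustration of the Session-injection pattern (ApiClient, its endpoint, and all URLs/paths below are made up for the example):

```python
import requests

class ApiClient:
    """A library client that lets the application inject its own Session."""

    def __init__(self, base_url, session=None):
        self.base_url = base_url
        self.session = session or requests.Session()

    def get_user(self, user_id):
        resp = self.session.get(f"{self.base_url}/users/{user_id}")
        resp.raise_for_status()
        return resp.json()

# The application, not the library, decides on proxies and TLS:
session = requests.Session()
session.proxies = {"https": "http://proxy.internal:3128"}
session.verify = "/etc/ssl/corp-ca.pem"
client = ApiClient("https://api.example.com", session=session)
```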

By exposing a simple HTTP API in Python, I would hope that a lot of packages could drop a hard dependency on requests and just call fetch().

Then if fetch() can be customised, I as an application developer get much more power to ensure that all my libraries are aligned on how they make HTTP requests. If I want them all to use httpx (maybe because it supports HTTP/2) then that becomes much easier, maybe as easy as installing a glue package fetch-httpx. Or I could indeed use requests but supply my own Session object configured for my network environment.
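
A hypothetical fetch-httpx glue package might amount to little more than this (http.fetch and fetch-httpx are both assumptions; only the httpx calls are real):

```python
# fetch_httpx.py -- sketch of an imagined glue package, activated via a
# .pth hook or an explicit call at application startup.
import http

import httpx

_client = httpx.Client(http2=True)  # HTTP/2; needs the httpx[http2] extra

def _fetch(url, *, method="GET", headers=None, content=None):
    return _client.request(method, url, headers=headers, content=content)

http.fetch = _fetch  # the same rebinding trick as the urllib3 sketch above
```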

Having a default HTTP client isn’t really a new idea - this is what we used to have with urllib.request.urlopen() and urllib.request.install_opener() (or indeed urllib2 in the pre-requests days). This would just lift it to a higher-level API with something closer to requests’ syntax sugar, which I think is what most people love about requests. (Even httpx copied it).
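
For reference, that existing lower-level mechanism still works today (these are real stdlib APIs; the proxy address is made up):

```python
import urllib.request

# install_opener() swaps the machinery behind urlopen() for the whole
# process -- here, routing every request through a proxy.
proxy = urllib.request.ProxyHandler({"https": "http://proxy.internal:3128"})
urllib.request.install_opener(urllib.request.build_opener(proxy))

# Every subsequent urlopen() call in the process now goes via the proxy.
with urllib.request.urlopen("https://example.com") as resp:
    print(resp.status)
```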


How would this be different from simply bringing requests into the stdlib, a proposal which has come up now and then (and has usually been knocked back to avoid locking requests to the stdlib's release schedule)?

This isn't proposing to add requests to the stdlib. The default implementation would be based on urllib.request. The point is that you can write code that only uses the stdlib, and then, if you do install requests (or a fetch-requests glue package, or similar), you transparently start using requests with no code changes.
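
A minimal sketch of what that stdlib-only default could look like (the signature and Response shape are assumptions, not a worked-out design):

```python
# Hypothetical default implementation of http.fetch, stdlib-only.
import urllib.request
from dataclasses import dataclass

@dataclass
class Response:
    status: int
    headers: dict
    body: bytes

def fetch(url, *, method="GET", headers=None, data=None):
    req = urllib.request.Request(url, data=data, headers=headers or {},
                                 method=method)
    with urllib.request.urlopen(req) as resp:
        return Response(resp.status, dict(resp.headers), resp.read())
```

A glue package would then just rebind http.fetch onto requests, httpx, or whatever else, as in the sketches above.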

Why isn't urllib.request.urlopen(url) sufficient? Once you've answered that question, that's the argument against your proposal - all the things you want http.fetch to do that urlopen doesn't cover would need to be implemented and supported in the stdlib. And why do that when production-quality 3rd party solutions already exist?

To put it another way, what’s the problem with depending on requests? Except in specialised situations, you likely depend on plenty of other 3rd party libraries, after all.

The problem isn't when I depend on requests, but when one of my dependencies depends on requests internally.

Then I face two problems:

  1. I can’t get essential configuration into it (proxies, auth, TLS, etc.)
  2. If my app wants to use httpx for its own reasons, my distribution will include both httpx and requests. It will be bigger, slower to import, and connections won't be pooled between the requests made via requests and those made via httpx.

I could monkey-patch requests, but its surface area is large and I'm not confident I could do it correctly. What I want is a small fetch() interface that I can easily replace.

I should caveat that I think requests is great. I'm not trying to avoid it as such, just to consolidate all use of it.

That’s a problem that you could raise with requests. It’s certainly not something the stdlib should be solving…

… and that's true of any library where there are multiple options. Again, adding something to the stdlib won't fix that - you'll still have your code using http.fetch, one dependency using http.async_fetch, another using requests, another using httpx, etc.

Basically, the stdlib isn’t a way to consolidate things. It’s a place that things which are already consolidated (i.e., there’s a single, acknowledged “best of breed” library on PyPI) can go, if they want to. But even then, projects often won’t want to be in the stdlib because the constraints involved aren’t acceptable to them.

Having said all of this, there is an argument for a simple http.fetch style API in the stdlib. But it would only be for basic use, and would emphatically not be configurable, working essentially “like your browser does” by using platform-native APIs. This has been discussed a few times, but it’s hard to do (in some ways, because it wouldn’t conform to the behaviour of the existing libraries which don’t use native APIs). And it wouldn’t eliminate use of libraries like requests, which are still important for cases where the extra configurability is needed.


There is honestly a much bigger chance of building a fetch-like API based on the OS's native HTTP libraries (or libcurl on Linux) than of wrapping something around urllib. That way it isn't tied to urllib's quirks, our own ssl module, etc.
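
For a sense of what "based on libcurl" means in practice, pycurl is an existing binding; a fetch-style wrapper over it would be built from calls like these (the URL is a placeholder):

```python
# Fetching a URL through libcurl via the existing pycurl binding; libcurl
# brings its own TLS and proxy handling rather than Python's ssl module.
import io

import pycurl

buf = io.BytesIO()
curl = pycurl.Curl()
curl.setopt(pycurl.URL, "https://example.com")
curl.setopt(pycurl.WRITEDATA, buf)        # collect the response body
curl.setopt(pycurl.FOLLOWLOCATION, True)  # follow redirects, browser-style
curl.perform()
status = curl.getinfo(pycurl.RESPONSE_CODE)
curl.close()

print(status, len(buf.getvalue()))
```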
