http.client Not Honoring Host Header - Possible Bug

I believe I’ve encountered a bug with setting the Host header in http.client. When I pass headers={'Host': 'python.org'} to a conn.request() call such as conn.request('GET', 'http://enjuvjj7xheyd.x.pipedream.net/a', headers={'Host': 'python.org'}), the “host” received by the server is listed as enjuvjj7xheyd.x.pipedream.net, whereas I would expect it to be listed as “python.org”.

The code snippet below reproduces the issue on Python 3.11.1 (from the Docker image python:latest, run as docker run -it --volume $(pwd):/var/tmp python:latest /bin/bash with the snippet mounted into the container).

#!/usr/bin/env python3

import http.client
# you will want to replace the hostname URL used below in order to test
conn = http.client.HTTPConnection('enjuvjj7xheyd.x.pipedream.net')
conn.set_debuglevel(1)
conn.request('GET', 'http://enjuvjj7xheyd.x.pipedream.net/a', headers={'Host': 'python.org'})
response = conn.getresponse()

If this is a bug, I’m happy to report it on the python/cpython issue tracker on GitHub; if it isn’t, I’m happy to do some further testing.

An HTTP/1.1 (and onward) request looks like:

 GET localpath HTTP/1.1
 Host: host
 ... other headers ...

RFC 2616 (Hypertext Transfer Protocol -- HTTP/1.1) says:

 14.23 Host

    The Host request-header field specifies the Internet host
    and port number of the resource being requested, as obtained
    from the original URI given by the user or referring resource
    (generally an HTTP URL, as described in section 3.2.2).

That says to me that a request should dissect your
http://enjuvjj7xheyd.x.pipedream.net/a into something like:

 GET /a HTTP/1.1
 Host: enjuvjj7xheyd.x.pipedream.net
 ... other headers ...

even if you supplied headers={'Host': 'python.org'}.
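To make the dissection above concrete, here is a minimal sketch using urllib.parse.urlsplit. It only illustrates splitting the URL into a host part and a local path; it is not a claim about what http.client does internally, and the request_head name is just for illustration.

```python
from urllib.parse import urlsplit

url = "http://enjuvjj7xheyd.x.pipedream.net/a"
parts = urlsplit(url)

# The network location becomes the Host: value, the path becomes the
# request target. An empty path is conventionally sent as "/".
request_head = (
    f"GET {parts.path or '/'} HTTP/1.1\r\n"
    f"Host: {parts.netloc}\r\n"
)
print(request_head)
```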

What you might be able to do is prepare the request, and then modify
it with an amended Host: header. In such a scenario you might then
connect to enjuvjj7xheyd.x.pipedream.net and request /a with a
Host: header of python.org.
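One way to try the "connect to A, ask for /a, say Host: B" scenario without touching the network is sketched below. It assumes that http.client sends a caller-supplied Host header unchanged when the request target is a plain path rather than a full URL. The EchoHandler class and fetch_with_host function are hypothetical test helpers, and a throwaway server on 127.0.0.1 stands in for enjuvjj7xheyd.x.pipedream.net.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler: reports the request line and Host it saw."""

    def do_GET(self):
        body = f"{self.requestline}\nHost: {self.headers.get('Host')}\n".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch_with_host(port, path, host_header):
    """Connect to the local server, request `path`, and send our own Host."""
    conn = http.client.HTTPConnection("127.0.0.1", port)
    try:
        conn.request("GET", path, headers={"Host": host_header})
        return conn.getresponse().read().decode()
    finally:
        conn.close()


# Port 0 asks the OS for any free port.
server = HTTPServer(("127.0.0.1", 0), EchoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    seen = fetch_with_host(server.server_address[1], "/a", "python.org")
finally:
    server.shutdown()
    server.server_close()

print(seen)
```

If the assumption holds, the server should report a request line of GET /a HTTP/1.1 together with Host: python.org. The only change from the original snippet is that the request target is the local path /a instead of the full URL.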

But consider what happens when you fetch such a thing through a proxy,
not using a CONNECT call but a direct GET call. The proxy decides
the target host from the Host: header because there’s no host part in
the GET line. There’s no way to express “connect to A but use host B”.

When I do things like this, I usually use a plain old http://B/a URL
(or, of course, https) and use A:80 (or, of course, 443) as the proxy
setting. That does the “ask for host B by connecting to A”.
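The "ask for host B by connecting to A" trick can be sketched with http.client alone, roughly the way a client talks to a plain HTTP proxy: connect to A, pass the full http://B/a URL as the request target, and supply no Host header yourself. This assumes http.client derives the Host header from a request target that starts with "http". ProxyEchoHandler is a hypothetical stand-in for A, running on 127.0.0.1.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class ProxyEchoHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for host A: reports what the client sent."""

    def do_GET(self):
        body = f"{self.requestline}\nHost: {self.headers.get('Host')}\n".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


# Port 0 asks the OS for any free port.
server = HTTPServer(("127.0.0.1", 0), ProxyEchoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    # Connect to "A" (the local server) but name "B" in the URL.
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    try:
        conn.request("GET", "http://python.org/a")  # no explicit Host header
        seen = conn.getresponse().read().decode()
    finally:
        conn.close()
finally:
    server.shutdown()
    server.server_close()

print(seen)
```

Here the TCP connection goes to A, while both the request line and the Host header name B, so there is no disagreement between the two for the server to resolve.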

I would say what you’ve discovered is surprising but not wrong. Your
parameters are ambiguous (the URL names one host and the Host: header
names another), and I’d argue the library resolves that ambiguity in a
defensible way.

You may get finer control by separating preparation of the request from
making the request as suggested above.

Cheers,
Cameron Simpson cs@cskk.id.au