I’m using code like this to make an HTTPS request to a server:
r = urllib.request.urlopen(url, data=req1msg.encode(), context=ssl_context)
I need to be able to retrieve the peer TLS certificate from the server referenced by url. I’ve found no way to access the “transport” like I’ve done with websockets or other libraries. And I’ve found no way to wrap the socket created from (r.fileno()) with ssl_context.wrap_socket, such that I could gain access to the peer certificate that way.
When I was using websockets, I could do this:
transport = websocket.reader._transport
sslsock = transport.get_extra_info('ssl_object')
cert = sslsock.getpeercert()
But how do I do this with urllib? I’m using Python 3.
Though I’m kind of curious: what do you need from the cert at this level? Usually people use a special cert/key if needed to verify the remote. Beyond verification, it’s rarer that folks look at the peer cert; hence it’s kind of abstracted away here.
This is just as internal as what you were looking at earlier, so it’s not really much better.
The most straightforward way to do this is, as Charles mentioned, directly using the socket and ssl modules to connect. Generally, an HTTPS request follows these steps:
1. Find the computer you’re talking to (usually a DNS lookup).
2. Establish a socket connection (TCP, same as the socket module).
3. Negotiate TLS. This includes notifying the computer what name you intended to talk to (not just the IP address), which can mean that you get different certs depending on the name you used, even if they resolve to the same IP address.
4. Send the actual HTTP request.
To get the server’s certificate, you need to get as far as step 3, but step 4 isn’t necessary. So unless there’s some specific reason for needing to get the exact certificate used for the actual request, it would be easiest to simply recreate those first three steps and then query the certificate directly.
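For illustration, the first three steps can be sketched with just the socket and ssl modules. This is a minimal sketch, not a definitive implementation: the function names fetch_peer_cert and subject_common_name, the 10-second timeout, and the default port are my own assumptions, and the cert comes back in the dict form that getpeercert() returns when verification is enabled.

```python
import socket
import ssl


def fetch_peer_cert(host, port=443, timeout=10.0):
    """Connect to host, complete the TLS handshake, and return the cert dict.

    No HTTP request is sent. (Hypothetical helper name.)
    """
    # Default context: verifies the certificate chain and the hostname.
    context = ssl.create_default_context()
    # Steps 1 and 2: DNS lookup and TCP connection.
    with socket.create_connection((host, port), timeout=timeout) as tcp_sock:
        # Step 3: TLS handshake; server_hostname is also sent as SNI.
        with context.wrap_socket(tcp_sock, server_hostname=host) as tls_sock:
            return tls_sock.getpeercert()


def subject_common_name(cert):
    """Return the subject commonName from a getpeercert() dict, or None."""
    # "subject" is a tuple of RDNs, each a tuple of (name, value) pairs.
    for rdn in cert.get("subject", ()):
        for name, value in rdn:
            if name == "commonName":
                return value
    return None
```

Calling fetch_peer_cert("example.org") would open a connection, return the certificate dict, and close the socket again, so on its own this gives you the cert from a separate handshake rather than the one used by a later urllib request.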
I had wanted to use urllib. Is it possible to use a combination of this method and urllib?
This really seems like something urllib should support.
And yes, it is imperative to be certain that the server certificate used in the TLS handshake was the one retrieved, preferably after the urllib connection is established.
Well, what you’re doing seems to be fairly unusual, so… I’d say go ahead and reach into the private attribute. Just be aware that this could break at any time, so you will have to keep an eye out for that.
In a secure application, an initiator of a TLS (HTTPS) connection may need to examine subject attributes and certificate policies to ensure the recipient possesses some security attributes.
It is the reverse of what an mTLS server may need to do to authenticate a client. It may be that simply having had the TLS layer validate the client’s cert is insufficient — further checks on the provided cert may be required. The same is true for the client — especially in a network where all nodes are both clients and servers.
Agreed, but if you need to check these things, you should be doing it BEFORE sending the HTTP request, not after. I don’t think it’s appropriate under normal circumstances to let the library do the full request, potentially including sending your information to the wrong server, and only then check the certificate.
So if you need full security, you need to be using the lower level tools and then building on top of them; but if all you need is an easy way to say “hey, what was the certificate?”, then grabbing the private attribute doesn’t seem all that wrong to me (since this is a relatively unusual situation).
So you’re suggesting to open a TLS connection to the server, do the cert checking, and then issue the HTTP request over the TLS connection?
That does make sense to me. I just don’t want to make two TCP/TLS connections and perform two TLS handshakes. So how can I leverage a higher-level HTTP library/interface over an existing HTTPS connection acquired as you suggested?
Can I somehow hand the connection to urllib to send the request and receive a response (without hand coding all the HTTP protocol)?
For complete security? Yes. That’s what normally happens, and if you need to do additional checks on the cert, the correct place to do it is the same point - prior to the HTTP request.
Right, one connection and then use that for your HTTP request.
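One way this might look with only the standard library is http.client rather than urllib, since HTTPSConnection lets you call connect() yourself before sending anything. Treat the following as a sketch under assumptions: the names post_after_cert_check, cert_has_dns_name and CertificatePolicyError are made up for the example, and reading conn.sock after connect() relies on an attribute that is not underscore-private but is not really documented either.

```python
import http.client
import ssl


class CertificatePolicyError(Exception):
    """Raised when the peer cert fails the application's own checks (hypothetical)."""


def cert_has_dns_name(cert, dns_name):
    """Example policy check: is dns_name listed in the cert's subjectAltName?"""
    # subjectAltName is a tuple of (type, value) pairs, e.g. ("DNS", "example.org").
    return ("DNS", dns_name) in cert.get("subjectAltName", ())


def post_after_cert_check(host, path, body, check_cert, port=443):
    """Handshake, inspect the cert, and only then send the POST on the same socket."""
    context = ssl.create_default_context()
    conn = http.client.HTTPSConnection(host, port, context=context, timeout=10)
    try:
        conn.connect()  # TCP connect plus TLS handshake; no HTTP sent yet
        cert = conn.sock.getpeercert()  # cert from this exact handshake
        if not check_cert(cert):
            raise CertificatePolicyError("peer certificate rejected")
        conn.request("POST", path, body=body)  # reuses the open connection
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```

With this shape there is a single TCP connection and a single handshake, and the request body never leaves the machine if check_cert returns False.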
You may actually be able to implement this using any of a number of libraries, simply by providing a different SSL context. I haven’t actually done this, but there are a number of hooks in the ssl.SSLContext class that may be of use here. Most HTTPS libraries will allow you to select the SSL context (the default urllib certainly does), and then you could in fact interact with the handshake in exactly this way.
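To make the "different SSL context" idea a bit more concrete, here is one possible shape, which I have not battle-tested: subclass ssl.SSLContext and override wrap_socket so an extra check runs as soon as the handshake finishes. CheckingSSLContext, cert_check, make_checking_context and CertificatePolicyError are all names invented for this sketch, and it assumes the HTTP library calls wrap_socket on the context you give it with the default handshake-on-connect behaviour (which is what http.client, and therefore urllib, appears to do).

```python
import ssl


class CertificatePolicyError(Exception):
    """Raised when the peer cert fails the application's own checks (hypothetical)."""


class CheckingSSLContext(ssl.SSLContext):
    """SSLContext that runs an extra check right after each handshake."""

    cert_check = None  # callable taking the getpeercert() dict, returning bool

    def wrap_socket(self, *args, **kwargs):
        # With the default do_handshake_on_connect=True and a connected
        # blocking socket, the handshake completes inside this call.
        tls_sock = super().wrap_socket(*args, **kwargs)
        check = self.cert_check
        if check is not None and not check(tls_sock.getpeercert()):
            tls_sock.close()
            raise CertificatePolicyError("peer certificate rejected")
        return tls_sock


def make_checking_context(cert_check):
    """Build a verifying client context that also applies cert_check."""
    # PROTOCOL_TLS_CLIENT turns on CERT_REQUIRED and hostname checking.
    context = CheckingSSLContext(ssl.PROTOCOL_TLS_CLIENT)
    context.load_default_certs()  # trust the system CA store
    context.cert_check = cert_check
    return context
```

The context would then be passed in the same way as in the original call, for example urllib.request.urlopen(url, data=req1msg.encode(), context=make_checking_context(my_check)), where my_check is whatever policy function you write. If the check fails, the exception is raised before any HTTP data is sent.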
I can’t help you further than theory though, as I’ve never done this particular task.