Urllib traceback error

I am learning Python through a MOOC and am very new to the whole thing. This week's lesson was about scraping data from web sources. I cannot get even the most basic thing to work: the code below produces a long traceback. Can anyone help me understand what the error is?

import urllib.request, urllib.parse, urllib.error

x = urllib.request.urlopen('https://py4e-data.dr-chuck.net/comments_42.html').read()

Here is the traceback I get when running this from Terminal:
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1348, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1282, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1328, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1277, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1037, in _send_output
    self.send(msg)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 975, in send
    self.connect()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1454, in connect
    self.sock = self._context.wrap_socket(self.sock,
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 513, in wrap_socket
    return self.sslsocket_class._create(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 1071, in _create
    self.do_handshake()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 1342, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/ser41/Library/CloudStorage/OneDrive-ThePennsylvaniaStateUniversity/Data Science/Python Michigan/Homework Assignments/Exercise 12.py", line 4, in <module>
    with urllib.request.urlopen('Welcome to the comments assignment from www.py4e.com') as response:
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 519, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1391, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1351, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)>
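
For anyone reading the two tracebacks above: they describe a single failure. The first is the underlying ssl.SSLCertVerificationError, and the second is the urllib.error.URLError that urllib raises to wrap it. The wrapped exception is available as the URLError's reason attribute. Here is a minimal sketch of catching it and printing a shorter message; the helper names explain_url_error and try_fetch are made up for illustration and are not part of the course code.

```python
import ssl
import urllib.error
import urllib.request


def explain_url_error(err):
    """Return a short, human-readable description of a URLError."""
    reason = err.reason  # the underlying exception, or sometimes a plain string
    if isinstance(reason, ssl.SSLCertVerificationError):
        return "certificate verification failed: " + str(reason)
    return "other URL error: " + str(reason)


def try_fetch(url):
    """Return the page bytes, or None after printing what went wrong."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.read()
    except urllib.error.URLError as err:
        print(explain_url_error(err))
        return None
```

Calling try_fetch with the URL from the question would then print one line instead of the full traceback if the certificate check fails.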

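It looks like Python could not verify the server's SSL certificate. "unable to get local issuer certificate" means Python could not find a trusted root (CA) certificate on your machine to check the site's certificate against, so it refuses the connection as potentially unsafe. To get the response, you would have to make a set of trusted CA certificates available to Python.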
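
Here is a rough sketch of what supplying certificates can look like with only the standard library. It assumes you have a PEM file of trusted CA certificates somewhere on disk; the function names and the cafile parameter are illustrative, and this is not necessarily what the course intends. As far as I know, the python.org installer for macOS also ships an "Install Certificates.command" script in the Python folder under Applications, and running it once is often enough on its own.

```python
import ssl
import urllib.request


def make_verifying_context(cafile=None):
    """Build an SSL context that verifies the server's certificate.

    cafile is an optional path to a PEM file of trusted CA certificates
    (an assumption for this sketch). When it is None, Python falls back
    to the default trust locations for the system.
    """
    return ssl.create_default_context(cafile=cafile)


def fetch(url, cafile=None):
    """Open url over HTTPS using the verifying context and return the bytes."""
    context = make_verifying_context(cafile)
    with urllib.request.urlopen(url, context=context) as response:
        return response.read()
```

With a valid CA bundle in place, fetch('https://py4e-data.dr-chuck.net/comments_42.html') should return the page bytes rather than raising the error above. Certificate checking stays switched on, which is the safer choice compared with disabling verification.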

Oh wow, this wasn’t covered in our class – do you have a link on how to do that?

Also, if I remove the s (which I assume has something to do with security), there is still a webpage with the info I need – if I do this, do I still need a certificate?

Thanks so much for your help!

No worries.

Removing the s usually won't help: a site that uses HTTPS often doesn't serve the same URL over plain HTTP.

And yeah, this link might be able to help you out.


In this case, fortunately, there is a non-secured website with the info I need – but this is super helpful since I am new to everything!
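
For reference, the "remove the s" workaround can be sketched as below. It only works when the server really does serve the same page over plain HTTP, which seems to be the case here but is not guaranteed in general, and plain HTTP has no encryption or certificate check at all. The helper name to_plain_http is made up for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def to_plain_http(url):
    """Return url with an https scheme swapped for http; other URLs are unchanged."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return url
    return urlunsplit(parts._replace(scheme="http"))


print(to_plain_http("https://py4e-data.dr-chuck.net/comments_42.html"))
# prints: http://py4e-data.dr-chuck.net/comments_42.html
```

The resulting string can be passed to urllib.request.urlopen as usual; because no TLS handshake takes place, the certificate error cannot occur.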

Yeah, that sounds great. Have fun!