Hi! I need to write a program that uses urllib to read the HTML from a file, parses the data, extracts the numbers, and computes their sum. Specifically, I have to find all the span tags in the file, pull the number out of each tag, and add the numbers up.
I’ve written this:
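import ssl
from urllib.request import urlopen
from bs4 import BeautifulSoup

# ctx was used but never defined above; this assumes the usual setup that skips certificate checks
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE

url = input('Enter - ')
html = urlopen(url, context=ctx).read()
soup = BeautifulSoup(html, "html.parser")

# Find every span tag and add up the numbers they contain
total = 0  # named total so the built-in sum() is not shadowed
for tag in soup('span'):
    total += int(tag.contents[0])
print(total)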
But when I run the program, I get this error:
Traceback (most recent call last):
  File "C:\Users\Izan\Documents\folder2\html2.py", line 6, in <module>
    from bs4 import BeautifulSoup
  File "C:\Users\Izan\Documents\folder2\bs4\__init__.py", line 30, in <module>
    from .builder import builder_registry, ParserRejectedMarkup
  File "C:\Users\Izan\Documents\folder2\bs4\builder\__init__.py", line 4, in <module>
    from bs4.element import (
  File "C:\Users\Izan\Documents\folder2\bs4\element.py", line 8, in <module>
    from bs4.dammit import EntitySubstitution
  File "C:\Users\Izan\Documents\folder2\bs4\dammit.py", line 13, in <module>
    from html.entities import codepoint2name
ModuleNotFoundError: No module named 'html.entities'; 'html' is not a package
Any ideas?
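The last line of the traceback says 'html' is not a package. That may mean a file named html.py somewhere on the import path (possibly in folder2) is being picked up instead of the standard library's html package, though I have not confirmed this. Below is a minimal sketch for checking where Python resolves the name from. describe_module is a hypothetical helper name of my own, not part of any library, and it should be run from the same folder as html2.py.

```python
import importlib.util


def describe_module(name):
    """Return (origin, is_package) for a top-level module name, or None if it is not found.

    describe_module is a hypothetical helper used only for this check.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # Packages have a list of submodule search locations; plain modules have None
    is_package = spec.submodule_search_locations is not None
    return spec.origin, is_package


if __name__ == "__main__":
    # Shows the file that "import html" would load and whether it is a package
    print(describe_module("html"))
```

If the printed path points into folder2 rather than into the Python installation, or the second value is False, then a local file is probably shadowing the standard module, and renaming it would be the thing to try.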