I have tried a million code snippets and none of them work for scraping this website and downloading and unzipping the CSV zip files. Any help or direction would be appreciated!
https://www.ercot.com/mp/data-products/data-product-details?id=NP4-188-CD
How did you get the code? Did you understand it?
How did you try to use it?
Exactly what happened when you tried using it? "It doesn't work" does not describe a problem.
That is many separate things to do, that have nothing to do with each other. So, which part failed?
I switched and used this code
import requests, zipfile
from io import BytesIO

print("Downloading started")
# Defining the zip file URL (2022)
url = "https://www.ercot.com/misdownload/servlets/mirDownload?doclookupId=886625668"
filename = url.split("/")[-1]  # not used below
req = requests.get(url)
print("Downloading completed")
# Use a different variable name so the zipfile module is not shadowed
archive = zipfile.ZipFile(BytesIO(req.content))
archive.extractall("J:/Taylor/ERCOT-LNG-ABHI/Capacity Clearing Prices/2022")
It worked. I read the tutorial, so I understand what each part of the code is trying to do.
I'm now trying to change the code so I can download multiple zip files, rather than running it once for every file address.
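Once the download URLs are known, the multi-file part can be a plain loop around the working script. Here is a minimal sketch, not a definitive implementation. The function name `download_and_extract` and the `fetch` parameter are made up for illustration; `fetch` is only there so the loop can be tried without network access.

```python
import zipfile
from io import BytesIO
from pathlib import Path


def download_and_extract(urls, dest_dir, fetch=None):
    """Download every zip in `urls` and extract it into `dest_dir`.

    `fetch` is a hypothetical hook (url -> bytes). If it is not given,
    the function falls back to requests.get, as in the script above.
    Returns the names of all extracted archive members.
    """
    if fetch is None:
        import requests  # imported lazily so the helper also runs offline

        def fetch(url):
            resp = requests.get(url, timeout=60)
            resp.raise_for_status()  # fail loudly on HTTP errors
            return resp.content

    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    extracted = []
    for url in urls:
        # Open the downloaded bytes as an in-memory zip archive
        with zipfile.ZipFile(BytesIO(fetch(url))) as archive:
            archive.extractall(dest)
            extracted.extend(archive.namelist())
    return extracted
```

Calling it with a list that contains the 2022 URL from the script above, plus the same target folder, should do what that script did. Any further URLs still have to be collected some other way.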
I tried to run this script to scrape the file links, but the file links didn't show up, just the other links on the page:
import requests
from bs4 import BeautifulSoup

url = 'https://www.geeksforgeeks.org/'
reqs = requests.get(url)
soup = BeautifulSoup(reqs.text, 'html.parser')

urls = []
for link in soup.find_all('a'):
    print(link.get('href'))
I assume that you are now referring to the original url with the zips.
requests.get just downloads the static HTML page, so it doesn't download what is generated client-side in a browser by running various JS scripts. So the returned reqs.text is not the same as what you see in a browser or web inspector; none of the table content is there, for instance. If you want to do this in Python, you need other tools than just BeautifulSoup. You basically need something that either acts as a full-fledged browser or is able to interact with your current browser. Apparently Selenium can do this. I cannot really help you further with that, however, since I have never worked with that package.
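A quick way to see the difference is to check whether any download links appear in a given piece of HTML. The sketch below makes two assumptions. First, that the zip links on the rendered page follow the same `mirDownload` pattern as the URL used earlier in the thread. Second, that the rendered HTML comes from somewhere else, for example Selenium's `driver.page_source`. It uses the standard library's `html.parser` instead of BeautifulSoup so it runs without extra installs. `LinkCollector` and `find_download_links` are made-up names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def find_download_links(html, base_url, marker="mirDownload"):
    """Return absolute URLs of links whose href contains `marker`.

    `marker` is an assumption based on the download URL seen earlier.
    """
    parser = LinkCollector()
    parser.feed(html)
    return [urljoin(base_url, h) for h in parser.hrefs if marker in h]
```

If this returns an empty list for `reqs.text` but a non-empty one for the HTML a real browser ends up with, that confirms the links are only added client-side.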
Thank you! I will look into Selenium
I will try R as well
That would be "or" not "r".
Can be done though: