Trying to scrape and download zipfiles

I have tried a lot of different code and none of it works to scrape this website, download the zip files, and unzip the CSVs inside them. Any help or direction would be appreciated!

https://www.ercot.com/mp/data-products/data-product-details?id=NP4-188-CD

How did you get the code? Did you understand it?

How did you try to use it?

Exactly what happened when you tried using it? “It doesn’t work” does not describe a problem.

Those are several separate tasks that have nothing to do with each other. So, which part failed?

I switched and used this code

# importing necessary modules
import requests, zipfile
from io import BytesIO

print('Downloading started')

# Defining the zip file URL 2022
url = 'https://www.ercot.com/misdownload/servlets/mirDownload?doclookupId=886625668'

# Split URL to get the file name
filename = url.split('/')[-1]

# Downloading the file by sending the request to the URL
req = requests.get(url)
print('Downloading Completed')

# extracting the zip file contents
# (named zip_file so the zipfile module itself is not overwritten)
zip_file = zipfile.ZipFile(BytesIO(req.content))
zip_file.extractall('J:/Taylor/ERCOT-LNG-ABHI/Capacity Clearing Prices/2022')

It worked. I read the tutorial, so I understand what each line of the code is trying to do.

I’m now trying to run the code so I can download multiple zipfiles rather than running it for every file address
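One way to avoid re-running the script per address is to wrap the download-and-extract step in a function and loop over a list of URLs. The following is only a minimal sketch: the function names `fetch_with_requests` and `download_and_extract` and the `fetch` parameter are made up for illustration, and it assumes you already have the list of zip URLs (getting that list from the page is a separate problem).

```python
import zipfile
from io import BytesIO
from pathlib import Path


def fetch_with_requests(url):
    """Download one URL and return the raw bytes of the response body."""
    import requests  # imported here so the rest of the sketch loads without it

    response = requests.get(url, timeout=60)
    response.raise_for_status()  # fail loudly instead of unzipping an error page
    return response.content


def download_and_extract(urls, target_dir, fetch=fetch_with_requests):
    """Download every zip in `urls` and extract it into `target_dir`.

    `fetch` is a hypothetical hook: any callable that maps a URL to bytes.
    Returns the names of all extracted archive members.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    extracted = []
    for url in urls:
        payload = fetch(url)
        with zipfile.ZipFile(BytesIO(payload)) as archive:
            archive.extractall(target)
            extracted.extend(archive.namelist())
    return extracted


# Example call, using the one URL and folder from the script above:
# zip_urls = ['https://www.ercot.com/misdownload/servlets/mirDownload?doclookupId=886625668']
# download_and_extract(zip_urls, 'J:/Taylor/ERCOT-LNG-ABHI/Capacity Clearing Prices/2022')
```

Passing the download step in as `fetch` is just a convenience so the loop can be tried out without hitting the network.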

I tried this script to scrape the file links, but the file links didn’t show up in the output, only the other links on the page.

import requests
from bs4 import BeautifulSoup

url = 'https://www.geeksforgeeks.org/'
reqs = requests.get(url)
soup = BeautifulSoup(reqs.text, 'html.parser')

urls = []
for link in soup.find_all('a'):
    print(link.get('href'))

I assume that you are now referring to the original url with the zips.

requests.get just downloads the static HTML page, so it doesn’t include anything that is generated client-side in a browser by running the page’s JavaScript. The returned reqs.text is therefore not the same as what you see in a browser or web inspector; none of the table content is there, for instance. If you want to do this in Python, you need other tools than just BeautifulSoup: basically something that either acts as a full-fledged browser or is able to drive your current browser. Apparently Selenium can do this. I cannot really help you further with that, however, since I have never worked with that package.
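For what it is worth, a rough and untested-against-this-site sketch of the Selenium route might look like the following. It assumes Selenium 4 with a local Chrome install, and it assumes the rendered table exposes the files as ordinary `<a>` tags whose `href` contains `mirDownload` (guessed from the single zip URL posted above; the real page may differ). The names `DOWNLOAD_MARKER`, `keep_download_links` and `collect_hrefs_with_selenium` are made up for illustration.

```python
DOWNLOAD_MARKER = "mirDownload"  # assumption: taken from the one zip URL in this thread


def keep_download_links(hrefs, marker=DOWNLOAD_MARKER):
    """Keep only hrefs that contain `marker`, dropping None values and duplicates."""
    kept = []
    for href in hrefs:
        if href and marker in href and href not in kept:
            kept.append(href)
    return kept


def collect_hrefs_with_selenium(page_url, wait_seconds=20):
    """Open the page in a real browser and return every href after JS has run."""
    # Imported inside the function so the helper above works without Selenium.
    from selenium import webdriver
    from selenium.common.exceptions import StaleElementReferenceException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    def all_hrefs(drv):
        return [a.get_attribute("href") for a in drv.find_elements(By.TAG_NAME, "a")]

    driver = webdriver.Chrome()
    try:
        driver.get(page_url)
        # Poll until at least one download-looking link exists; raises
        # TimeoutException if none appears (i.e. the assumption was wrong).
        wait = WebDriverWait(
            driver, wait_seconds, ignored_exceptions=(StaleElementReferenceException,)
        )
        wait.until(lambda drv: keep_download_links(all_hrefs(drv)))
        return all_hrefs(driver)
    finally:
        driver.quit()


# Example call with the page from the first post:
# page = 'https://www.ercot.com/mp/data-products/data-product-details?id=NP4-188-CD'
# zip_urls = keep_download_links(collect_hrefs_with_selenium(page))
```

The filtering is kept separate from the browser part so it can be checked on a plain list of strings before any browser is involved.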


Thank you! I will look into Selenium

I will try R as well

That would be “or” not “r”. :smile:

Can be done though:
