Download stock data

fpallott · February 27, 2024, 10:12am

Hello, I’m Fabrizio, nice to meet you all. This is my first request on this forum and I hope I don’t make any mistakes.

Here is my problem: I want to download data for some stocks from the Italian stock exchange and create a file with the data of all the selected stocks.

titoli = ['A2A.MI', 'AMP.MI', 'AZM.MI', 'BGN.MI', 'BMED.MI']  # the list of stocks
for titolo in titoli:
data = yf.download(titolo, period='10d')  # download the data for every stock in the list from Yahoo Finance
data.to_csv("titoli_old.csv")  # export all the data for all the stocks to a single file
# rearrange the columns of the CSV file, keeping only the ones I care about, and delete the original CSV
df = pd.read_csv("titoli_old.csv")
df = df[['Date', 'Open', 'Close', 'High', 'Low', 'Volume']]  # the columns I want
df.to_csv('titoli.csv', index=False)  # create the new file containing only the columns I care about
os.remove("titoli_old.csv")  # delete the old CSV file

Sorry, the variable and file names are in Italian, but Python is universal :)))

[100%] 1 of 1 completed
[100%] 1 of 1 completed
[100%] 1 of 1 completed
[100%] 1 of 1 completed
[100%] 1 of 1 completed

As you can see above, all five downloads seem to complete, but I see only the data of the last stock (BMED.MI), shown below, and the file created contains only the BMED.MI data.

         Date   Open  Close   High    Low   Volume
0  2024-02-14  9.600  9.826  9.846  9.566  1158432
1  2024-02-15  9.850  9.746  9.858  9.660   929877

Where is my mistake and what can I do to solve it?

Thank you in advance for your help,
Fabrizio

kknechtel · February 27, 2024, 4:16pm

Please read the pinned thread to see how to format code for the forum properly, so that we can see the indentation of the code properly and understand its structure.

That said, I guess that you are calling DataFrame.to_csv multiple times for the same file. This will not work: by default it overwrites the file and does not care about what is already in it.

I’m not sure why you would want to put data about multiple stocks in the same file. It won’t make sense if you have the rows of data for A2A.MI and then new rows of data for AMP.MI etc. - that’s not how a CSV file works. You won’t know where the data ends for one stock and begins for the next stock, and would have to search through the data for a new header or something like that.

If you really do want one file, you can tell DataFrame.to_csv to append instead of overwriting, using mode='a' (the same way that you would if you were opening a file directly yourself). If you don’t want the extra headers, use header=False - I guess you want the header every time except the first (which is also a little tricky).
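As a rough sketch of the append approach, assuming small made-up frames in place of the real downloads: the first write uses mode="w" with a header, later writes use mode="a" without one. The tickers AAA.MI and BBB.MI, the numbers, and the extra Ticker column are all invented for illustration; the Ticker column is only one possible way to tell where one stock ends and the next begins.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Made-up rows standing in for two downloads; not real market data.
frames = {
    "AAA.MI": pd.DataFrame({"Date": ["2024-02-14", "2024-02-15"], "Close": [1.0, 1.1]}),
    "BBB.MI": pd.DataFrame({"Date": ["2024-02-14", "2024-02-15"], "Close": [2.0, 2.1]}),
}

# Write into a temporary folder so the sketch does not touch real files.
out_path = Path(tempfile.mkdtemp()) / "titoli.csv"

for position, (ticker, frame) in enumerate(frames.items()):
    first = position == 0
    frame.assign(Ticker=ticker).to_csv(
        out_path,
        mode="w" if first else "a",  # start a fresh file, then append to it
        header=first,                # write the column names only once
        index=False,
    )

combined = pd.read_csv(out_path)
print(combined)
```

Tracking "is this the first write?" by hand is the tricky part mentioned above; it is easy to get wrong if the loop is ever rerun against an existing file.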

Unless your data is really huge (some GB or so), you will be better off if you use Pandas to append (and filter) all the data in memory first, and only write one file after you have all the data ready. This will also be easier, if you want the stock data side by side (a new group of columns for each stock). (I say “easier”, but the other way is so difficult that I wouldn’t bother to try.) You can use pandas.concat to combine the rows, either way (in separate columns, or in more rows in the same columns).
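A minimal sketch of the in-memory approach, again with invented tickers and numbers rather than real downloads. It shows both layouts that pandas.concat can produce from a dict of frames: more rows in the same columns, or a new group of columns per stock.

```python
import pandas as pd

# Made-up data standing in for two downloaded frames indexed by date.
dates = pd.DatetimeIndex(["2024-02-14", "2024-02-15"], name="Date")
frames = {
    "AAA.MI": pd.DataFrame({"Open": [1.0, 1.1], "Close": [1.05, 1.15]}, index=dates),
    "BBB.MI": pd.DataFrame({"Open": [2.0, 2.1], "Close": [2.05, 2.15]}, index=dates),
}

# More rows, same columns: the dict keys become an outer row-index level.
long_format = pd.concat(frames, names=["Ticker", "Date"])

# Side by side: the dict keys become an outer column level.
wide_format = pd.concat(frames, axis=1)

print(long_format)
print(wide_format)
```

Either result can then be written once with a single to_csv call after all the data is collected, which avoids the append-and-header bookkeeping entirely.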

onePythonUser · February 27, 2024, 5:12pm

[Image: format_code]

kknechtel · February 27, 2024, 5:54pm

It works with ` symbols too (no need to press shift). But maybe we should have an image like that in the pinned thread…

onePythonUser · February 27, 2024, 6:06pm

Thank you.

I was unaware there was this other method.

Much obliged.

fpallott · February 27, 2024, 9:00pm

Thank you very much for your help.
Actually, I want to create a file for each stock. I didn’t write that right away because I wanted to solve the download problem first.
Sorry, but I’m new to the Python world and will make mistakes many times.

kknechtel · February 27, 2024, 9:18pm

Ah. Well, that’s exactly it; there isn’t a download problem, there’s a problem because it doesn’t make different files.

fpallott · February 27, 2024, 9:35pm

Yes… :(( I don’t know how to do it

kknechtel · February 27, 2024, 10:37pm

Every file name can only name a single file - just like the variables in your program.
So, if you want to make a different file each time, you have to choose a different name each time. So you should come up with a rule that tells you a file name to use, based on the stock name; use some code to create the string for the file name, and use it to save the file.
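One possible rule, sketched under the assumption that the ticker itself is an acceptable file name: append ".csv" to the ticker. The helper name csv_name_for is made up for this example.

```python
from pathlib import Path


def csv_name_for(ticker: str, folder: str = ".") -> Path:
    """Build a per-stock CSV path, e.g. 'A2A.MI' -> 'A2A.MI.csv'."""
    return Path(folder) / f"{ticker}.csv"


tickers = ["A2A.MI", "AMP.MI", "AZM.MI", "BGN.MI", "BMED.MI"]
file_names = [csv_name_for(ticker).name for ticker in tickers]
print(file_names)
```

The name is built with an f-string rather than Path.with_suffix, because with_suffix would treat ".MI" as the existing extension and replace it, turning A2A.MI into A2A.csv.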

fpallott · February 27, 2024, 11:42pm

ok, I’ll try, thank you very much

fpallott · February 28, 2024, 1:06am

Hello.
I wrote the following code and it created as many files as there are stocks in the list. However, they all contain the same data, i.e. that of the last stock in the list (ANIM.MI).
How can I solve it?
Thank you!

import pandas as pd
from pandas_datareader import data as web #I assume this is what you have
from datetime import datetime

tickers = ["A2A.MI", "AMP.MI", "ANIM.MI"]

for stock in tickers:
    stock_data = yf.download(titolo, period='20d')
    df = pd.DataFrame(stock_data)
    df.to_csv(stock + ".csv", index=False)

kknechtel · February 28, 2024, 1:49am

Think carefully about where titolo is coming from.

fpallott · February 28, 2024, 9:32am

Good morning.

I modified it as follows.

titoli = ['A2A.MI', 'AMP.MI', 'AZM.MI', 'BGN.MI', 'BMED.MI']  # the list of stocks
for titolo in titoli:
    data = yf.download(titolo, period='20d') 
    df.to_csv(titolo + "2.csv", mode="a")

I have one file per stock, but all the different files contain the data of A2A.MI

Thank you

Rolando · February 28, 2024, 9:53am

As Karl mentioned you need to look carefully at what you are writing. In the loop you read your data for each titolo into a variable called data. But you never use this variable. I would guess that maybe you have some data in your data frame df which never changes. That is the reason that each file has the same data. This should be enough information for you to fix the code.
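The effect can be reproduced without any download at all. In this sketch, fake_download is a made-up stand-in that just records which ticker it was asked for, and plain dicts stand in for the files; the first loop saves the stale df every time, the second saves the data it just fetched.

```python
# Made-up stand-in for a download: remembers which ticker was requested.
def fake_download(ticker):
    return {"ticker": ticker}


titoli = ["A2A.MI", "AMP.MI", "AZM.MI"]

# Left over from an earlier run, as can happen in a notebook or REPL session.
df = fake_download("A2A.MI")

saved_buggy = {}
for titolo in titoli:
    data = fake_download(titolo)                  # fresh result, but never used
    saved_buggy[titolo + ".csv"] = df["ticker"]   # always the stale object

saved_fixed = {}
for titolo in titoli:
    data = fake_download(titolo)
    saved_fixed[titolo + ".csv"] = data["ticker"]  # the object just downloaded

print(saved_buggy)
print(saved_fixed)
```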

I hope this helps.

Rolando · February 28, 2024, 10:14am

One more suggestion which might help your work and error finding is that before testing a new version of your code you should clear the memory of your system of all the variables. If for example you are using Jupyter notebooks you could restart the kernel and this will clear the memory. When you do that I suspect that your line

df.to_csv(titolo + "2.csv", mode="a")

will generate an error as it will say that the object df does not exist.

This will help you identify the source of your errors.
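To see what a fresh session would report without actually restarting anything, one rough trick is to run the loop in an empty namespace with exec. This is only an illustration of the idea, not something to do in real code; the dict assigned to data is a stand-in for the downloaded frame.

```python
# The loop from the post above, with a stand-in for the download.
snippet = """
for titolo in ["A2A.MI", "AMP.MI"]:
    data = {"ticker": titolo}
    df.to_csv(titolo + "2.csv", mode="a")
"""

try:
    # Empty namespace: nothing is left over from earlier runs.
    exec(snippet, {})
    error_message = None
except NameError as exc:
    error_message = str(exc)

print(error_message)
```

The NameError is raised before anything is written, so no files are created by this sketch.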

fpallott · February 28, 2024, 10:34pm

Exactly, it happened… :((

Thank you for your suggestions. Now I’ll start studying and try to solve it. :))

fpallott · February 28, 2024, 10:41pm

I’ve done it! Incredible…Finally…

titoli = ['A2A.MI', 'AMP.MI', 'AZM.MI', 'BGN.MI', 'BMED.MI']
for titolo in titoli:
    data = yf.download(titolo, period='20d')
    data.to_csv(titolo + ".csv", mode="w")
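For anyone who also wants the column selection from the first post, here is a hedged sketch that combines it with one file per stock. The names save_each, WANTED, and fake_download are invented for this example, and the numbers are made up. It assumes the downloaded frame has a Date index and flat column names, as in the output shown earlier in the thread; that may not hold for every yfinance version, so check the columns of a real download first.

```python
import tempfile
from pathlib import Path

import pandas as pd

WANTED = ["Date", "Open", "Close", "High", "Low", "Volume"]


def save_each(tickers, download, folder):
    """Write one CSV per ticker, keeping only the WANTED columns."""
    written = []
    for ticker in tickers:
        data = download(ticker)             # one DataFrame per ticker
        table = data.reset_index()[WANTED]  # turn the Date index into a column
        path = Path(folder) / f"{ticker}.csv"
        table.to_csv(path, index=False)
        written.append(path)
    return written


# Hypothetical stand-in for yf.download(ticker, period='20d'); made-up numbers.
def fake_download(ticker):
    index = pd.DatetimeIndex(["2024-02-14", "2024-02-15"], name="Date")
    base = float(len(ticker))
    return pd.DataFrame(
        {
            "Open": [base, base + 1],
            "High": [base + 2, base + 3],
            "Low": [base - 1, base],
            "Close": [base + 1, base + 2],
            "Adj Close": [base + 1, base + 2],
            "Volume": [100, 200],
        },
        index=index,
    )


paths = save_each(["AAA.MI", "BBBB.MI"], fake_download, tempfile.mkdtemp())
print([path.name for path in paths])
```

With the real library, the second argument would be something like lambda t: yf.download(t, period='20d'), matching the call used in the thread.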

Thank youuuuuu!!!