My first post! (Congrats to me.) Sorry if this is a newbie question, but I have not found a good answer on any of the many sites I have searched.
I am using a Try / Except block – like the one below – to catch the Overflow error that occurs due to an out-of-boundary date. The boundary dates can vary widely (depending on when the company data stream started) so I am trying to use Try / Except to find the boundary and code around it.
If you run the code below with the two dates (‘1969-12-31’ = out-of-bounds date and ‘1970-1-1’ = in-bounds date), you’ll see that the yfinance.download method halts execution with the bad date rather than passing through to the “except” clause. I have tried this with many options for the except line (e.g. bare except, except OverflowError, except ValueError), but nothing seems to work.
FYI - I am using Python 3.8 with Spyder IDE 4.1.5. Any help would be very much appreciated!
Ken
import yfinance as yf
ticker='CVX'
startdate='1970-01-01' # Also run with this -> startdate='1969-12-31'
enddate='2021-12-31'
try:
    hist = yf.download(ticker, start=startdate, end=enddate, progress=False)
except OverflowError:
    pass
You mention “halts execution”. Do you see any output or error message?
When I run your program (using the yfinance 0.1.64 module), the yf.download appears to execute with both dates. The hist value when I feed the 1969 date is:
                Open      High       Low     Close  Adj Close   Volume
Date
1969-12-31  0.000000  3.226563  3.164063  3.195313   0.475258  1076800
1970-01-02  3.195313  3.265625  3.195313  3.265625   0.485716   526400
...
Thanks BowlOfRed. I am also using the 0.1.64 module of yfinance. The output I get when I run with a startdate = ‘1969-12-31’ is below:
Exception in thread Thread-7:
Traceback (most recent call last):
  File "C:\Users\mspar\Anaconda3\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "C:\Users\mspar\Anaconda3\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\mspar\Anaconda3\lib\site-packages\multitasking\__init__.py", line 102, in _run_via_pool
    return callee(*args, **kwargs)
  File "C:\Users\mspar\Anaconda3\lib\site-packages\yfinance\multi.py", line 169, in _download_one_threaded
    data = _download_one(ticker, start, end, auto_adjust, back_adjust,
  File "C:\Users\mspar\Anaconda3\lib\site-packages\yfinance\multi.py", line 181, in _download_one
    return Ticker(ticker).history(period=period, interval=interval,
  File "C:\Users\mspar\Anaconda3\lib\site-packages\yfinance\base.py", line 139, in history
    start = int(_time.mktime(
OverflowError: mktime argument out of range
Thanks for your input BowlOfRed. I did see that StackOverflow post. It really does seem to be a mystery.
For the moment, I have a temporary workaround, which I do not like (searching the “open” column of the dataset in reverse date order and looking for the first non-zero value). This strategy is likely to fail under certain circumstances as I open my research to larger datasets. So I would like to find a more elegant solution. If anyone has an idea, I would be very interested to know. Thanks again
You can’t make a function such as yfinance.download keep going by
catching the exception that halts it. By the time the exception is
caught, the function has already halted.
So your code can only continue after the download function
fails, like this:
# code here
try:
    yfinance.download(data)
except OverflowError:
    pass
# continue here
There’s no way to get the download function to keep going unless the
download function itself has an option to do so. Have you read the docs
to see if there is an option to have it ignore errors?
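One detail worth adding to the explanation above: the traceback posted earlier starts with "Exception in thread Thread-7", which suggests the OverflowError is raised inside a worker thread rather than in the thread that called the download function. An exception raised in a separate thread does not propagate to a try / except in the calling thread, which would explain why no except clause ever fires. The sketch below shows that behaviour with the standard library only. It does not involve yfinance, and the names worker and record_thread_error are made up for the illustration.

```python
import threading

def worker():
    # Stand-in for the library code that fails inside a thread.
    raise OverflowError("mktime argument out of range")

seen_by_hook = []

def record_thread_error(args):
    # Called by the threading module (Python 3.8+) for exceptions
    # that escape a thread's run() method.
    seen_by_hook.append(args.exc_type.__name__)

threading.excepthook = record_thread_error

caught_in_main = False
try:
    t = threading.Thread(target=worker)
    t.start()
    t.join()
except OverflowError:
    # Never reached: the error stays inside the worker thread.
    caught_in_main = True

print(caught_in_main)
print(seen_by_hook)
```

Whether a hook like this is a sensible way to detect the failure in a real yfinance call is a separate question; the point here is only that the calling thread's except block is never entered.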
Hi Kenneth, I don’t think there is a mystery here. The Windows 10
version of mktime cannot cope with dates before 1970, and Python on
Windows inherits that limitation.
Note that even if the time is after midnight, if the timezone adjustment
pushes it back before midnight, it may still fail.
One solution is to run your code on a Mac or Linux system, which can
handle a larger range of dates.
Another solution is to limit the data to nothing earlier than 2nd
January 1970. (That gives you a 1 day buffer in case of timezone
adjustments.)
Thanks very much, Steven. Greatly appreciate that info. It does make it clear there is no easy fix. Ultimately, my current solution is exactly what you describe in your last paragraph (code below). It’s not what I would call optimal, but it does work to avoid the exception. BTW - The “mystery” to me is why Microsoft could not make a date function that manages dates from 50 years ago (or quite frankly from 200 years ago). Maybe I am missing some rational explanation for this, but it seems strange to me that – after going through Y2K – the largest software company in the world cannot process a date correctly!
if str_startdate < '1970-01-01':
    str_startdate = '1970-01-01'
# end if
P.S. Sorry about the lack of proper formatting of the code. I am not familiar with how the python.org interface interprets spaces and other characters.
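For anyone reusing the clamp above, here is a slightly more defensive sketch that parses the date instead of comparing strings, so inputs without zero padding such as '1970-1-1' are handled the same way as '1970-01-01'. The function name clamp_start and the constant EARLIEST_SAFE_START are hypothetical, and the 2 January 1970 floor follows the one-day buffer suggested earlier in the thread.

```python
from datetime import date, datetime

# Assumed floor: one day after the epoch, as a buffer for timezone adjustments.
EARLIEST_SAFE_START = date(1970, 1, 2)

def clamp_start(start_str, earliest=EARLIEST_SAFE_START):
    """Return start_str as YYYY-MM-DD, moved forward to `earliest` if it is older."""
    parsed = datetime.strptime(start_str, "%Y-%m-%d").date()
    return max(parsed, earliest).isoformat()

print(clamp_start("1969-12-31"))  # 1970-01-02
print(clamp_start("1985-6-3"))    # 1985-06-03
```

The returned string can then be passed as the start argument in place of the original value.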
That is awesome, BowlOfRed. Thanks! I just made the same change to yfinance/base.py and it works fine.
Of course, I have to ask: Are there any caveats or potential compatibility problems you can think of with this approach? Will other calls to this library still work in all cases? I know you cannot give me a definitive answer, but…what is your best guess?
All this is doing is trying to parse the “start” into “seconds since the epoch” to hand to the API. time.mktime is simpler, but datetime can manage just fine. The only problem I foresee is that there might be a difference between a timezone-naive datetime object and the time.mktime result. The start might be off by the timezone offset (or not, I haven’t really checked). For the case where this just maps to a date, it probably won’t matter much.
But other than that, it shouldn’t be an issue. Not sure if the maintainers are interested in looking into this bug since most lookups probably don’t go back to 1970.
Also, there are other calls to mktime in the code. They might be triggered by different format inputs. I only patched the one call that was affected by your example. So it might fail if you use an old time in some other way.
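To make the idea concrete, here is a minimal sketch of converting a date string to seconds since the epoch with datetime arithmetic instead of time.mktime. This is not the actual change made to yfinance/base.py, which is not shown in this thread; the helper name to_epoch_seconds is hypothetical. It treats the date as UTC, whereas time.mktime interprets its argument as local time, so the two results can differ by the local timezone offset, which is the caveat mentioned above.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_epoch_seconds(date_str):
    """Seconds since 1970-01-01 UTC for a YYYY-MM-DD string (negative before the epoch)."""
    dt = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    # Plain timedelta arithmetic, so no platform mktime call is involved.
    return int((dt - EPOCH).total_seconds())

print(to_epoch_seconds("1970-01-01"))  # 0
print(to_epoch_seconds("1969-12-31"))  # -86400
```

Because the subtraction is done on datetime objects, dates before 1970 simply produce negative values rather than raising OverflowError.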
Thanks very much, BowlOfRed. Really appreciate it. I’ve actually annotated the code, and I’ve included a link to this thread…in the event I hit any issues in the future.