Please help getting Try / Except working for yfinance.download

Hello All:

My first post! (Congrats to me :slight_smile:) Sorry if this is a newbie question, but I have not found a good answer across all of the many sites I have searched.

I am using a Try / Except block – like the one below – to catch the Overflow error that occurs due to an out-of-boundary date. The boundary dates can vary widely (depending on when the company data stream started) so I am trying to use Try / Except to find the boundary and code around it.

If you run the code below with the two dates (‘1969-12-31’ = out-of-bounds date and ‘1970-1-1’ = in-bounds date), you’ll see that the yfinance.download method halts execution with the bad date rather than passing through to the “except” clause. I have tried this with many options for the except line (e.g. bare except, except OverflowError, except ValueError, etc.). Nothing seems to work.

FYI - I am using Python 3.8 with Spyder IDE 4.1.5. Any help would be very much appreciated!

Ken

import yfinance as yf

ticker='CVX'
startdate='1970-01-01' # Also run with this -> startdate='1969-12-31'
enddate='2021-12-31'

try:
    hist = yf.download(ticker, start=startdate, end=enddate, progress=False)
except OverflowError:
    pass

You mention “halts execution”. Do you see any output or error message?

When I run your program (using the yfinance 0.1.64 module), the yf.download appears to execute with both dates. The hist value when I feed the 1969 date is:

                  Open        High         Low       Close   Adj Close    Volume
Date
1969-12-31    0.000000    3.226563    3.164063    3.195313    0.475258   1076800
1970-01-02    3.195313    3.265625    3.195313    3.265625    0.485716    526400
...

Thanks BowlOfRed. I am also using the 0.1.64 module of yfinance. The output I get when I run with a startdate = ‘1969-12-31’ is below:

Exception in thread Thread-7:
Traceback (most recent call last):
  File "C:\Users\mspar\Anaconda3\lib\threading.py", line 932, in _bootstrap_inner
    self.run()
  File "C:\Users\mspar\Anaconda3\lib\threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\mspar\Anaconda3\lib\site-packages\multitasking\__init__.py", line 102, in _run_via_pool
    return callee(*args, **kwargs)
  File "C:\Users\mspar\Anaconda3\lib\site-packages\yfinance\multi.py", line 169, in _download_one_threaded
    data = _download_one(ticker, start, end, auto_adjust, back_adjust,
  File "C:\Users\mspar\Anaconda3\lib\site-packages\yfinance\multi.py", line 181, in _download_one
    return Ticker(ticker).history(period=period, interval=interval,
  File "C:\Users\mspar\Anaconda3\lib\site-packages\yfinance\base.py", line 139, in history
    start = int(_time.mktime(
OverflowError: mktime argument out of range

Looks like it is a problem running on Windows. I’m not sure what a good solution is here.
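
One detail in the traceback above is worth noting: it begins with "Exception in thread Thread-7", so the OverflowError is raised inside a worker thread, not in the thread that called yf.download. An exception raised in another thread never reaches a try / except in the calling thread, which would explain why no variant of the except line catches it. Here is a minimal sketch of that behaviour using only the standard library. The names worker, record_thread_exception and hook_saw are made up for illustration and are not part of yfinance.

```python
import threading

hook_saw = []  # exception type names reported from worker threads


def record_thread_exception(args):
    # threading.excepthook (Python 3.8+) is called for exceptions that
    # escape a thread's run(); here we record them instead of printing.
    hook_saw.append(args.exc_type.__name__)


threading.excepthook = record_thread_exception


def worker():
    # Stand-in for the mktime call that fails inside the worker thread.
    raise OverflowError("mktime argument out of range")


caught_in_main = False
try:
    t = threading.Thread(target=worker)
    t.start()
    t.join()  # returns normally even though worker() raised
except OverflowError:
    caught_in_main = True

print("caught in main thread:", caught_in_main)
print("seen by thread hook:", hook_saw)
```

The except clause never runs, and only the thread hook sees the OverflowError. yf.download also accepts a threads argument. Whether threads=False lets the exception reach the caller in 0.1.64 is an assumption I have not tested.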

Thanks for your input BowlOfRed. I did see that StackOverflow post. It really does seem to be a mystery.

For the moment, I have a temporary workaround, which I do not like (searching the “open” column of the dataset in reverse date order and looking for the first non-zero value). This strategy is likely to fail under certain circumstances as I open my research to larger datasets. So I would like to find a more elegant solution. If anyone has an idea, I would be very interested to know. Thanks again :slight_smile:

Hi Kenneth,

You can’t make a function such as yfinance.download keep going by
catching the exception that halts it. By the time the exception is
caught, the function has already halted.

So your code can only continue after the download function
fails, like this:

# code here
try:
    yfinance.download(data)
except OverflowError:
    pass
# continue here

There’s no way to get the download function to keep going unless the
download function itself has an option to do so. Have you read the docs
to see if there is an option to have it ignore errors?

Hi Kenneth, I don’t think there is a mystery here. The Windows 10
version of mktime cannot cope with dates before 1970, and Python on
Windows inherits that limitation.

Note that even if the time is after midnight, if the timezone adjustment
pushes it back before midnight, it may still fail.

One solution is to run your code on a Mac or Linux system, which can
handle a larger range of dates.

Another solution is to limit the data to nothing earlier than 2nd
January 1970. (That gives you a 1 day buffer in case of timezone
adjustments.)
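
The clamp suggested above can be sketched as follows. Parsing the string into a date first avoids relying on string comparison, which only works when the input is zero-padded (so '1970-1-1' would compare incorrectly as text). The names clamp_start and EARLIEST_SAFE_START are hypothetical, and the 2nd January cut-off is just the buffer suggested above.

```python
import datetime

# Assumed cut-off: one day after the epoch, as a buffer for timezone shifts.
EARLIEST_SAFE_START = datetime.date(1970, 1, 2)


def clamp_start(start_str, earliest=EARLIEST_SAFE_START):
    """Return start_str as YYYY-MM-DD, moved forward to `earliest` if older."""
    parsed = datetime.datetime.strptime(start_str, "%Y-%m-%d").date()
    return max(parsed, earliest).isoformat()


print(clamp_start("1969-12-31"))
print(clamp_start("1970-1-1"))
print(clamp_start("2000-05-17"))
```

The returned string can then be passed as the start argument to yf.download.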

Thanks very much, Steven. Greatly appreciate that info. It does make it clear there is no easy fix. Ultimately, my current solution is exactly what you describe in your last paragraph (code below). It’s not what I would call optimal, but it does work to avoid the exception. BTW - The “mystery” to me is why Microsoft could not make a date function that manages dates from 50 years ago (or quite frankly from 200 years ago). Maybe I am missing some rational explanation for this, but it seems strange to me that – after going through Y2K – the largest software company in the world cannot process a date correctly!

if str_startdate < '1970-01-01':
    str_startdate = '1970-01-01'
# end if

P.S. Sorry about the lack of proper formatting of the code. I am not familiar with how the python.org interface interprets spaces and other characters.

You can put your code between two backtick fences (```) to preserve formatting. For example, if you type

```python
if str_startdate < '1970-01-01':
    str_startdate = '1970-01-01'
```

it will show up like this:

if str_startdate < '1970-01-01':
    str_startdate = '1970-01-01'

which preserves your indentation and adds coloring to make it easier to read for others.

You can edit your previous posts to try it out and see how it works.

There’s also a pinned topic in the Users category that should explain how the formatting works.

Awesome, J-MO thanks very much. I did try it out on my posts. Now they look at least coherent!

So there’s no reason that yfinance needs to use mktime. It could use the datetime stuff as well. Here’s an attempt to patch it.

When I modified site-packages/yfinance/base.py this way, your program worked. I commented out lines 139-140 and replaced them as shown:

                #start = int(_time.mktime(
                #    _time.strptime(str(start), '%Y-%m-%d')))
                start = int((_datetime.datetime.strptime(str(start),
                        "%Y-%m-%d") -
                        _datetime.datetime(1970,1,1)).total_seconds())

That is awesome, BowlOfRed. Thanks! I just made the same change to yfinance/base.py and it works fine.

Of course, I have to ask: Are there any caveats or potential compatibility problems you can think of with this approach? Will other calls to this library still work in all cases? I know you cannot give me a definitive answer, but…what is your best guess?

All this is doing is trying to parse the “start” into “seconds since the epoch” to hand to the API. time.mktime is simpler, but datetime can manage just fine. The only problem I foresee is that there might be some differences in a timezone-naive datetime object and the time.mktime result. The start might be off by the timezone offset (or not, I haven’t really checked). For the case where this just maps to a date, it probably won’t matter much.

But other than that, it shouldn’t be an issue. Not sure if the maintainers are interested in looking into this bug since most lookups probably don’t go back to 1970.

Also, there are other calls to mktime in the code. They might be triggered by different format inputs. I only patched the one call that was affected by your example. So it might fail if you use an old time in some other way.
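
To make the timezone caveat above concrete, here is a small standalone sketch that compares the two conversions outside of yfinance. The function names are made up for illustration. time.mktime treats the parsed date as local time, while the datetime subtraction ignores timezones altogether, so the two results can differ by the machine's UTC offset.

```python
import datetime
import time

EPOCH = datetime.datetime(1970, 1, 1)


def naive_epoch_seconds(date_str):
    """Seconds from 1970-01-01 00:00 to midnight of date_str, ignoring timezones."""
    parsed = datetime.datetime.strptime(date_str, "%Y-%m-%d")
    return int((parsed - EPOCH).total_seconds())


def mktime_epoch_seconds(date_str):
    """Same conversion as the unpatched line; the date is read as local time."""
    return int(time.mktime(time.strptime(date_str, "%Y-%m-%d")))


# Pre-1970 dates give a negative number instead of raising OverflowError.
print(naive_epoch_seconds("1969-12-31"))

# For a date both approaches can handle, the gap depends on the local timezone.
sample = "2000-01-01"
print(mktime_epoch_seconds(sample) - naive_epoch_seconds(sample))
```

On a machine set to UTC the second number is 0. Elsewhere it is the size of the local offset in seconds, which is the "off by the timezone offset" difference mentioned above.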

Thanks very much, BowlOfRed. Really appreciate it. I’ve actually annotated the code, and I’ve included a link to this thread…in the event I hit any issues in the future.

I already hit a snag, and should have seen it coming. But it was easy to fix. I had to correct the code in the elif eventuality:

            elif isinstance(start, _datetime.datetime):
                # start = int(_time.mktime(start.timetuple()))
                start = int((start - _datetime.datetime(1970, 1, 1)).total_seconds())  # KW modified 2021-11-02