I posted this thread to the email@example.com mailing list to gather feedback from the IANA time zone maintainers, and I thought I would forward on some of the comments from Paul Eggert’s response, along with my responses:
I am going to update this to clarify, but I think this is mostly covered by the caching behavior described in the section on constructors, once I make explicit the assumption that the full time zone data must be eagerly read into memory at construction (rather than being implemented in terms of system calls or something of that nature). With that assumption in place, the answer is that the data is updated whenever a cache miss occurs - the first time any given zone is constructed or, depending on the implementation, the first time it is constructed after a previous version has been ejected from the cache (in the reference implementation, we use a “strong” LRU cache of 8 zones and an unbound “weak” cache, so if you construct 9 zones and hold no references to any of them, constructing the first one again will be a cache miss, and the other 8 will be a cache hit).
This does mean that if you call
ZoneInfo("America/New_York") when your installed zoneinfo data is 2019c and then you upgrade to 2020a and call
ZoneInfo("US/Eastern"), the two objects may have different behaviors, but I think this is mainly unavoidable without a pretty significant performance impact.
I have made some minor changes to the wording of the constructors text and added a section to clarify this.
Beyond the fact that I plan to ship non-“zone” files in the tzdata fallback package (and thus include the leap seconds), leap seconds are out of scope for this proposal. Python’s datetime type has no support for leap seconds currently, and other than being tracked in the same database, I think they’re at least somewhat orthogonal to the primary problem we’re solving here (a tzinfo implementation).
Leap second support is on my long list of improvements for the datetime module, so I’ll probably get around to it at some point in the future.
I have added a subsection on leap seconds to the “Rejected ideas” section
Yes, I will have to look into this. My main concern is that my hope is to try to use a time zone data source that can be managed at the system level, independent of language stack. I will admit to never having looked into the details, but I was under the impression that tzdist was something that the system would consume, rather than individual programs, is that wrong?
I also am not clear - are there public tzdist servers, or is the suggestion that we would have a Python-specific tzdist service and end users would subscribe to it for updates?
I’m mainly asking because I decided early on (on some very good advice) that effectively distributing the data is a big enough task on its own that it would bog down the initial implementation to try and handle both at once, so my goal with this is to get something that will work if you have the data, and provide a reasonable way to get the data and handle the data distribution in a separate proposal. If tzdist is consistent with a backwards compatible upgrade from a version using TZif files at some point in the future, I’m happy to put it off as, “We should look into this when we try to solve the distribution problem.” It sorta seems like it should be possible to seamlessly transition from system files to tzdist (at least depending on how strong our promises are about the tz search path, anyway).
Note: This is an open action item and I am waiting for either a response or to do a bit more digging and get the answers to this question, but I suspect that we will want to hold off on TZDIST until a later PEP.
Additionally, Matt Johnson-Pint (who works at Microsoft, though he gave me no indication that he was speaking as a representative of Microsoft) pointed me at the new ICU Time Zone API in Windows 10, and so I’ve removed the rejected “Windows native support” and added a new section under “Open issues” detailing a path forward on Windows and the remaining open questions there.