Created: 2020-02-22
Python-Version: 3.9
Abstract
This proposes adding a module, zoneinfo, to provide a concrete time
zone implementation supporting the IANA time zone database. By default,
zoneinfo will use the system's time zone data if available; if no
system time zone data is available, the library will fall back to using
the first-party package tzdata, deployed on PyPI.
Motivation
The datetime library uses a flexible mechanism to handle time zones:
all conversions and time zone information queries are delegated to an
instance of a subclass of the abstract datetime.tzinfo base class.
This allows users to implement arbitrarily complex time zone rules, but
in practice the majority of users want support for just three types of
time zone: [a]
- UTC and fixed offsets thereof
- The system local time zone
- IANA time zones
In Python 3.2, the datetime.timezone class was introduced to support
the first class of time zone (with a special datetime.timezone.utc
singleton for UTC).
While there is still no "local" time zone, in Python 3.0 the semantics
of naïve time zones was changed to support many "local time"
operations, and it is now possible to get a fixed time zone offset from
a local time:
>>> print(datetime(2020, 2, 22, 12, 0).astimezone())
2020-02-22 12:00:00-05:00
>>> print(datetime(2020, 2, 22, 12, 0).astimezone()
... .strftime("%Y-%m-%d %H:%M:%S %Z"))
2020-02-22 12:00:00 EST
>>> print(datetime(2020, 2, 22, 12, 0).astimezone(timezone.utc))
2020-02-22 17:00:00+00:00
However, there is still no support for the time zones described in the
IANA time zone database (also called the "tz" database or the Olson
database). The time zone database is in the public domain and is
widely distributed; it is present by default on many Unix-like
operating systems. Great care goes into the stability of the database:
there are IETF RFCs both for the maintenance procedures (RFC 6557)
and for the compiled binary (TZif) format (RFC 8536). As such, it is
likely that adding support for the compiled outputs of the IANA database
will add great value to end users even with the relatively long cadence
of standard library releases.
Proposal
This PEP has three main concerns:
- The semantics of the zoneinfo.ZoneInfo class
- Time zone data sources used
- Options for configuration of the time zone search path
Because of the complexity of the proposal, rather than having separate
"specification" and "rationale" sections, the design decisions and
rationales are grouped together by subject.
The zoneinfo.ZoneInfo class
Constructors
The initial design of the zoneinfo.ZoneInfo class has several
constructors.
ZoneInfo(key: str)
The primary constructor takes a single argument, key, which is a string
indicating the name of a zone file in the system time zone database (e.g.
"America/New_York", "Europe/London"), and returns a ZoneInfo constructed
from the first matching data source on the search path (see the
data-sources section for more details). All zone information must be
eagerly read from the data source (usually a TZif file) upon
construction, and may not change during the lifetime of the object (this
restriction applies to all ZoneInfo constructors).
One somewhat unusual guarantee made by this constructor is that calls
with identical arguments must return identical objects. Specifically,
for all values of key, the following assertion must always be valid
[b]:
a = ZoneInfo(key)
b = ZoneInfo(key)
assert a is b
The reason for this comes from the fact that the semantics of datetime
operations (e.g. comparison, arithmetic) depend on whether the datetimes
involved represent the same or different zones; two datetimes are in the
same zone only if dt1.tzinfo is dt2.tzinfo. In addition to the
modest performance benefit from avoiding unnecessary proliferation of
ZoneInfo objects, providing this guarantee should minimize surprising
behavior for end users.
dateutil.tz.gettz has provided a similar guarantee since version 2.7.0
(released March 2018).
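The difference between "same zone" and "different zone" semantics can be seen without any time zone data at all. The following is a minimal sketch using a hypothetical tzinfo subclass, ToyEastern, whose single transition date is invented for illustration; it is not part of the proposal. Two instances with identical rules still count as different zones, because datetime checks object identity.

```python
from datetime import datetime, timedelta, tzinfo


class ToyEastern(tzinfo):
    """Hypothetical zone: UTC-5 before 2020-03-08 02:00 wall time, UTC-4 after."""

    _switch = datetime(2020, 3, 8, 2, 0)

    def utcoffset(self, dt):
        # Compare wall-clock time against the (invented) transition point.
        if dt.replace(tzinfo=None) < self._switch:
            return timedelta(hours=-5)
        return timedelta(hours=-4)

    def dst(self, dt):
        # Standard offset is -5 hours, so the DST shift is the difference.
        return self.utcoffset(dt) + timedelta(hours=5)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


zone_a = ToyEastern()
zone_b = ToyEastern()  # same rules, but a distinct object

before = datetime(2020, 3, 7, 12, 0, tzinfo=zone_a)
after_same = datetime(2020, 3, 8, 12, 0, tzinfo=zone_a)
after_other = datetime(2020, 3, 8, 12, 0, tzinfo=zone_b)

# Same tzinfo object: the subtraction uses wall-clock times only.
same_zone_delta = after_same - before
# Different tzinfo objects: both operands are converted to UTC first.
diff_zone_delta = after_other - before
```

Here same_zone_delta is 24 hours while diff_zone_delta is 23 hours, even though both zones follow the same rules. Returning the same object for the same key avoids this kind of surprise.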
Note
The implementation may decide how to implement the cache behavior, but
the guarantee made here only requires that as long as two references
exist to the result of identical constructor calls, they must be
references to the same object. This is consistent with a reference
counted cache where ZoneInfo objects are ejected when no references to
them exist (for example, a cache implemented with a
weakref.WeakValueDictionary); it is allowed but not required or
recommended to implement this with a "strong" cache, where all ZoneInfo
objects are kept alive indefinitely.
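One way to picture the weak cache described in this note is the sketch below. WeakCachedZone is a hypothetical stand-in that stores only a key; it is an illustration of the caching pattern, not the reference implementation.

```python
import gc
import weakref


class WeakCachedZone:
    """Hypothetical stand-in for a ZoneInfo-like class with a weak cache."""

    # Values are held weakly, so an entry disappears once nothing else
    # references the cached object.
    _cache = weakref.WeakValueDictionary()

    def __new__(cls, key):
        obj = cls._cache.get(key)
        if obj is None:
            obj = super().__new__(cls)
            obj.key = key
            cls._cache[key] = obj
        return obj


first = WeakCachedZone("Europe/London")
second = WeakCachedZone("Europe/London")
same_object = first is second  # identical calls yield the identical object

del first, second
gc.collect()  # make collection explicit on non-refcounting interpreters
evicted = "Europe/London" not in WeakCachedZone._cache
```

While any reference is alive the identity guarantee holds; once all references are dropped, the entry is free to be ejected and a later call builds a fresh object.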
ZoneInfo.nocache(key: str)
This is an alternate constructor that bypasses the constructor's cache.
It is identical to the primary constructor, but returns a new object on
each call. This is likely most useful for testing purposes, or to
deliberately induce "different zone" semantics between datetimes with
the same nominal time zone.
Even if an object constructed by this method would have been a cache
miss, it must not be entered into the cache; in other words, the
following assertion should always be true:
>>> a = ZoneInfo.nocache(key)
>>> b = ZoneInfo(key)
>>> a is not b
ZoneInfo.from_file(fobj: IO[bytes], /, key: str = None)
This is an alternate constructor that allows the construction of a
ZoneInfo object from any TZif byte stream. This constructor takes an
optional parameter, key, which sets the name of the zone, for the
purposes of __str__ and __repr__ (see Representations).
Unlike the primary constructor, this always constructs a new object.
There are two reasons that this deviates from the primary constructor's
caching behavior: stream objects have mutable state and so determining
whether two inputs are identical is difficult or impossible, and it is
likely that users constructing from a file specifically want to load
from that file and not a cache.
As with ZoneInfo.nocache, objects constructed by this method must not
be added to the cache.
Behavior during data updates
If a source of time zone data is updated during a run of the
interpreter, it will not invalidate any caches or modify any existing
ZoneInfo objects, but newly constructed ZoneInfo objects should come
from the updated data source.
This means that the point at which a ZoneInfo object reflects updated
data depends primarily on the semantics of the caching behavior. The
only guaranteed way to get a ZoneInfo object from an updated data source
is to induce a cache miss, either by bypassing the cache and using
ZoneInfo.nocache or by clearing the cache.
Note
The specified cache behavior does not require that the cache be lazily
populated; it is consistent with the specification (though not
recommended) to eagerly pre-populate the cache with time zones that have
never been constructed.
String representation
The ZoneInfo class's __str__ representation will be drawn from the key
parameter. This is partially because the key represents a
human-readable "name" of the zone, but also because it is a useful
parameter that users will want exposed. It is necessary to provide a
mechanism to expose the key for serialization between languages and
because it is also a primary key for localization projects like CLDR
(the Unicode Common Locale Data Repository).
An example:
>>> zone = ZoneInfo("Pacific/Kwajalein")
>>> str(zone)
'Pacific/Kwajalein'
When a key is not specified, the str operation should not fail, but
should return the object's __repr__:
>>> zone = ZoneInfo.from_file(f)
>>> str(zone)
'ZoneInfo.from_file(<_io.BytesIO object at ...>)'
The __repr__ for a ZoneInfo is implementation-defined and not
necessarily stable between versions, but it must not be a valid
ZoneInfo key.
Pickle serialization
Rather than serializing all transition data, ZoneInfo objects will be
serialized by key, and ZoneInfo objects constructed from raw files
(even those with a value for key specified) cannot be pickled.
The behavior of a ZoneInfo object depends on how it was constructed:
- ZoneInfo(key): When constructed with the primary constructor, a
ZoneInfo object will be serialized by key, and when deserialized
it will use the primary constructor in the deserializing process,
and thus be expected to be the same object as other references to
the same time zone. For example, if europe_berlin_pkl is a bytes
object containing a pickle constructed from ZoneInfo("Europe/Berlin"),
one would expect the following behavior:
>>> a = ZoneInfo("Europe/Berlin")
>>> b = pickle.loads(europe_berlin_pkl)
>>> a is b
True
- ZoneInfo.nocache(key): When constructed from the cache-bypassing
constructor, the ZoneInfo object will still be serialized by key,
but when deserialized, it will use the cache-bypassing constructor.
If europe_berlin_pkl_nc is a bytes object containing a pickle
constructed from ZoneInfo.nocache("Europe/Berlin"), one would
expect the following behavior:
>>> a = ZoneInfo("Europe/Berlin")
>>> b = pickle.loads(europe_berlin_pkl_nc)
>>> a is b
False
- ZoneInfo.from_file(fobj, /, key=None): When constructed from a
file, the ZoneInfo object will raise an exception on pickling. If
an end user wants to pickle a ZoneInfo constructed from a file, it
is recommended that they use a wrapper type or a custom
serialization function: either serializing by key or storing the
contents of the file object and serializing that.
This method of serialization requires that the time zone data for the
required key be available on both the serializing and deserializing
side, similar to the way that references to classes and functions are
expected to exist in both the serializing and deserializing
environments. It also means that no guarantees are made about the
consistency of results when unpickling a ZoneInfo pickled in an
environment with a different version of the time zone data.
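The three pickling behaviors above can be expressed with a single __reduce__ method that records how the object was built. The sketch below uses a hypothetical PicklableZone class with invented private attributes (_from_cache, _from_file); it stores no transition data and only illustrates the "serialize by key" idea.

```python
import io
import pickle
import weakref


class PicklableZone:
    """Hypothetical ZoneInfo-like class that pickles by key only."""

    _cache = weakref.WeakValueDictionary()

    def __new__(cls, key):
        obj = cls._cache.get(key)
        if obj is None:
            obj = cls._new_instance(key, from_cache=True)
            cls._cache[key] = obj
        return obj

    @classmethod
    def nocache(cls, key):
        return cls._new_instance(key, from_cache=False)

    @classmethod
    def from_file(cls, fobj, key=None):
        # A real implementation would parse TZif data from fobj here.
        obj = cls._new_instance(key, from_cache=False)
        obj._from_file = True
        return obj

    @classmethod
    def _new_instance(cls, key, from_cache):
        obj = super().__new__(cls)
        obj.key = key
        obj._from_cache = from_cache
        obj._from_file = False
        return obj

    @classmethod
    def _unpickle(cls, key, from_cache):
        # Rebuild through the same constructor that created the original.
        return cls(key) if from_cache else cls.nocache(key)

    def __reduce__(self):
        if self._from_file:
            raise pickle.PicklingError("cannot pickle a zone built from a file")
        return (self._unpickle, (self.key, self._from_cache))
```

Only the key and a flag travel in the pickle, so the deserializing side must be able to resolve the same key from its own time zone data.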
Sources for time zone data
One of the hardest challenges for IANA time zone support is keeping the
data up to date; between 1997 and 2020, there have been between 3 and 21
releases per year, often in response to changes in time zone rules with
little to no notice (see for more details). In order to keep up to
date, and to give the system administrator control over the data source,
we propose to use system-deployed time zone data wherever possible.
However, not all systems ship a publicly accessible time zone database
(notably, Windows uses a different system for managing time zones),
and so zoneinfo falls back, if it is available, to an installable
first-party package, tzdata, published on PyPI. If no system zoneinfo
files are found but tzdata is installed, the primary ZoneInfo
constructor will use tzdata as the time zone source.
System time zone information
Many Unix-like systems deploy time zone data by default, or provide a
canonical time zone data package (often called tzdata, as it is on
Arch Linux, Fedora and Debian). Whenever possible, it would be
preferable to defer to the system time zone information, because this
allows time zone information for all language stacks to be updated and
maintained in one place. Python distributors are encouraged to ensure
that time zone data is installed alongside Python whenever possible
(e.g. by declaring tzdata as a dependency for the python package).
The zoneinfo module will use a "search path" strategy analogous to
the PATH environment variable or the sys.path variable in Python;
the zoneinfo.TZPATH variable will be a read-only, ordered list of time
zone data locations to search (see search-path-config for more
details). When creating a ZoneInfo instance from a key, the zone file
will be constructed from the first data source on the path in which the
key exists, so for example, if TZPATH were:
TZPATH = (
    "/usr/share/zoneinfo",
    "/etc/zoneinfo"
)
and (although this would be very unusual) /usr/share/zoneinfo
contained only America/New_York and /etc/zoneinfo contained both
America/New_York and Europe/Moscow, then ZoneInfo("America/New_York")
would be satisfied by /usr/share/zoneinfo/America/New_York, while
ZoneInfo("Europe/Moscow") would be satisfied by
/etc/zoneinfo/Europe/Moscow.
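The "first match wins" lookup described above can be sketched in a few lines. The helper name find_zone_file is hypothetical, and the temporary directories merely stand in for /usr/share/zoneinfo and /etc/zoneinfo; a real implementation would also validate the key and parse the file.

```python
import os
import tempfile


def find_zone_file(key, tzpath):
    """Return the first existing file for `key` on `tzpath`, or None."""
    for root in tzpath:
        candidate = os.path.join(root, *key.split("/"))
        if os.path.isfile(candidate):
            return candidate
    return None


with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
    # Recreate the unusual layout from the example: the first root has only
    # America/New_York, the second has both zones.
    layout = ((first, ["America/New_York"]),
              (second, ["America/New_York", "Europe/Moscow"]))
    for root, keys in layout:
        for key in keys:
            path = os.path.join(root, *key.split("/"))
            os.makedirs(os.path.dirname(path), exist_ok=True)
            open(path, "wb").close()

    tzpath = (first, second)
    new_york_from_first = (
        find_zone_file("America/New_York", tzpath)
        == os.path.join(first, "America", "New_York")
    )
    moscow_from_second = (
        find_zone_file("Europe/Moscow", tzpath)
        == os.path.join(second, "Europe", "Moscow")
    )
    missing = find_zone_file("Asia/Tokyo", tzpath)
```

Earlier entries shadow later ones for keys they contain, and later entries are consulted only for keys the earlier ones lack.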
At the moment, on Windows systems, the search path will default to
empty, because Windows does not officially ship a copy of the time zone
database. On non-Windows systems, the search path will default to a list
of the most commonly observed search paths. Although this is subject to
change in future versions, at launch the default search path will be:
TZPATH = (
    "/usr/share/zoneinfo",
    "/usr/lib/zoneinfo",
    "/usr/share/lib/zoneinfo",
    "/etc/zoneinfo",
)
This may be configured either at compile time or at runtime; more
information on configuration options is available at
search-path-config.
The tzdata Python package
In order to ensure easy access to time zone data for all end users, this
PEP proposes to create a data-only package tzdata as a fallback for
when system data is not available. The tzdata package would be
distributed on PyPI as a "first party" package, maintained by the
CPython development team.
The tzdata package contains only data and metadata, with no
public-facing functions or classes. It will be designed to be compatible
with both newer importlib.resources access patterns and older
access patterns like pkgutil.get_data.
While it is designed explicitly for the use of CPython, the tzdata
package is intended as a public package in its own right, and it may be
used as an "official" source of time zone data for third party Python
packages.
Search path configuration
The time zone search path is very system-dependent, and sometimes even
application-dependent, and as such it makes sense to provide options to
customize it. This PEP provides for three such avenues for
customization:
- Global configuration via a compile-time option
- Per-run configuration via environment variables
- Runtime configuration change via a reset_tzpath function
Compile-time options
It is most likely that downstream distributors will know exactly where
their system time zone data is deployed, and so a compile-time option
PYTHONTZPATH will be provided to set the default search path.
The PYTHONTZPATH option should be a string delimited by os.pathsep,
listing possible locations for the time zone data to be deployed (e.g.
/usr/share/zoneinfo).
Environment variables
When initializing TZPATH (and whenever reset_tzpath is called with
no arguments), the zoneinfo module will use the environment variable
PYTHONTZPATH, if it exists, to set the search path.
PYTHONTZPATH is an os.pathsep-delimited string which replaces
(rather than augments) the default time zone path. Some examples of the
proposed semantics:
$ python print_tzpath.py
("/usr/share/zoneinfo",
"/usr/lib/zoneinfo",
"/usr/share/lib/zoneinfo",
"/etc/zoneinfo")
$ PYTHONTZPATH="/etc/zoneinfo:/usr/share/zoneinfo" python print_tzpath.py
("/etc/zoneinfo",
"/usr/share/zoneinfo")
$ PYTHONTZPATH="" python print_tzpath.py
()
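The replacement semantics shown in these examples can be sketched as a small parsing function. The names tzpath_from_env and DEFAULT_TZPATH are hypothetical, and the default tuple simply copies the launch default listed earlier in this PEP; this is an illustration, not the module's actual code.

```python
import os

# Assumed default, copied from the default search path given in this PEP.
DEFAULT_TZPATH = (
    "/usr/share/zoneinfo",
    "/usr/lib/zoneinfo",
    "/usr/share/lib/zoneinfo",
    "/etc/zoneinfo",
)


def tzpath_from_env(environ, default=DEFAULT_TZPATH):
    """Compute the search path from a mapping shaped like os.environ."""
    raw = environ.get("PYTHONTZPATH")
    if raw is None:
        # Variable not set at all: use the default search path.
        return tuple(default)
    # Variable set: it replaces the default entirely. An empty string
    # therefore yields an empty search path.
    return tuple(part for part in raw.split(os.pathsep) if part)
```

Passing a plain dictionary instead of os.environ keeps the function easy to exercise, for example tzpath_from_env({"PYTHONTZPATH": ""}) returns an empty tuple.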
This provides no built-in mechanism for prepending or appending to the
default search path, as these use cases are likely to be more niche.
It should be possible to populate an environment variable with
the default search path fairly easily:
$ export DEFAULT_TZPATH=$(python -c \
"import os, zoneinfo; print(os.pathsep.join(zoneinfo.TZPATH))")
reset_tzpath function
zoneinfo provides a reset_tzpath function that allows for changing the
search path at runtime.
def reset_tzpath(
    to: Optional[Sequence[Union[str, os.PathLike]]] = None
) -> None:
    ...
When called with a sequence of paths, this function sets
zoneinfo.TZPATH to a tuple constructed from the desired value. When
called with no arguments or None, this function resets
zoneinfo.TZPATH to the default configuration.
This is likely to be primarily useful for (permanently or temporarily)
disabling the use of system time zone paths and forcing the module to
use the tzdata package. It is not likely that reset_tzpath will be a
common operation, save perhaps in test functions sensitive to time zone
configuration, but it seems preferable to provide an official mechanism
for changing this rather than allowing a proliferation of hacks around
the immutability of TZPATH.
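For the test-function use case, a temporary override might look like the sketch below. It assumes the module is importable as zoneinfo with the reset_tzpath function and TZPATH attribute described in this PEP (as shipped in Python 3.9 and later); the temporary_tzpath helper is hypothetical and not part of the proposal.

```python
import contextlib
import zoneinfo


@contextlib.contextmanager
def temporary_tzpath(paths):
    """Temporarily replace zoneinfo.TZPATH, restoring the old value on exit."""
    saved = zoneinfo.TZPATH
    zoneinfo.reset_tzpath(to=paths)
    try:
        yield zoneinfo.TZPATH
    finally:
        # Restore the exact tuple that was active before the block.
        zoneinfo.reset_tzpath(to=saved)


original = zoneinfo.TZPATH
with temporary_tzpath([]) as active:
    # With an empty search path, no system locations are consulted.
    inside = zoneinfo.TZPATH
restored = zoneinfo.TZPATH == original
```

The helper reads zoneinfo.TZPATH through the module each time rather than importing the name directly, since a directly imported name would keep pointing at the old tuple.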
Caution
Although changing TZPATH during a run is a supported operation, users
should be advised that doing so may occasionally lead to unusual
semantics, and when making design trade-offs greater weight will be
afforded to using a static TZPATH, which is the much more common use
case.
As noted in Constructors, the primary ZoneInfo
constructor employs a cache to ensure that two identically-constructed
ZoneInfo objects always compare as identical (i.e.
ZoneInfo(key) is ZoneInfo(key)), and the nature of this cache is
implementation-defined. This means that the behavior of the ZoneInfo
constructor may be unpredictably inconsistent in some situations when
used with the same key under different values of TZPATH. For
example:
>>> reset_tzpath(to=["/my/custom/tzdb"])
>>> a = ZoneInfo("My/Custom/Zone")
>>> reset_tzpath()
>>> b = ZoneInfo("My/Custom/Zone")
>>> del a
>>> del b
>>> c = ZoneInfo("My/Custom/Zone")
In this example, My/Custom/Zone exists only in the /my/custom/tzdb
and not on the default search path. In all implementations the
constructor for a must succeed. It is implementation-defined whether
the constructor for b succeeds, but if it does, it must be true that
a is b, because both a and b are references to the same key. It is
also implementation-defined whether the constructor for c succeeds.
Implementations of zoneinfo may return the object constructed in
previous constructor calls, or they may fail with an exception.
Backwards Compatibility
This will have no backwards compatibility issues as it will create a new
API.
With only minor modification, a backport with support for Python 3.6+ of
the zoneinfo module could be created.
The tzdata package is designed to be "data only", and should support
any version of Python that it can be built for (including Python 2.7).
Security Implications
This will require parsing zoneinfo data from disk, mostly from system
locations but potentially from user-supplied data. Errors in the
implementation (particularly the C code) could cause potential security
issues, but there is no special risk relative to parsing other file
types.
Because the time zone data keys are essentially paths relative to some
time zone root, implementations should take care to avoid path traversal
attacks. Requesting keys such as ../../../path/to/something should not
reveal anything about the state of the file system outside of the time
zone path.
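One common way to guard against such keys is to normalize the joined path and check that it still lies under the root. The sketch below uses a hypothetical resolve_key helper and is only one possible approach, not necessarily how the reference implementation handles it.

```python
import os


def resolve_key(key, root):
    """Return the normalized path for `key` under `root`, or raise ValueError."""
    if os.path.isabs(key):
        raise ValueError(f"time zone keys must be relative paths, got {key!r}")
    root = os.path.abspath(root)
    # normpath collapses ".." segments without touching the file system.
    candidate = os.path.normpath(os.path.join(root, key))
    # The candidate must sit strictly below the root after normalization.
    if candidate == root or os.path.commonpath([root, candidate]) != root:
        raise ValueError(f"key {key!r} escapes the time zone root")
    return candidate
```

Because the check runs before any file is opened, a rejected key produces the same error whether or not the targeted file exists, so nothing about the outside file system is revealed.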
Reference Implementation
An initial reference implementation is available at
https://github.com/pganssle/zoneinfo
This may eventually be converted into a backport for 3.6+.
Rejected Ideas
Building a custom tzdb compiler
One major concern with the use of the TZif format is that it does not
actually contain enough information to always correctly determine the
value to return for tzinfo.dst(). This is because for any given time
zone offset, TZif only marks the UTC offset and whether or not it
represents a DST offset, but tzinfo.dst() returns the total amount of
the DST shift, so that the "standard" offset can be reconstructed from
datetime.utcoffset() - datetime.dst(). The value to use for dst()
can be determined by finding the equivalent STD offset and calculating
the difference, but the TZif format does not specify which offsets form
STD/DST pairs, and so heuristics must be used to determine this.
One common heuristic (looking at the most recent standard offset)
notably fails in the case of the time zone changes in Portugal in 1992
and 1996, where the "standard" offset was shifted by 1 hour during a
DST transition, leading to a transition from STD to DST status with no
change in offset. In fact, it is possible (though it has never happened)
for a time zone to be created that is permanently DST and has no
standard offsets.
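The "most recent standard offset" heuristic, and the way it breaks down, can be sketched as follows. The function name guess_dst_offsets and the input shape (a chronological list of offset and isdst pairs) are assumptions made for illustration; the second input is a simplified sequence loosely modelled on the situation described above, not actual tzdb data.

```python
from datetime import timedelta


def guess_dst_offsets(ttinfos):
    """Guess dst() per period from chronological (utcoffset, isdst) pairs."""
    results = []
    last_std = None
    for utcoff, isdst in ttinfos:
        if not isdst:
            last_std = utcoff
            results.append(timedelta(0))
        elif last_std is None:
            # No earlier standard period exists to compare against.
            results.append(None)
        else:
            results.append(utcoff - last_std)
    return results


# Ordinary case: standard at UTC-5, DST at UTC-4, back to standard.
regular = guess_dst_offsets([
    (timedelta(hours=-5), False),
    (timedelta(hours=-4), True),
    (timedelta(hours=-5), False),
])

# Problem case: a DST period begins with the same UTC offset as the
# standard period before it, so the heuristic computes a shift of zero.
shifted = guess_dst_offsets([
    (timedelta(hours=1), False),
    (timedelta(hours=1), True),
])
```

In the ordinary case the DST period is assigned a one hour shift. In the problem case the period flagged as DST is assigned a shift of zero, and a zone with no standard period at all yields None, which is why more elaborate heuristics are needed.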
Although this information is missing in the compiled TZif binaries, it
is present in the raw tzdb files, and it would be possible to parse this
information ourselves and create a more suitable binary format.
This idea was rejected for several reasons:
- It precludes the use of any system-deployed time zone information,
which is usually present only in TZif format.
- The raw tzdb format, while stable, is less stable than the TZif
format; some downstream tzdb parsers have already run into problems
with old deployments of their custom parsers becoming incompatible
with recent tzdb releases, leading to the creation of a
"rearguard" format to ease the transition.
- Heuristics currently suffice in dateutil and pytz for all known
time zones, historical and present, and it is not very likely that
new time zones will appear that cannot be captured by heuristics,
though it is somewhat more likely that new rules that are not
captured by the current generation of heuristics will appear; in
that case, bugfixes would be required to accommodate the changed
situation.
- The dst() method's utility (and in fact the isdst parameter in
TZif) is somewhat questionable to start with, as almost all the
useful information is contained in the utcoffset() and tzname()
methods, which are not subject to the same problems.
In short, maintaining a custom tzdb compiler or compiled package adds
maintenance burdens to both the CPython dev team and system
administrators, and its main benefit is to address a hypothetical
failure that would likely have minimal real world effects were it to
occur.
Including tzdata in the standard library by default
Although PEP 453, which introduced the ensurepip mechanism to
CPython, provides a convenient template for a standard library module
maintained on PyPI, a potentially similar ensuretzdata mechanism is
somewhat less necessary, and would be complicated enough that it is
considered out of scope for this PEP.
Because the zoneinfo module is designed to use the system time zone
data wherever possible, the tzdata package is unnecessary (and may be
undesirable) on systems that deploy time zone data, and so it does not
seem critical to ship tzdata with CPython.
It is also not yet clear how these hybrid standard library / PyPI
modules should be updated (other than pip, which has a natural
mechanism for updates and notifications), and since it is not critical
to the operation of the module, it seems prudent to defer any such
proposal.
Support for leap seconds
In addition to time zone offset and name rules, the IANA time zone
database also provides a source of leap second data. This is deemed out
of scope because datetime.datetime currently has no support for leap
seconds, and the question of leap second data can be deferred until leap
second support is added.
The first-party tzdata package should ship the leap second data, even
if it is not used by the zoneinfo module.
Using a pytz-like interface
A pytz-like interface was proposed in PEP 431, but was
ultimately withdrawn / rejected for lack of ambiguous datetime support.
PEP 495 added the fold attribute to address this problem, but
fold obviates the need for pytz's non-standard tzinfo classes,
and so a pytz-like interface is no longer necessary.
The zoneinfo approach is more closely based on dateutil.tz, which
implemented support for fold (including a backport to older versions)
just before the release of Python 3.6.
Open Issues
Using the datetime module
One possible idea would be to add ZoneInfo to the datetime module,
rather than giving it its own separate module. In the current version of
the PEP, this has been resolved in favor of using a separate module, for
the reasons detailed below, but the use of a nested submodule
datetime.zoneinfo is also under consideration.
Arguments against putting ZoneInfo directly into datetime
The datetime module is already somewhat crowded, as it has many
classes with somewhat complex behavior: datetime.datetime,
datetime.date, datetime.time, datetime.timedelta,
datetime.timezone and datetime.tzinfo. The module's implementation
and documentation are already quite complicated, and it is probably
beneficial to try not to compound the problem if it can be helped.
The ZoneInfo class is also in some ways different from all the other
classes provided by datetime; the other classes are all intended to be
lean, simple data types, whereas the ZoneInfo class is more complex:
it is a parser for a specific format (TZif), a representation for the
information stored in that format and a mechanism to look up the
information in well-known locations in the system.
Finally, while it is true that someone who needs the zoneinfo module
also needs the datetime module, the reverse is not necessarily true:
many people will want to use datetime without zoneinfo. Considering
that zoneinfo will likely pull in additional, possibly more
heavy-weight standard library modules, it would be preferable to allow
the two to be imported separately, particularly if potential "tree
shaking" distributions are in Python's future.
In the final analysis, it makes sense to keep zoneinfo a separate
module with a separate documentation page rather than to put its classes
and functions directly into datetime.
Using datetime.zoneinfo instead of zoneinfo
A more palatable configuration may be to nest zoneinfo as a module
under datetime, as datetime.zoneinfo.
Arguments in favor of this:
- It neatly namespaces zoneinfo together with datetime
- The timezone class is already in datetime, and it may seem
strange that some time zones are in datetime and others are in a
top-level module.
- As mentioned earlier, importing zoneinfo necessarily requires
importing datetime, so it is no imposition to require importing
the parent module.
Arguments against this:
- In order to avoid forcing all datetime users to import zoneinfo,
the zoneinfo module would need to be lazily imported, which means
that end-users would need to explicitly import datetime.zoneinfo
(as opposed to importing datetime and accessing the zoneinfo
attribute on the module). This is the way dateutil works (all
submodules are lazily imported), and it is a perennial source of
confusion for end users.
This confusing requirement from end-users can be avoided using a
module-level __getattr__ and __dir__ per PEP 562, but this would
add some complexity to the implementation of the datetime module.
This sort of behavior in modules or classes tends to confuse static
analysis tools, which may not be desirable for a library as
widely-used and critical as datetime.
- Nesting the implementation under datetime would likely require
datetime to be reorganized from a single-file module
(datetime.py) to a directory with an __init__.py. This is a
minor concern, but the structure of the datetime module has been
stable for many years, and it would be preferable to avoid churn if
possible.
This concern could be alleviated by implementing zoneinfo as
_zoneinfo.py and importing it as zoneinfo from within
datetime, but this does not seem desirable from an aesthetic or
code organization standpoint, and it would preclude the version of
nesting where end users are required to explicitly import
datetime.zoneinfo.
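The PEP 562 approach mentioned in the first argument above can be sketched with a module-level __getattr__ and __dir__. To keep the example self-contained, it installs the hooks on a throwaway module object named fake_datetime and uses the standard library's json module as a stand-in for the lazily loaded submodule; both choices, and the install_lazy_submodules helper, are purely illustrative.

```python
import importlib
import types


def install_lazy_submodules(module, targets):
    """Attach PEP 562 hooks; `targets` maps attribute name to module name."""

    def __getattr__(name):
        if name in targets:
            loaded = importlib.import_module(targets[name])
            # Cache on the module so the hook is not consulted again.
            setattr(module, name, loaded)
            return loaded
        raise AttributeError(
            f"module {module.__name__!r} has no attribute {name!r}"
        )

    def __dir__():
        # Advertise lazy names even before they have been loaded.
        return sorted(set(vars(module)) | set(targets))

    module.__getattr__ = __getattr__
    module.__dir__ = __dir__


fake_datetime = types.ModuleType("fake_datetime")
install_lazy_submodules(fake_datetime, {"zoneinfo": "json"})

loaded_before = "zoneinfo" in vars(fake_datetime)   # not imported yet
lazy_module = fake_datetime.zoneinfo                # triggers the import
loaded_after = "zoneinfo" in vars(fake_datetime)    # now cached
```

The attribute only comes into existence on first access, which is exactly the kind of dynamic behavior that static analysis tools tend to struggle with.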
This PEP currently takes the position that on balance it would be best
to use a separate top-level zoneinfo module because the benefits of
nesting are not so great that they outweigh the practical implementation
concerns, but this still requires some discussion.
Structure of the PYTHONTZPATH environment variable
This PEP proposes to use a single environment variable: PYTHONTZPATH.
This is based on the assumption that the majority of users who would
want to manipulate the time zone path would want to fully replace it
(e.g. "I know exactly where my time zone data is"), and other use
cases like prepending to the existing search path would be less common.
There are several other schemes that were considered and weakly
rejected:
- Separate PYTHONTZPATH into two environment variables:
DEFAULT_PYTHONTZPATH and PYTHONTZPATH, where PYTHONTZPATH
would contain values to append (or prepend) to the default time zone
path, and DEFAULT_PYTHONTZPATH would replace the default time
zone path. This was rejected because it would likely lead to user
confusion if the primary use case is to replace rather than augment.
- Adding either PYTHONTZPATH_PREPEND, PYTHONTZPATH_APPEND or both,
so that users can augment the search path on either end without
attempting to determine what the default time zone path is. This was
rejected as likely to be unnecessary, and because it could easily be
added in a backwards-compatible manner in future updates if there is
much demand for such a feature.
- Use only the PYTHONTZPATH variable, but provide a custom special
value that represents the default time zone path, e.g.
<<DEFAULT_TZPATH>>, so that, for example,
PYTHONTZPATH=<<DEFAULT_TZPATH>>:/my/path could be used
to append /my/path to the end of the time zone path.
This was rejected mainly because this sort of special value is
not usually found in PATH-like variables, and it would be hard to
discover mistakes in your implementation.
One advantage to this scheme would be that it would add a natural
extension point for specifying non-file-based elements on the search
path, such as changing the priority of tzdata if it exists, or if
native support for TZDIST were to be added to the library in the
future.
Windows support via Microsoft's ICU API
Windows does not ship the time zone database as TZif files, but as of
Windows 10's 2017 Creators Update, Microsoft has provided an API for
interacting with the International Components for Unicode (ICU) project,
which includes an API for accessing time zone data sourced from
the IANA time zone database.
Providing bindings for this would allow for a mostly seamless
cross-platform experience for users on sufficiently recent versions of
Windows, even without falling back to the tzdata package.
This is a promising area, but is less mature than the remainder of the
proposal, and so there are several open issues with regard to Windows
support:
- None of the popular third party time zone libraries provide support
for ICU (dateutil's native windows time zone support relies on
legacy time zones provided in the Windows Registry, which would be
unsuitable as a drop-in replacement for TZif files), so this would
need to be developed de novo in the standard library, rather than
first maturing in the third party ecosystem.
- The most likely implementation for this would be to have TZPATH
default to empty on Windows and have a search path precedence of
TZPATH > ICU > tzdata, but this prevents end users from
forcing the use of tzdata by setting an empty TZPATH.
Two possible solutions for this are:
  - Add a mechanism to disable ICU globally independent of setting
  TZPATH.
  - Add a cross-platform mechanism to give tzdata the highest
  precedence.
- This is not part of the reference implementation and it is uncertain
whether it can be ready and vetted in time for the Python 3.9
feature freeze. It is an open question whether a failure to
implement native Windows support in 3.9 should defer the release of
zoneinfo or if only the ICU-based Windows support should be
deferred.
Footnotes
[a]: The claim that the vast majority of users only want a few types of
time zone is based on anecdotal impressions rather than anything
remotely scientific. As one data point, dateutil provides many
time zone types, but user support mostly focuses on these three
types.
[b]: The statement that identically constructed ZoneInfo objects should
be identical objects may be violated if the user deliberately clears
the time zone cache.
References