How can I enforce aware datetime objects?

Hello,

Given the difference between aware and naive datetime objects, and given that every datetime object created without a tzinfo argument is “naive”, we’re in a bit of a pickle if we want to require aware datetime objects only but can’t control every place where datetime objects are created (because third-party packages return them).

The proper way would be to subclass stdlib’s datetime class but that doesn’t ensure that all datetime objects are aware.

We could try and monkey patch the stdlib and hijack the datetime.__new__() function but that’s a built-in:

>>> datetime.datetime.__new__
<built-in method __new__ of type object at 0x101ac3b88>

and we can't set attributes of built-in/extension type 'datetime.datetime'. More interesting conversation here.
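For reference, the failure can be reproduced with a harmless attribute instead of `__new__`. This is a minimal sketch, assuming CPython with its C-accelerated datetime module; `_default_tzinfo` is a made-up attribute name used only for illustration, and the exact error message differs between Python versions.

```python
import datetime


def try_patch_datetime() -> bool:
    """Return True if an attribute could be set on datetime.datetime."""
    try:
        # `_default_tzinfo` is a hypothetical name; any attribute would do.
        setattr(datetime.datetime, "_default_tzinfo", datetime.timezone.utc)
    except TypeError:
        # CPython refuses to modify built-in/extension types.
        return False
    return True


patched = try_patch_datetime()
print(patched)
```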

I think the ideal solution would be a way to “configure” the datetime class such that the tzinfo’s default argument could be changed, e.g. from None to timezone.utc. Short of such a feature, what do folks here suggest to work with this pickle?

Thank you!
Jens

This seems like an XY problem. Naïve datetime objects have their uses (in fact, there’s not a better way to represent system local times) and trying to globally make it impossible to construct them would almost certainly be a bad idea.

If you want your interface to require aware datetimes, you should probably add some enforcement mechanism into the relevant functions or constructors that must take aware datetimes.
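A minimal sketch of such an enforcement mechanism, assuming a small helper called at the top of each function that needs aware values. The names `require_aware` and `schedule` are hypothetical; the check itself follows the definition of "aware" in the datetime documentation.

```python
from datetime import datetime, timezone


def require_aware(dt: datetime) -> datetime:
    """Return dt unchanged, or raise ValueError if it is naive."""
    # An object is aware only if tzinfo is set and utcoffset() is not None.
    if dt.tzinfo is None or dt.utcoffset() is None:
        raise ValueError(f"naive datetime not allowed: {dt!r}")
    return dt


def schedule(when: datetime) -> str:
    """Hypothetical API entry point that only accepts aware datetimes."""
    when = require_aware(when)
    return when.astimezone(timezone.utc).isoformat()


print(schedule(datetime(2024, 1, 1, 12, tzinfo=timezone.utc)))
```

Third-party code can keep producing naive objects; they are only rejected at the boundary where awareness actually matters.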

Another option would be to create a subclass of datetime.datetime that does not allow None for tzinfo. If you use static typing, you can annotate your functions to require this subclass.
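One way this could look, as a sketch rather than a finished design: the class name `AwareDatetime` and the function `store` are hypothetical, and the constructor forwards all arguments unchanged so that it does not have to mirror the stdlib signature.

```python
import datetime as dt


class AwareDatetime(dt.datetime):
    """Hypothetical datetime subclass that refuses to be naive."""

    def __new__(cls, *args, **kwargs):
        # Let the base class do all parsing and validation first.
        self = super().__new__(cls, *args, **kwargs)
        if self.tzinfo is None or self.utcoffset() is None:
            raise ValueError("AwareDatetime requires a tzinfo")
        return self


def store(when: AwareDatetime) -> str:
    """A type checker can flag callers that pass a plain datetime here."""
    return when.isoformat()


print(store(AwareDatetime(2024, 1, 1, tzinfo=dt.timezone.utc)))
```

Be aware that inherited constructors which build instances without a tzinfo (for example calling `now()` on the subclass with no argument) are likely to hit the same check and raise.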

It’s probably worth noting that changing the default tzinfo to UTC is not a good thing to do unless people are actually intending to represent a datetime in UTC when they create their datetime.datetime objects. It’s kind of like saying, “Well we want all numbers to have units, so we will make the default units meters for all numbers.” If someone is using numbers to mean miles or seconds or kilometers or parsecs, you have just made the situation more confusing, not less.

I think that, in the light of type annotations in Python, we can see that a design flaw of the datetime module is that it uses the same type for both timezone-aware and naïve objects. They should be distinct types so that static type analysis can detect many possible mistakes in our programs.
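Until the stdlib offers distinct types, the distinction can be approximated for static checkers with `typing.NewType`. This is only a sketch: `AwareDT`, `as_aware` and `seconds_since_epoch` are hypothetical names, and at runtime the value is still an ordinary datetime.

```python
from datetime import datetime, timezone
from typing import NewType

# Distinct only in the eyes of a type checker; no runtime subclass is created.
AwareDT = NewType("AwareDT", datetime)


def as_aware(dt: datetime) -> AwareDT:
    """The single place where a plain datetime is checked and 'promoted'."""
    if dt.tzinfo is None or dt.utcoffset() is None:
        raise ValueError("expected an aware datetime")
    return AwareDT(dt)


def seconds_since_epoch(when: AwareDT) -> float:
    """Passing an unchecked datetime here is reported by a type checker."""
    return when.timestamp()


print(seconds_since_epoch(as_aware(datetime(1970, 1, 1, 0, 0, 1, tzinfo=timezone.utc))))
```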

I hope this will get fixed at some point.

(Together with deprecating and removing datetime.utcnow() and datetime.utcfromtimestamp())
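For comparison, the aware equivalents of those two calls already exist in the stdlib. A short sketch:

```python
from datetime import datetime, timezone

# Aware replacement for datetime.utcnow()
now_utc = datetime.now(timezone.utc)

# Aware replacement for datetime.utcfromtimestamp(0)
epoch = datetime.fromtimestamp(0, timezone.utc)

print(now_utc.tzinfo)
print(epoch.isoformat())
```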

Hello, there.

As a matter of fact, an always aware date and time object is a desired thing in my company, as we deal with data coming and going from all over the world, and we wish to always store it in UTC, enabling the internationalization team to deal with date and time localization via their tools, and not be dealing with it on every backend that we write.

I’m currently working on a subclass of the datetime object that attempts to enforce timezone awareness by converting any naive local time to an aware UTC time. But it is extremely convoluted, because the base datetime class tries to handle both situations and recursively creates naive cls objects, on which the enforcement is then attempted again.

For me, the most explicit and transparent solution is to make the naive datetime the de facto datetime class, and a timestamp class to be the aware datetime.

All the logic to convert a naive object into an aware one is already implemented within the datetime object, yet it is not used to provide an always-aware object. As a result, only the use cases where tzinfo is not required are easy: “always naive” is the default, and “may be naive, but can be aware” is also easy. The case where tzinfo is always required is left ignored and convoluted.

As a matter of fact, an always aware date and time object is a desired
thing in my company, as we deal with data coming and going from all
over the world, and we wish to always store it in UTC, enabling the
internationalization team to deal with date and time localization via
their tools, and not be dealing with it on every backend that we write.

Yeah. I like to store things as UNIX timestamps for exactly this reason;
just plain old seconds-since-1970-01-01-UTC. No timezones at all.
Convert to a datetime for presentation as needed.
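A sketch of that round trip, assuming only aware values are ever stored; `to_unix` and `from_unix` are hypothetical helper names, and the +09:00 offset is just an example input.

```python
from datetime import datetime, timedelta, timezone


def to_unix(dt: datetime) -> float:
    """Seconds since 1970-01-01 UTC; refuses naive input to avoid guessing."""
    if dt.tzinfo is None or dt.utcoffset() is None:
        raise ValueError("refusing to store a naive datetime")
    return dt.timestamp()


def from_unix(ts: float, tz: timezone = timezone.utc) -> datetime:
    """Rebuild an aware datetime for presentation in the given zone."""
    return datetime.fromtimestamp(ts, tz)


# 09:00 at a fixed +09:00 offset is midnight UTC.
stored = to_unix(datetime(2024, 1, 1, 9, 0, tzinfo=timezone(timedelta(hours=9))))
print(stored)
print(from_unix(stored).isoformat())
```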

I’m currently working on a subclass of the datetime object that attempts to enforce timezone awareness, by converting any naive local time to an aware UTC time.

The trickiness here is knowing that a given naive datetime was from your
local timezone - what if it wasn’t?
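To make the risk concrete, here is a small sketch. The two fixed offsets are illustrative assumptions standing in for "where the value might have come from"; real code would more likely use named zones.

```python
from datetime import datetime, timedelta, timezone

naive = datetime(2024, 1, 1, 12, 0)

# Two plausible guesses about the origin of the same naive value.
utc_minus_5 = timezone(timedelta(hours=-5))
utc_plus_9 = timezone(timedelta(hours=9))

as_minus_5 = naive.replace(tzinfo=utc_minus_5).astimezone(timezone.utc)
as_plus_9 = naive.replace(tzinfo=utc_plus_9).astimezone(timezone.utc)

# The same wall-clock reading maps to instants 14 hours apart.
gap = as_minus_5 - as_plus_9
print(as_minus_5.isoformat(), as_plus_9.isoformat(), gap)
```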

[…snip…]

Maybe you should look at some of the alternative date/time packages on
PyPI like arrow and dateutil.

The stdlib datetime module has the historic … fiddlinesses you
describe and some things can’t be changed without breaking existing
code. It certainly annoys me that eg datetime.now() returns a naive
datetime.

My advice is to look to a third party date/time module with more
features and better defaults (not prone to giving you naive datetimes).

Cheers,
Cameron Simpson cs@cskk.id.au

I’m currently working on a subclass of the datetime object that attempts to enforce timezone awareness, by converting any naive local time to an aware UTC time. But it is extremely convoluted

We had a similar challenge and solved it by creating a subclass:

import datetime as datetime_
import typing

# Stolen from typeshed:
# https://github.com/python/typeshed/blob/c987c78077dc21ef2b1fdb11b83eeb947c3b4276/stdlib/_typeshed/__init__.pyi#L22-L24
_Self = typing.TypeVar("_Self")

class datetime(datetime_.datetime):
    def __new__(
        cls: type[_Self],
        year: int,
        month: typing.Optional[int] = None,
        day: typing.Optional[int] = None,
        hour: int = 0,
        minute: int = 0,
        second: int = 0,
        microsecond: int = 0,
        tzinfo: typing.Optional[datetime_.tzinfo] = None,
        *,
        fold: int = 0,
    ) -> typing.Any:
        """Allocate and initialize a new ``datetime`` object, defaulting to UTC."""
        return super().__new__(
            cls, year, month, day, hour, minute, second, microsecond,
            # Fall back to UTC when the caller does not pass a tzinfo.
            tzinfo or datetime_.timezone.utc,
            # Forward fold so that it is not silently dropped.
            fold=fold,
        )

and then encourage/nag everybody to use this subclass. It’s not pretty and it requires discipline, but short of introducing yet another package dependency this was the best we could come up with. (Then again, dateutil already exists in our indirect dependencies through other packages, so…)

I think the TypeVar _Self has no effect in your code because you use it just once. A TypeVar is normally used in at least two places to indicate that the annotated variables have the same type.

If you want an alternative to typing.Self, which was introduced in Python 3.11, one is shown in the documentation:
https://docs.python.org/3/library/typing.html#typing.Self
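A sketch of that bound-TypeVar pattern applied to a datetime subclass. The class names `UTCDatetime` and `AuditDatetime` and the classmethod `utc` are hypothetical and exist only to show the TypeVar being used twice, once for `cls` and once for the return type.

```python
import datetime as datetime_
import typing

# Bound to the class (as a forward reference) so subclasses are accepted too.
_T = typing.TypeVar("_T", bound="UTCDatetime")


class UTCDatetime(datetime_.datetime):
    """Hypothetical subclass used only to illustrate the annotation."""

    @classmethod
    def utc(cls: typing.Type[_T], year: int, month: int, day: int) -> _T:
        # _T links the class the method is called on to the returned type.
        return cls(year, month, day, tzinfo=datetime_.timezone.utc)


class AuditDatetime(UTCDatetime):
    pass


stamp = AuditDatetime.utc(2024, 1, 1)
print(type(stamp).__name__, stamp.isoformat())
```

With this annotation a type checker infers `AuditDatetime` for `stamp` rather than the base class.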