Curious: access to `datetime` internals

I was teaching someone how the __new__ method is used in Python code by showing them the datetime module, and I came across behavior that surprised me:

>>> from datetime import datetime
>>> d = datetime.now()
>>> d.day
28
>>> d._day
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'datetime.datetime' object has no attribute '_day'

From reading the source code of the datetime module, I know that day is a property that reads the data stored on the _day attribute.

My surprise is that when I implement the same pattern in my own classes, I can access the variables with a single underscore.
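For reference, here is a minimal sketch of the pattern I mean. The class name `Date` and its fields are made up for illustration; the real pure-Python datetime code is more involved.

```python
class Date:
    """Hypothetical class mimicking the property-over-private-attribute pattern."""

    def __new__(cls, year, month, day):
        # Build the instance in __new__, as the pure-Python datetime code does.
        self = object.__new__(cls)
        self._year = year
        self._month = month
        self._day = day
        return self

    @property
    def day(self):
        # Read-only view of the single-underscore attribute.
        return self._day


d = Date(2022, 3, 28)
print(d.day)   # 28
print(d._day)  # 28, the "private" attribute is reachable from Python
```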

With this behavior, I would expect to see a customized __getattribute__ method, but there is none to be found in the module.

Does this have to do with an underlying implementation in C code, or is there something additional going on here that I have not yet discovered?

Yes, there’s a C implementation of the module. At the end of the Python module, there’s a star-import that overrides the module’s pure-Python definitions.

Yes, it has to do with the C implementation. Specifically, the
datetime.datetime class, when written in C, implements day as a “getset
descriptor”:

>>> import datetime
>>> type(datetime.datetime.day)
<class 'getset_descriptor'>

Presumably, the C implementation has some equivalent to a private field
where the day is stored, but unlike pure Python classes, C classes can
hide such private attributes from Python code.

To be more precise, C fields are always hidden from Python unless the
class explicitly makes them visible.
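As a small sketch of the difference, the check below compares the descriptor type of `day` with an ordinary `property` and tests whether `_day` is visible. The variable names are my own, and the results depend on whether the C implementation is the one in use.

```python
import datetime
import types

d = datetime.datetime(2022, 3, 28)

# Looking the attribute up on the class returns the descriptor object itself.
day_attr = datetime.datetime.day

# With the C implementation this is a getset descriptor; with a pure-Python
# fallback it would be an ordinary property instead.
is_c_descriptor = isinstance(day_attr, types.GetSetDescriptorType)
is_py_property = isinstance(day_attr, property)

# The private field is only visible when the class is written in Python.
has_private_day = hasattr(d, "_day")

print(is_c_descriptor, is_py_property, has_private_day)
```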

Thanks, @TeamSpen210! I skimmed right past that section of code. It’s good to know the _datetime module is the C implementation. I had been wondering how to tell when something is implemented in C vs. Python.

And thank you, @steven.daprano, for the deeper explanation. That makes so much sense.

Thank you both for your quick replies!

The usual convention is indeed to have a blah module with a _blah extension module, but it’s not required. A better way to check is to import the module and look at its repr(). The filename is shown as (built-in) for a few core modules like sys and math that are compiled directly into the interpreter; it ends in .so or .pyd for other extension modules in separate files, and in .py or .pyc for regular Python code.

On my Linux build, math is not compiled directly into the interpreter (but sys is).

>>> import math
>>> math
<module 'math' from '/usr/local/lib/python3.10/lib-dynload/math.cpython-310-x86_64-linux-gnu.so'>

Another way to check is to look at the module’s __file__ attribute, if
it exists. It will tell you where the module is. If the attribute
doesn’t exist, it probably means it is built-in.
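A rough sketch of that check follows. The helper name `module_origin` and its labels are my own invention, and the classification is only a heuristic based on `__file__` and `sys.builtin_module_names`.

```python
import datetime
import sys


def module_origin(name):
    """Roughly classify where an already-imported module's code comes from."""
    module = sys.modules[name]
    path = getattr(module, "__file__", None)
    if path is None:
        # No __file__: probably compiled into the interpreter itself.
        return "built-in" if name in sys.builtin_module_names else "unknown"
    if path.endswith((".so", ".pyd")):
        return "extension module"
    if path.endswith((".py", ".pyc")):
        return "Python source"
    return "unknown"


print(module_origin("sys"))
print(module_origin("datetime"))
```

Note that this only describes the file the module was loaded from, not where the objects inside it were defined.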

In the case of datetime, this doesn’t help, because the datetime
module is a .py file; only its content (at least, some of it) is
replaced by objects from a C library.

>>> import datetime
>>> datetime
<module 'datetime' from '/usr/local/lib/python3.10/datetime.py'>

In a case like this, I think that the only reliable way to tell that the
datetime.datetime class has been replaced is to read the source code.

Thanks for those hints! I’m sure they’ll be beneficial for me in the coming months.