Obj.__dir__() != dir(obj)

Monarch · December 26, 2024, 2:00pm

PEP 562: “The __dir__ function should accept no arguments, and return a list of strings that represents the names accessible on module. If present, this function overrides the standard dir() search on a module.”

Canonical Docs: “The __dir__ function should accept no arguments, and return an iterable of strings that represents the names accessible on module. If present, this function overrides the standard dir() search on a module.”

dir(): “If the object has a method named __dir__(), this method will be called and must return the list of attributes. This allows objects that implement a custom __getattr__() or __getattribute__() function to customize the way dir() reports their attributes. […] The resulting list is sorted alphabetically”

There are a few discrepancies here. The original PEP states that __dir__() must return a list, while the canonical docs changed it to any iterable. Both of them state that “this function overrides the standard dir() search on a module”, which gave me the impression that if I implement __dir__, builtins.dir will simply return the output of my implementation.

$ echo "def __dir__(): return 'new_attr', 'deprecated_attr'" > a.py
$ python -q
>>> import a
>>> a.__dir__()
('new_attr', 'deprecated_attr')
>>> dir(a)
['deprecated_attr', 'new_attr']
>>> a.__dir__() == dir(a)
False

Maybe I’m not understanding the docs all that well, but the fact that obj.__dir__() and dir(obj) can give me different results caught me off guard. The only hint of this behaviour is in the docs for builtins.dir, which state that the result will be a sorted list. Presumably it does something like this (pseudocode):

def dir(obj):
    return sorted(obj.__dir__())

This ends up both converting the original to a list and changing the order (as shown in the example, deprecated_attr is now at the front).
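For what it's worth, here is a slightly closer approximation as a minimal sketch. It assumes that dir() looks __dir__ up on the object's type rather than on the instance, then copies whatever comes back into a new sorted list. The names dir_sketch and demo are made up for illustration, and error handling is left out.

```python
import types


def dir_sketch(obj):
    # Hypothetical stand-in for the built-in: special-method lookup goes
    # through the type, and the result is copied into a new sorted list.
    return sorted(type(obj).__dir__(obj))


# Build a throwaway module with a PEP 562 style module-level __dir__.
demo = types.ModuleType("demo")
demo.__dir__ = lambda: ("new_attr", "deprecated_attr")

print(demo.__dir__())                  # the tuple, in the hook's own order
print(dir(demo))                       # a sorted list
print(dir_sketch(demo) == dir(demo))   # the sketch agrees with the built-in here
```

The hook's own return value is never handed back directly, which is why the type and the order can both differ.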

I’m not entirely sure whether this is just my failure to understand the docs, so I thought I’d bring it up. This isn’t really mission critical to my code, so I can live with it, but I’m curious to hear some opinions.

pepoluan · December 30, 2024, 10:57am

Whenever there’s a discrepancy between a PEP and the canonical docs, the canonical docs take precedence.

This is because a PEP is basically a “Request For Change” for Python, e.g., to implement something, and it becomes a historical note as soon as it’s Finalized. After a PEP is finalized, whatever it proposed may still change as the facility it introduced is ‘experienced’ by many people.

It’s likely that since PEP 562 was finalized (7 [seven!] years ago), people asked that dir() enforce some things (e.g., sorted output to make things easier to read, especially as dir() is more commonly used in a REPL).

MegaIng · December 30, 2024, 11:26am

I also want to point out that PEP 562 is not the original introduction of __dir__ - it was already available in normal classes. It’s likely that no one paid too much attention to the wording - “it behaves the same as a class’s __dir__” was the intended meaning, and that is what we got.

Monarch · December 30, 2024, 3:32pm

Yep, I’m aware. I included it for the sake of completeness and history.

I agree, but only partially. I think the current behaviour is perfectly fine when there is no __dir__ defined, but when someone does define a custom __dir__, it’s likely because they want to provide something better than the default; otherwise there would be no point in overriding it.

That’s true! I guess that means a class’s __dir__ has the same issues.

>>> class A:
...     def __dir__(self):
...         return "new_attr", "deprecated_attr"
...
>>> dir(A())
['deprecated_attr', 'new_attr']
>>> A().__dir__()
('new_attr', 'deprecated_attr')
>>> A().__dir__() == dir(A())
False

MegaIng · December 30, 2024, 3:43pm

There is no issue except you not reading documentation.

__dir__ allows you to override the search algorithm of dir. It is designed for cases where custom __getattr__ definitions make it impossible for the default algorithm to find the proper attribute names (or for some other reason, e.g. you want to hide some attributes that aren’t hidden by default).

It does not let you, and in no way suggests that you can, override the return type or the order of the attribute names returned.
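As a rough illustration of that use case, here is a minimal sketch of a class whose attributes live behind __getattr__, so the default search cannot see them. Config and its fields are hypothetical names invented for this example.

```python
class Config:
    """Hypothetical class whose attributes live in a dict behind __getattr__."""

    def __init__(self, **values):
        self._values = values

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self._values[name]
        except KeyError:
            raise AttributeError(name) from None

    def __dir__(self):
        # Report the dynamic names in addition to the default ones.
        return [*super().__dir__(), *self._values]


cfg = Config(port=8080, host="localhost")
print(cfg.port)                 # resolved through __getattr__
print("port" in dir(cfg))       # visible to dir() thanks to __dir__
print(dir(cfg) == sorted(dir(cfg)))  # dir() still sorts the result
```

Here __dir__ controls which names are found, while dir() stays in charge of the list type and the ordering.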

Monarch · December 30, 2024, 5:11pm

I have read the documentation. I do think docs should emphasize this behaviour a bit more than they currently do. I understand the current behaviour. I find the current behaviour “wrong”. My only issue here is that I think obj.__dir__() == dir(obj) should always hold, similar to how obj.__str__() == str(obj). Maybe I didn’t make it clear enough, apologies if so.

TIGirardi · December 30, 2024, 5:18pm

There are 3 potential issues in the OP:

1. obj.__dir__() != dir(obj)
2. PEP 562 requires that module.__dir__(), if present, return a list, but the data model docs let it return an iterable instead.
3. The built-in functions docs for dir(object) require that object.__dir__(), if present, return a list, but again the data model docs allow an iterable to be returned and also state that dir(object) will accept an iterable.

Issue 1 was never guaranteed, nor can it be: dir’s result will be sorted and __dir__'s is not required to be so.

Issue 2 could just be that it wasn’t implemented yet, or was implemented in spirit. Still, it is good to note the discrepancy.

Issue 3 is a problem: one would expect the data model docs to be authoritative on the special methods and the built-in functions docs to be authoritative on the built-ins, but the actual dir function is, of course, liberal in what it accepts, consistent with the data model. The built-in functions docs should probably change to reflect that.
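A quick way to see how liberal dir() is, as a minimal sketch: the three classes below are hypothetical and exist only to return different kinds of objects from __dir__.

```python
class FromGenerator:
    def __dir__(self):
        # A generator: iterable, but not a list.
        yield "b"
        yield "a"


class FromSet:
    def __dir__(self):
        # A set: iterable, with no defined order.
        return {"b", "a"}


class NotIterable:
    def __dir__(self):
        # Not iterable at all, so dir() cannot build a list from it.
        return 42


print(dir(FromGenerator()))  # ['a', 'b']
print(dir(FromSet()))        # ['a', 'b']

try:
    dir(NotIterable())
except TypeError as exc:
    print("TypeError:", exc)
```

Both iterables come back as the same sorted list, and only the non-iterable return value is rejected.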

MegaIng · December 30, 2024, 5:31pm

But this just isn’t a property you should expect from magic methods. len(obj) might not be the same as obj.__len__() ^[1], and a + b might not be a.__add__(b) (the same goes for almost all other operator overloads). There is often some extra processing that can change the result, or turn it into an error that wouldn’t appear if you called the method directly. This is even true for __str__: if the __str__ method misbehaves and returns something that isn’t a string, str(obj) raises an error, whereas obj.__str__() just returns that object.
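A minimal sketch of those three cases, using hypothetical classes written only for this illustration:

```python
class BadStr:
    def __str__(self):
        return 42  # deliberately not a str


class BadLen:
    def __len__(self):
        return -1  # deliberately an invalid length


class Left:
    def __add__(self, other):
        return NotImplemented  # decline, so Python tries the other operand


class Right:
    def __radd__(self, other):
        return "handled by Right.__radd__"


# Calling the methods directly just returns whatever they return.
print(BadStr().__str__())        # 42
print(BadLen().__len__())        # -1
print(Left().__add__(Right()))   # NotImplemented

# Going through the operator adds the reflected-method fallback.
print(Left() + Right())          # handled by Right.__radd__

# Going through the built-ins adds validation of the result.
for call in (lambda: str(BadStr()), lambda: len(BadLen())):
    try:
        call()
    except (TypeError, ValueError) as exc:
        print(type(exc).__name__ + ":", exc)
```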

I genuinely don’t believe that Issues 2 and 3 are relevant or even close to “being a problem”. If you want, go propose a doc PR; I doubt anyone will fight it (unless they make the argument that iterable is too confusing a term), but it’s also not something to get hung up on. I bet if you search for it, you will find many places where the Python docs have slight inaccuracies like this, either because the original author didn’t think about it too hard, because small changes in implementation have been made since then, or because the more correct option was considered too confusing for that place in the docs. (Also, note that “list” is not written in code formatting as the type name list, meaning you could reasonably defend the text as not referring to the Python datatype.)

[1] Or type(obj).__len__(obj), the more correct equivalence. I am going to continue to use the shorter form, but be aware that the longer form is often the only correct one.

TIGirardi · December 30, 2024, 5:35pm

Good point, “list” both in the docs and PEP should be understood as sortable iterable. There is no issue here.
