Remove `dir` support for attribute names that are not strings

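class MyClass:
    # Every ordering comparison returns False, so dir() can sort the
    # instance alongside ordinary string names without raising TypeError.
    def __le__(self, other):
        return False
    __lt__ = __le__
    __gt__ = __le__
    __ge__ = __le__

a = MyClass()
a.__dict__[a] = a  # use the instance itself as a non-string key
print(dir(a))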

In this example, the bogus attribute a shows up in the list. However, it cannot be read with getattr, which may cause problems.

I think dir should check whether each attribute name is an instance of str and only show attributes whose names are strings.

For me, <__main__.MyClass object at 0x000001C9ADD39400> is included in the list. How, I am not sure, but this looks as right as possible. Since the entry is there, I think something should be printed. There could be code that depends on the current behavior.

I think that’s what @Locked-chess-official was suggesting is a bug, and I think I agree.

While it’s true that that object ends up in __dict__, I don’t think it should be part of dir’s output, since dir is supposed to tell us what attributes this object has. We can’t access that thing as an attribute, so it feels like a bug to have it included in dir’s output.

>>> getattr(a, a)
Traceback (most recent call last):
  File "<python-input-0>", line 1, in <module>
    getattr(a, a)
    ~~~~~~~^^^^^^
TypeError: attribute name must be string, not 'MyClass'

So I feel like it would be more correct for dir only to include strings in its output, even if there are non-string keys in __dict__.

What is your reason for putting a non-string key in a __dict__? There are no guarantees that it will work in Python. It only works in the current implementation of CPython because it would be costly to prevent this.

So, I think this is a case of “garbage in – garbage out”. You used a feature that was not promised to work. Don’t expect it to give you the result you want.

Not a big deal for the reasons you mention, but I wanted to include my additional $.02 :)

I agree that this is a strange thing to do. However, given that it’s possible to end up with non-string keys in __dict__, and given the documentation for dir, I feel like dir ought to filter out non-string keys from __dict__.

dir(...)
    dir([object]) -> list of strings
    
    If called without an argument, return the names in the current scope.
    Else, return an alphabetized list of names comprising (some of) the attributes
    of the given object, and of attributes reachable from it.
    If the object supplies a method named __dir__, it will be used; otherwise
    the default dir() logic is used and returns:
      for a module object: the module's attributes.
      for a class object:  its attributes, and recursively the attributes
        of its bases.
      for any other object: its attributes, its class's attributes, and
        recursively the attributes of its class's base classes.

I’ll readily admit that I am not super familiar with CPython’s internals, but it seems like filtering non-string keys from dir would not be terribly costly (compared to preventing the addition of non-string keys to __dict__ in the first place). Would the basic implementation not be something like the following?
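A minimal pure-Python sketch of that idea, for illustration only. The real change would live in CPython's C code, and `filtered_dir` and `Plain` are hypothetical names, not existing functions or classes:

```python
def filtered_dir(obj):
    """Hypothetical dir() variant that drops non-string names."""
    # Look up __dir__ on the type, roughly as dir() does, to get the raw,
    # unsorted names.
    raw_names = type(obj).__dir__(obj)
    # Keep only real attribute names (strings), then sort as dir() does.
    return sorted(name for name in raw_names if isinstance(name, str))


class Plain:
    pass


p = Plain()
p.x = 1
vars(p)[42] = "not reachable as an attribute"  # non-string key

print("x" in filtered_dir(p))  # True
print(42 in filtered_dir(p))   # False
```

Filtering before sorting also sidesteps the comparison between strings and arbitrary keys that the sort would otherwise have to make.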

There are performance implications here, of course (making a new list) so I’m almost certain there’s a better way, but in principle this feels like a relatively minor change to me.

Scenario 1. If someone puts non-string keys into __dict__, they already know enough about their situation to filter the result themselves if they only want string keys.

Scenario 2. One needs to get both string and non-string attribute keys.

Currently both scenario 1 and scenario 2 are possible.
Restricting this to string-keys-only removes the possibility of scenario 2.
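Both scenarios can be sketched against the current behaviour. This reuses the MyClass idea from the first post; the `label` attribute and the variable names are made up for illustration:

```python
class MyClass:
    # All ordering comparisons return False so dir() can sort the result.
    def __lt__(self, other):
        return False
    __le__ = __gt__ = __ge__ = __lt__


a = MyClass()
a.label = "ordinary attribute"
a.__dict__[a] = a  # non-string key

names = dir(a)

# Scenario 1: the caller filters down to string names.
string_names = [n for n in names if isinstance(n, str)]

# Scenario 2: the caller wants the non-string keys as well.
other_keys = [n for n in names if not isinstance(n, str)]

print("label" in string_names)  # True
print(other_keys[0] is a)       # True
```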


All in all, I don’t see what is wrong with this:

  1. __dict__ holds attributes
  2. One can add a non-string key to __dict__, which is theoretically also an attribute
  3. dir returns a list of attributes as stated clearly in the docs

This is because the non-string attribute is in fact illegal and not reachable. The documentation of dir says clearly that what is returned is a LIST OF STRINGS.

Why isn’t your request to

  • allow getattr to accept non-string instances if they are present in __dict__
  • disallow adding non-string keys to __dict__?

Why is dir the function you are focusing on? Why are you not fixing whatever code is adding non-string keys to __dict__?

As said above, this is garbage-in garbage-out. If you overwrite __dir__ on an object you can also get arbitrary output types.
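A small sketch of that point, using a made-up class `Weird`. dir() only converts the result of `__dir__` to a list and sorts it, without checking element types:

```python
class Weird:
    def __dir__(self):
        # Nothing forces these to be strings.
        return [3, 1, 2]


result = dir(Weird())
print(result)  # [1, 2, 3]
```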

I really can’t see this as any more meaningful than running:

import ctypes
__builtins__.print = lambda *args, **kwargs: ctypes.c_char_p(-1).value

then complaining that the docs for print() are lying about print() printing things.

Where does it say that? I see “list of names”, and one might assume names have to be strings, but if you put something in there named <repr of an object> that’s what you’ll get.

I don’t know about the formal definition of this, but if one can add non-string-key-attributes to __dict__, then one can also get them from there. These will not be accessible via standard machinery, but if they are there, then they are there and it is undeniable.
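To illustrate with a made-up class `Box` and an integer key: the value can be read back through the dict, while getattr refuses the key outright.

```python
class Box:
    pass


b = Box()
key = 42
vars(b)[key] = "stored"  # vars(b) is b.__dict__

# Reachable through the dict itself...
value_from_dict = vars(b)[key]
print(value_from_dict)  # stored

# ...but not through the normal attribute machinery.
try:
    getattr(b, key)
except TypeError as exc:
    getattr_error = exc
    print("getattr failed:", exc)
```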

Missed the return type in docs.

dir(...)
    dir([object]) -> list of strings

For rigid correctness, this in theory could be changed to list of attributes.

So questions in sequence are:

  1. Should the behaviour be changed?
  2. If not, then should return type in docs be amended?

I am -1 on (1).
And neutral on (2).

On docs.python.org it says “list of names”, but help(dir) says this:

>>> help(dir)
Help on built-in function dir in module builtins:

dir(...)
    dir([object]) -> list of strings

    If called without an argument, return the names in the current scope.
    Else, return an alphabetized list of names comprising (some of) the attributes
    of the given object, and of attributes reachable from it.
    If the object supplies a method named __dir__, it will be used; otherwise
    the default dir() logic is used and returns:
      for a module object: the module's attributes.
      for a class object:  its attributes, and recursively the attributes
        of its bases.
      for any other object: its attributes, its class's attributes, and
        recursively the attributes of its class's base classes.

So which is right?

list of strings is correct. As others have stated, you’re doing something unsupported with __dict__, so it breaks other assumptions. Don’t do things with datamodel methods and attributes that contradict their purpose; all dunders are reserved by Python for its use, and using them in ways that Python does not support puts the outcome squarely on you.

No reason to make everyone pay a performance penalty here, just don’t do what you’re doing.

So fix the docs to say “list of names” (matching what the descriptive text says). Remember that the docstring is not formal type information here.