Override dunder of built-in

I know that Python normally prohibits that, but I remember that there was (and maybe still is) some obscure way to do it, something like doing the assignment inside the definition of __lt__. Unfortunately I can't find it anymore.

It was something similar to this if I recall correctly. But this doesn’t work and I don’t remember the “correct” way anymore.

    class Evil:
        def __lt__(self, other):
            int.__add__ = lambda self, other: 42
            return True
    
    Evil() < Evil()

    print(1 + 2)  # hoped-for result: 42

You can do this only with ctypes. Anything else I would consider a bug.

Yes, I am looking for a bug that allows (or allowed) this in an at least somewhat recent Python version (3.9 or newer).

I am 99% sure that there was one. I think I saw it either on Discuss or on GitHub, but I can't find it anymore.

You can technically replace the int definition in globals()['__builtins__'] at runtime, but changing one of the builtins like that will cause tons of issues, since it's not just your code that uses them.

I’ll start with one piece of complete certainty: Any weird hack like that cannot affect the addition of two constants. That is done at compile time, and once your evil class has done its work, the print call is identical to just print(3).
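If you want to check the compile-time folding yourself, here is a minimal sketch. It assumes CPython (folding is an implementation detail, not a language guarantee), and binary_ops is just a helper name I made up for the illustration.

```python
import dis


def binary_ops(source):
    """Return the names of BINARY_* instructions left in the compiled expression."""
    code = compile(source, "<demo>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)
            if ins.opname.startswith("BINARY_")]


# Two constants: the compiler folds the addition, so no add instruction remains.
print(binary_ops("1 + 2"))

# Two names: nothing can be folded, so one add instruction remains
# (its exact name depends on the Python version).
print(binary_ops("a + b"))
```

So by the time print runs, there is no addition left for any patched __add__ to intercept.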

As others have mentioned, you can redefine the name int, either for this module or elsewhere. So I suppose what you might do is:

class StudiedEvilButGotAnF:
    def __lt__(self, other):
        global int
        class int(int):
            def __add__(self, other): return 42
        return True

StudiedEvilButGotAnF() < 1

print(int("1") + 2)

Doesn’t really prove much, and it isn’t a bug, nor is it obscure. But I have no idea what quirk you might have been thinking of here. If you’re certain it was a bug, and that it existed in Python 3.9, I’d recommend browsing the changelogs for the 3.9 branch, seeing if anything catches your eye.
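To spell out why rebinding the name doesn't prove much, here is a small sketch. FakeInt and shadow_demo are hypothetical names used only for this illustration, and the rebinding is kept local to a function so it can't leak into other code.

```python
import builtins


def shadow_demo():
    # Hypothetical subclass, standing in for the `class int(int)` trick.
    class FakeInt(builtins.int):
        def __add__(self, other):
            return 42

    int = FakeInt               # rebinds the name, only inside this function
    via_name = int("1") + 2     # goes through FakeInt.__add__
    via_literal = 1 + 2         # literals never look up the name `int`
    one = 1
    via_variable = one + 2      # a real int object, so real int addition
    return via_name, via_literal, via_variable, type(one) is builtins.int


print(shadow_demo())  # (42, 3, 3, True)
```

Only code that explicitly calls the rebound name gets the new behaviour. The real int type is untouched.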

I found the behaviour that I was looking for, which allows changing built-ins:

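def my_dunder(self, other):
    # Placeholder body: the original post does not show my_dunder's definition.
    return 42


class X:
    def __eq__(self, other):
        # `other` is the dict behind the type's mappingproxy, so it is writable
        other["length"] = "24 hours"
        other["__add__"] = my_dunder


str.__dict__ == X()
int.__dict__ == X()

day = "Monday"
x = day.length
print(x)           # 24 hours
print((1).length)  # 24 hours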

But “unfortunately” it does not allow me to replace the dunders in a meaningful way.

print("A".__add__)
print((1).__add__)
print("A"+"B")
print(1+2)
print("1"+2)
print(1+"2")


<bound method my_dunder of 'A'>
<bound method my_dunder of 1>
AB
3
Traceback (most recent call last):
  File "/mnt/c/Users/janer/Documents/Projects/TicTacToe/test.py", line 24, in <module>
    print("1"+2)
          ~~~^~
TypeError: can only concatenate str (not "int") to str

Using the + operator on built-ins, even if you override the associated dunder, will still go through the operator bytecode and end up in the original built-in implementation:

import dis
class add_cast:
    def __eq__(self, other: dict[str, object]):
        other['__add__'] = lambda a,b: a + type(a)(b)
        return True

assert str.__dict__ == add_cast()
assert int.__dict__ == add_cast()

funcs = [
    "str(1).__add__(int('1'))",
    "int('1').__add__(str(1))",
    "int('1') + str(1)",
    "str(1) + int('1')",
]

for func in funcs:
    print('-'*80)
    try:
        ret = eval(func)
    except TypeError as e:
        ret = e
    print(func, ':', ret)
    dis.dis(func)
    print()
Output
--------------------------------------------------------------------------------
str(1).__add__(int('1')) : 11
  0           RESUME                   0

  1           LOAD_NAME                0 (str)
              PUSH_NULL
              LOAD_CONST               0 (1)
              CALL                     1
              LOAD_ATTR                3 (__add__ + NULL|self)
              LOAD_NAME                2 (int)
              PUSH_NULL
              LOAD_CONST               1 ('1')
              CALL                     1
              CALL                     1
              RETURN_VALUE

--------------------------------------------------------------------------------
int('1').__add__(str(1)) : 2
  0           RESUME                   0

  1           LOAD_NAME                0 (int)
              PUSH_NULL
              LOAD_CONST               0 ('1')
              CALL                     1
              LOAD_ATTR                3 (__add__ + NULL|self)
              LOAD_NAME                2 (str)
              PUSH_NULL
              LOAD_CONST               1 (1)
              CALL                     1
              CALL                     1
              RETURN_VALUE

--------------------------------------------------------------------------------
int('1') + str(1) : unsupported operand type(s) for +: 'int' and 'str'
  0           RESUME                   0

  1           LOAD_NAME                0 (int)
              PUSH_NULL
              LOAD_CONST               0 ('1')
              CALL                     1
              LOAD_NAME                1 (str)
              PUSH_NULL
              LOAD_CONST               1 (1)
              CALL                     1
              BINARY_OP                0 (+)
              RETURN_VALUE

--------------------------------------------------------------------------------
str(1) + int('1') : can only concatenate str (not "int") to str
  0           RESUME                   0

  1           LOAD_NAME                0 (str)
              PUSH_NULL
              LOAD_CONST               0 (1)
              CALL                     1
              LOAD_NAME                1 (int)
              PUSH_NULL
              LOAD_CONST               1 ('1')
              CALL                     1
              BINARY_OP                0 (+)
              RETURN_VALUE

Notice how calling __add__ directly uses the override, but using + compiles to BINARY_OP, which still runs the built-in implementation. This is, I believe, a CPython optimization, since you are not supposed to be able to override built-in methods.

Yeah. There are actually two things here.

The first is that if I do “1+2”, the bytecode already contains 3, as Chris said, and even in cases where that doesn't happen, we don't call the overridden dunder but still the built-in one.

What I had remembered wasn't how to override dunders of built-ins, only how to get around the fact that you can't add or replace attributes on them.

Ah. There’s a very VERY important subtlety here. When you do something like "a" + 1, Python does not actually call "a".__add__(1) but type("a").__add__("a", 1). Setting a __dict__ on an object will allow you to set attributes on it, but won’t allow you to redefine dunders. (There are also other subtleties in CPython relating to slots - not to be confused with __slots__ - and, just to make things more fun, there are some special cases where putting something on an instance DOES work, but they’re rare.)
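The type-level lookup is easy to see with an ordinary class, no hacks needed. A minimal sketch (Box is a made-up class for the illustration):

```python
class Box:
    def __add__(self, other):
        return "from the class"


b = Box()
# Instances of ordinary classes have a __dict__, so this assignment succeeds.
b.__add__ = lambda other: "from the instance"

explicit = b.__add__(1)                   # normal attribute lookup: instance wins
via_operator = b + 1                      # operator lookup: goes to type(b)
same_as_operator = type(b).__add__(b, 1)  # what the operator effectively does

print(explicit, via_operator, same_as_operator, sep=" | ")
# from the instance | from the class | from the class
```

The explicit method call and the operator disagree, which is the same split the thread observed with the patched str and int.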

But yes that is a fascinating bug.

Is it really a bug if literally everything you have to do to get there raises warnings and errors that what you’re trying to do is wrong?

Does it? What warnings and errors come up here?

Normal attempts to overload the builtin methods directly raise exceptions since they’re implemented as descriptors.

Attempting to overwrite the __dict__ attribute will also raise exceptions because the __dict__ attribute is a mappingproxy which prevents assignments.
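For reference, this is roughly what those two failures look like. It is a small sketch, and try_it is just a helper name invented for it:

```python
def try_it(action):
    """Run action() and return the exception type name, or None on success."""
    try:
        action()
    except Exception as exc:
        return type(exc).__name__
    return None


def assign_attribute():
    # Direct assignment on a built-in type
    int.__add__ = lambda self, other: 42


def assign_through_dict():
    # int.__dict__ is a read-only mappingproxy
    int.__dict__["__add__"] = lambda self, other: 42


print(type(int.__dict__).__name__)  # mappingproxy
print(try_it(assign_attribute))     # TypeError
print(try_it(assign_through_dict))  # TypeError
```

Both attempts fail before anything on int is changed.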

Those two alone are enough to prevent someone from doing this in any reasonable situation I think. The fact that you can assign an instance dict to the type at runtime and then call the method that you put in that dict is interesting, but it would be weirder to me if that didn’t work since that’s normal behavior.

For special methods, Python calls the tp_* slots to perform the operation. For static types (e.g. int, str, and other classes defined in C), the tp_* slots are filled in by the type's own implementation, and Python generates wrappers for them, stored in __dict__ under the corresponding names (for int, + calls PyLong_Type.tp_as_number->nb_add). For heap types (e.g. classes defined with the class keyword), Python does it the other way around: the slots are derived from the dunders in the class. So replacing __add__ in the dict of int cannot influence PyLong_Type.tp_as_number->nb_add.
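The difference between the two directions can be observed from Python. This is a CPython-specific sketch (the type names printed are implementation details), and Heap is a made-up class for the illustration:

```python
class Heap:
    def __add__(self, other):
        return "original"


h = Heap()
before = h + 1
# Assigning a dunder on a heap type makes CPython update the matching slot.
Heap.__add__ = lambda self, other: "replaced"
after = h + 1

# For int, the dict entry is only a wrapper around the C slot.
print(type(int.__add__).__name__)               # wrapper_descriptor
# For a heap type, the dict entry is the Python function itself.
print(type(Heap.__dict__["__add__"]).__name__)  # function
print(before, after)                            # original replaced
```

On a heap type the normal assignment path keeps the slot in sync with the dict. Writing straight into the dict of int skips that path, so the slot that + uses never changes.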
