Add ability to use `id(obj)` to get `obj`

pointers are really useful in every language that supports them, and i think they would be useful in python too.

some use cases i know:

  • passing objects by reference. this is helpful when we want to replace the object itself completely. i know this might be hard in the CPython implementation, but that’s an implementation detail.
  • it adds flexibility and allows C programmers to use C-like features in python. python is general purpose, and pointers are a general thing that most popular languages support.
  • having references to objects explicitly, not implicitly. (instance.attribute is implicit reference)

the most similar thing to pointers is a 0-dimensional (single-element) list like this:

a = [object()] # this could be `object()`
b = [a] # this could be `b = id(a)`
a_again = b[0] # this could be `objects()[b]` or something similar

python already has globals() and locals(). why not have objects(), whose keys are id(obj)?

You might want to read “Facts and myths about Python names and values” by Ned Batchelder and then ask yourself why Python needs a way to simulate pass by reference.

You mentioned that in the previous thread, but I don’t see how a reversible id() makes this possible. Can you show a worked example?

Different languages are different. Python’s object, memory, and variable model are different than C or Rust. JavaScript, Java, Ruby, etc don’t have pointers because they are more similar to Python.

Here are some things you have to consider about Python’s mechanisms:

(1) Object ids are only unique among objects that are alive at the same time. The same id can refer to different objects at different times. Try running this:

def f():
    x = [1]
    return id(x)

def g():
    y = [2, 3, 4, 5]
    return id(y)

flistid = f()
glistid = g()
print(f"{flistid = }, {glistid = }")

I get: flistid = 4331774144, glistid = 4331774144. What should objects()[4331774144] return?

(2) In order to get an object back from an integer, you need an explicit mapping. You can’t simply examine memory at that address, because you don’t know whether it’s a live object or not. As shown above, you don’t know if it’s the same object as when you captured the id. How large will this mapping be?

(3) How would you use this int→object mapping? Wherever you are keeping the int, why not keep the object? The ints are objects in Python anyway, so you can keep the original object.

(4) If you do have a need for an int→object mapping, you can keep one yourself for the small set of objects you need in the mapping. WeakRefs might be useful.
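
A minimal sketch of point (4), assuming the objects you care about support weak references (plain `list` and `int` instances do not). `Node`, `registry`, `register`, and `lookup` are hypothetical names for illustration, not an existing API:

```python
import weakref


class Node:
    """Hypothetical example class; its instances support weak references."""

    def __init__(self, name):
        self.name = name


# Maps id(obj) -> obj without keeping obj alive.
# An entry is dropped automatically when its object is garbage collected.
registry = weakref.WeakValueDictionary()


def register(obj):
    """Store obj under its id and return that id."""
    key = id(obj)
    registry[key] = obj
    return key


def lookup(key):
    """Return the registered object; raises KeyError if it no longer exists."""
    return registry[key]
```

Because the entry disappears when the object dies, a recycled id cannot silently resolve to a different object. It only resolves to something that was explicitly registered and is still alive.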

I’m really pleased to see you excited enough about Python to suggest changes. I think as you dive more into the deeper logic of its behavior, you will discover much to marvel at.

BTW: when continuing this discussion, it’s really helpful to copy the portion of text you are replying to. Discourse has Quote and Copy Quote features when you select text. It makes it easier to follow the discussion. It takes a little more time, but it’s worth it.

(sorry, edited a few times to get the code markup right!)

Practicality aside, you can do this already through ctypes (though I don’t recommend it):

import ctypes

def dereference(address: int) -> object:
    # Let's hope the address is valid!
    return ctypes.cast(address, ctypes.py_object).value

x = "hello"
print(dereference(id(x)))  # hello

If I understand you correctly, this is not possible:

a = 10
# do something special with id(a)
assert a != 10, "object not replaced"

thanks. this is exactly what i meant, but with memory safety.

Thank you, this is helpful.

but with (1): the object goes out of scope. so we can’t use the id. and if python did have dereference feature, it would raise an error, unless the object at that memory location gets replaced. at least this is how i want the feature to be implemented.

Python does not have real pointers like C, and in CPython id(obj) only gives the memory address as a number. You cannot directly get the object back from an id in normal Python. Your idea of an objects() function that maps ids to objects is interesting, but it would need extra memory and careful handling to avoid keeping objects alive unintentionally. For most cases, Python’s references and lists already let you work with objects safely.

(Please include the text you are replying to).

I don’t understand how this would be useful. We take an id of an object, then some time later we use that id, but oops, there’s a different object there now, and we have no way to know. This doesn’t sound good, or a way to build reliable software.

You didn’t answer this question:

Wherever you are keeping the int, why not keep the object?

I don’t understand how a reversible id() gives you any more expressive power. You can keep the original object.

You haven’t yet demonstrated anything new that objects() would let you do.

What uses do you have in mind for this?

because we can’t change or swap objects, we can only mutate them.

i don’t know, i did mention use cases in the post. do you want me to write code for clarity?

data structures. that’s in my mind, at least for now.

If your goal is to have references to objects that can be swapped, why not just make a proxy object like this:

from dataclasses import dataclass

@dataclass
class Swappable:
    value: object

You can then “swap” the object in some_var by assigning to some_var.value. Of course, this only works cooperatively: the code that uses swappable objects needs to explicitly opt in to them being swapped. But I’d say that that’s a good thing. Most code assumes that objects aren’t arbitrarily being swapped around. With the many subtle and hard-to-debug problems that multithreaded code has, we can see that even just mutations that weren’t expected often lead to bugs. I’m not sure if any non-trivial piece of code I’ve written could actually handle arbitrary object replacements.
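
A short sketch of how that cooperative swap might look in use. `replace_with_upper`, `name`, and `alias` are hypothetical names for illustration:

```python
from dataclasses import dataclass


@dataclass
class Swappable:
    value: object


def replace_with_upper(cell: Swappable) -> None:
    # Strings are immutable, so this installs a brand-new object in the cell
    # rather than mutating the old one.
    cell.value = cell.value.upper()


name = Swappable("hello")
alias = name  # a second reference to the same cell
replace_with_upper(name)
print(alias.value)  # HELLO
```

Every holder of the cell sees the replacement, which is the effect usually wanted from pass by reference, and the opt-in is visible at each `.value` access.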

If you don’t know, then why suggest this? And yes, I think everyone here would like to see some code that would work if we had this.

To be perfectly honest, I want you to write code so that you can discover whether your idea is going to be useful or not. I don’t think it will be useful, and I don’t think it will even accomplish some of the things you think it will.

I don’t understand what you mean. Definitely specific examples with code will help clarify everyone’s understanding, including yours.

I’ve found that using a reference to the parent and the key in the parent works quite well:

>>> import jsonyx as json
>>> obj = [1, 2, 3, 4, 5, 6]
>>> root = [obj]
>>> node = root, 0  # pointer to obj
>>> for target, key in json.select_nodes(node, "$[@ > 3]"):
...     target[key] = None
...
>>> root[0]
[1, 2, 3, None, None, None]
>>> for target, key in json.select_nodes(node, "$"):
...     target[key] = None
...
>>> root[0] is None
True

Or if you want a pointer to the object itself:

>>> obj = [1, 2, 3, 4, 5, 6]
>>> target, key = locals(), "obj"  # pointer to obj
>>> target[key] = None
>>> obj is None
True

NOTE: Whether or not updates to this dictionary will affect name lookups in
the local scope and vice-versa is implementation dependent and not
covered by any backwards compatibility guarantees.
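
That note matters in practice. Inside a function, CPython does not propagate writes to the `locals()` dictionary back to the local variable, so the trick above works at module level (where `locals()` is `globals()`) but not in function scope. A small sketch, where `try_swap` is a hypothetical name and the result described is CPython's behaviour, which other implementations may not share:

```python
def try_swap():
    obj = [1, 2, 3]
    target, key = locals(), "obj"  # intended as a "pointer" to obj
    target[key] = None  # updates the dictionary, not the local variable
    return obj


result = try_swap()
print(result)
```

On CPython this is expected to print the original list, showing that the write never reached `obj`.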

Lists exist.


Do I want to mutate objects, or do I want to mutate what names point to? When you pass by reference in C, the compiler needs to put the object on the stack for it to even have an address. What’s wrong with Python requiring explicit memory placement?

Now that I think about it, there is also one more thing.

Aside from Assembly, no non-esoteric language allows this. There is a misconception among C devs that pointers are just numbers. While on some platforms that’s true (x86, ARM, or AVR), that’s not true in C. Even if the target is known, C does not guarantee that converting an integer to a pointer works: the result is implementation-defined.

Pointers are offsets from origin of their abstract addressing space. malloc creates new space, and free invalidates the space. Python already supports that:

x, b = 4, object()  # example size and value (assumed for illustration)
space, ptr = [None] * x, 0  # void **ptr = malloc(x * sizeof(void*));
space[ptr] = b  # *ptr = b;
ptr += 1  # ptr += 1;
del space  # free(ptr);

What Python doesn’t support is getting a pointer to any “lvalue”. locals() creates a copy for a reason. And globals() has to be subscripted via str, also for a reason.

They are also a source of errors and complexity. ISTM that not having pointer dereferencing is a virtue.

We don’t need to debate whether it’s a virtue or a vice. Dereferencing pointers just won’t work in Python, it’s incompatible with its managed memory.
