I propose specifying a function parameter as being pass-by-reference, using the notation “&name” instead of “name”.
The actual argument can be any valid assignment target: a variable, an attribute, a subscript, or even a “&name” in another function.
Any references to “name” in the function (get, assign, delete) will have the same effect as the same references to the actual argument in the caller.
Attributes and subscripts are more involved than simple variables, so I consider them an optional part of my proposal. I would be pleased just to be able to modify a variable by passing it by reference to a function.
Here’s an example where I would find this useful, in an interactive session:
>>> from click import edit
>>> def update(s: str, maybe: bool = True) -> str:
...     return edit(s) if maybe else s
...
>>> s = "hello"
>>> s = update(s)
>>> s = update(s, maybe=False) # returns and assigns s unchanged
There is extra coding work here. The update() function has to be sure to return a value in every case, even when s isn’t changed, and the caller has to assign the result of update(s) back to s. Both are places where a coding error would be easy to make.
With my proposal, it would look like this:
>>> from click import edit
>>> def update(&s: str, maybe: bool = True) -> None:
...     if maybe: s = edit(s)
...
>>> s = "hello"
>>> update(s)
>>> update(s, maybe=False) # No assignment to s
Here the update() function doesn’t have to do anything if it decides not to modify s. And the call to update() doesn’t have to explicitly assign to s.
Here’s another example which I and many others would find useful.
from typing import Any

def safe_delete(&obj: Any) -> None:
    try:
        del obj
    except:
        pass

x = 1
del x
del x # NameError

x = 1
safe_delete(x)
safe_delete(x) # OK
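For comparison, the closest thing in today's Python has to be told where the name lives. The sketch below is only an illustration and not part of the proposal: safe_delete_name and its parameters are hypothetical names, and a plain dict stands in for the caller's namespace (at module level, globals() could be passed instead).

```python
from typing import Any, MutableMapping


def safe_delete_name(namespace: MutableMapping[str, Any], name: str) -> None:
    """Remove `name` from `namespace`, ignoring the case where it is already gone."""
    try:
        del namespace[name]
    except KeyError:
        pass


# A dict standing in for the caller's variables (hypothetical stand-in).
scope = {"x": 1}
safe_delete_name(scope, "x")  # removes x
safe_delete_name(scope, "x")  # x is already gone; no error is raised
print("x" in scope)
```

The caller has to pass both the namespace and the name as a string, which is the boilerplate that safe_delete(x) with a by-reference parameter would remove.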
Examples using attribute or subscript targets:
import dataclasses

@dataclasses.dataclass
class A:
    s: str = 'hello'

a = A('foo')
update(a.s)
print(a.s) # new string provided by update()

d = dict(s='foo')
update(d['s'])
print(d['s']) # new string provided by update()
Implementation
In update.__code__:
- s is stored in the frame as a (Cell *) rather than a (PyObject *), as though it were a free variable or a cell variable. No MAKE_CELL bytecode is emitted for it.
- References to s would use LOAD_DEREF, STORE_DEREF, and DELETE_DEREF bytecodes. s does not appear in update.__closure__.
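To see the existing machinery these bullets build on, the sketch below disassembles an ordinary closure in current CPython. make_counter and bump are hypothetical names chosen for illustration, and the exact set of opcodes printed may differ between CPython versions; the point is only that a variable living in a cell is read and written through the *_DEREF instructions.

```python
import dis


def make_counter():
    count = 0  # captured by bump(), so the compiler stores it in a cell

    def bump():
        nonlocal count       # free variable: every access goes through the cell
        count = count + 1    # a read followed by a write of the cell contents
        return count

    return bump


bump = make_counter()

# Collect the opcode names used by the inner function.
opnames = {ins.opname for ins in dis.get_instructions(bump)}
print(sorted(name for name in opnames if name.endswith("_DEREF")))

# The cell itself is reachable from the closure tuple.
print(bump.__closure__[0].cell_contents)
```

Unlike this closure, the proposed by-reference parameter would not appear in __closure__; the cell would arrive as an argument instead.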
In the caller:
- The def update statement is implemented in the usual way. The parameter s is flagged in the symbol table as CELL | DEF_PARAM | DEF_BOUND, with a new flag DEF_BYREF added. This will prevent emitting a MAKE_CELL bytecode and tell the caller to pass s as a Cell object.
- The variable s is flagged as CELL, just as though it were captured as a free variable in some enclosed scope.
- The call to update(s) uses LOAD_CLOSURE to put the Cell for s on the stack as an argument. New bytecodes will be required to put a Cell subtype on the stack for an attribute or subscript target.
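The calling convention described above can be modelled by hand in current Python with types.CellType (available since Python 3.8). This is a rough sketch under assumptions, not the proposed implementation: the caller creates the cell explicitly, str.upper() stands in for click's edit(), and the parameter name s_cell is hypothetical.

```python
import types


def update(s_cell: types.CellType, maybe: bool = True) -> None:
    # Models `def update(&s, maybe=True)`: each use of `s` in the body
    # becomes a read or write of the cell the caller passed in.
    if maybe:
        s_cell.cell_contents = s_cell.cell_contents.upper()


# Models the caller: `s = "hello"` lives in a cell, and the call passes the cell.
cell = types.CellType("hello")
update(cell)
update(cell, maybe=False)  # leaves the cell untouched
print(cell.cell_contents)

# Models `del s` through the reference: afterwards the caller's s is unbound.
del cell.cell_contents
try:
    cell.cell_contents
except ValueError:
    print("s is unbound")
```

Under the proposal the compiler would do this wrapping and unwrapping itself, so neither the caller nor the function body would mention the cell.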
Attributes and Subscripts
These entail defining new types in CPython similar to the cell class. They should behave as cell objects, by subtyping the PyCell_Type type. The ob_ref member is reused to point to the object whose attribute or subscript is to be accessed.
PyCell_Check() will check first for PyCell_Type, then check if the type’s base type is PyCell_Type. This is as fast as currently implemented when the object is a PyCell.
PyCell_Get() and PyCell_Set() will need to do different things for these new types. They will check for PyCell_Type exactly, so they are as fast as currently implemented when the object is a PyCell. When this check fails, they will execute appropriate code for the actual subtype.
The get/set functions for cell_contents will be different for each type, found via the type object.
The PyAttrCell_Type type (or whatever you want to call it) would hold a reference to a name (a string). It will get/set/delete the named attribute of the ob_ref, with the usual exceptions.
The PySubscrCell_Type type will hold a reference to a subscript. It will get/set/delete ob_ref[subscript], with the usual exceptions.
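A pure-Python model may make the intended behaviour of these types easier to follow. Everything here is a hypothetical stand-in: Cell, AttrCell, SubscrCell, cell_get and cell_set are illustrative names mirroring PyCell_Type, the two proposed subtypes, PyCell_Get() and PyCell_Set(), and the real work would be done in C.

```python
from types import SimpleNamespace
from typing import Any


class Cell:
    """Stands in for PyCell_Type: ob_ref is the stored value itself."""

    def __init__(self, ob_ref: Any) -> None:
        self.ob_ref = ob_ref


class AttrCell(Cell):
    """Stands in for PyAttrCell_Type: ob_ref is the object owning the attribute."""

    def __init__(self, ob_ref: Any, name: str) -> None:
        super().__init__(ob_ref)
        self.name = name

    def get(self) -> Any:
        return getattr(self.ob_ref, self.name)

    def set(self, value: Any) -> None:
        setattr(self.ob_ref, self.name, value)

    def delete(self) -> None:
        delattr(self.ob_ref, self.name)  # AttributeError if missing


class SubscrCell(Cell):
    """Stands in for PySubscrCell_Type: ob_ref is the container being indexed."""

    def __init__(self, ob_ref: Any, subscript: Any) -> None:
        super().__init__(ob_ref)
        self.subscript = subscript

    def get(self) -> Any:
        return self.ob_ref[self.subscript]

    def set(self, value: Any) -> None:
        self.ob_ref[self.subscript] = value

    def delete(self) -> None:
        del self.ob_ref[self.subscript]  # KeyError/IndexError if missing


def cell_get(cell: Cell) -> Any:
    # Exact-type check first, so a plain cell takes the fast path.
    if type(cell) is Cell:
        return cell.ob_ref
    return cell.get()  # subtype-specific code


def cell_set(cell: Cell, value: Any) -> None:
    if type(cell) is Cell:
        cell.ob_ref = value
    else:
        cell.set(value)


a = SimpleNamespace(s="foo")
d = {"s": "foo"}
# The same update logic works on a variable, an attribute, and a subscript.
for target in (Cell("foo"), AttrCell(a, "s"), SubscrCell(d, "s")):
    cell_set(target, cell_get(target).upper())
print(a.s, d["s"])
```

The function receiving the reference only ever calls the generic get/set entry points, which is why a single update() could serve all three kinds of target.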