PEP for __assignment_to_self__, __meta__ dunders, meta() magic-method, pureread(), purewrite(), purefunc() builtins
Impetus and Introduction
I tried writing a ProxyType class, along with an implementation using it whose goal was to keep a data structure fully transparently in sync between an in-memory representation and an on-disk .yaml file. I ran into a number of stumbling blocks which I believe point to ways to improve the language. I’m first going to establish the technical basis for these. Afterwards, I’ll attempt to show their merit by covering the value they could bring. Lastly, I will propose concepts for how they could be implemented.
Technical Basis
The __assignment_to_self__ Dunder and the Self-Assignment Problem
One of the first issues I ran into is the root-assignment problem, which quickly scales to the self-assignment problem. When making a ProxyType, it is impossible to assign to the proxied value through the name bound to the ProxyType itself. Instead, the name is rebound and the ProxyType is replaced. This is expected and normal behavior for plain types. However, a true ProxyType needs a way for the assignment to be delegated to the ProxyType itself so that it can proxy the assignment.
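To make the problem concrete, here is a minimal sketch. The Proxy class is a hypothetical stand-in for the ProxyType, not the actual implementation; it only shows that plain assignment rebinds the name and never consults the object.

```python
class Proxy:
    """Hypothetical stand-in for ProxyType: wraps a single value."""

    def __init__(self, value):
        self.value = value


v = Proxy("initial")
original = v              # keep a second reference to the proxy

v = "a"                   # rebinds the name v; Proxy is never consulted

print(type(v).__name__)   # str
print(original.value)     # initial
```

The proxied value is untouched, and v no longer refers to the proxy at all.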
The __meta__ dunder, meta() builtin, and the Frequent Need of Higher-Level Programmatic Communication
My current work with Django and my thinking on the ProxyType have made me realize that there is frequently a critical, but unfortunately abstract, use-case for types to describe themselves or augment the way an object behaves. For Django’s need, I’ll defer to the Model type’s use of _meta. For a ProxyType, there needs to be a standard and regular way to augment at runtime how to proxy. I am reserving the justification for the next section. This section only establishes that there is a strong pattern which ought to be more formalized.
Enabling optimizations by detecting function purity with pureread(), purewrite(), and purefunc() builtins
There has long been a desire for a way to detect degrees of function purity. Purity is hard to introduce into the typing system because it is a characteristic or meta-attribute which doesn’t map cleanly onto types. Rather, we can characterize functions by establishing how their execution relates to external state. The basic way to do this is to establish whether the function does or does not read from or write to external state. pureread() reports True iff the function only reads data passed in through its parameters. purewrite() reports True iff the function cannot alter state external to itself (relevant to parameters like lists). purefunc(f) operates as return pureread(f) and purewrite(f) to reduce boilerplate.
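As a rough illustration of the intended semantics, the sketch below approximates the three builtins with today's inspect and dis modules. This is an assumption-laden toy, not the proposed implementation: it ignores builtins, does not follow nested calls, misses mutation through method calls such as list.append(), and flags writes to containers the function created itself.

```python
import dis
import inspect

# Opcodes that store into or delete from globals, enclosing scopes,
# attributes, or subscripts. Names absent in a given CPython are harmless.
_WRITE_OPS = {
    "STORE_GLOBAL", "DELETE_GLOBAL", "STORE_DEREF", "DELETE_DEREF",
    "STORE_ATTR", "DELETE_ATTR", "STORE_SUBSCR", "DELETE_SUBSCR",
    "STORE_SLICE",
}


def pureread(func):
    """Approximation: True if func names no globals or enclosing variables."""
    found = inspect.getclosurevars(func)
    return not found.globals and not found.nonlocals


def purewrite(func):
    """Approximation: True if func's bytecode has no store/delete opcode above."""
    ops = {ins.opname for ins in dis.get_instructions(func)}
    return not (ops & _WRITE_OPS)


def purefunc(func):
    return pureread(func) and purewrite(func)


TOTAL = 10


def add(a, b):
    return a + b


def read_total(a):
    return a + TOTAL        # reads state outside its parameters


def set_first(items, value):
    items[0] = value        # writes through a parameter


for f in (add, read_total, set_first):
    print(f.__name__, pureread(f), purewrite(f), purefunc(f))
```

Under these approximations, add passes all three checks, read_total fails only pureread(), and set_first fails only purewrite().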
Value and Merits
This section attempts to establish sufficient value for each of these changes through high-level use-case examples. It is not meant to be comprehensive and is surely going to be the most critical section for debate. Rather than be thorough to an extreme degree, I intend to expand on this through follow-up discussion. I believe doing so is more efficient and robust.
Use-case example for __assignment_to_self__
A more extreme use-case for ProxyType is a state object which is file-backed and synchronized across many different processes on a compute grid utilizing Raft v2. A shared implementation of ProxyType which behaves exactly like a standard type (e.g. dict), can recover from crashes from file, and keeps in sync with a massive compute grid would ease development of these systems in a way that is currently syntactically impossible. This is ostensibly an argument for the ProxyType, but without being able to handle operations as simple as v = "a" through a dunder such as __assignment_to_self__, this category of functionality is impossible. I believe this to be one of the most contentious proposed changes because it is the most likely to be seen as introducing unexpected behavior.
Use-case example for __meta__ dunder and meta() builtin
The Django Model type is an easy example for handling metas in a standard way. The aforementioned grid compute example is similar: a ProxyType there would need to handle updates in network configuration. There are doubtless other major examples. The need is already established and similar usage already exists. I believe it is time to consolidate and standardize.
Use-case example for pureread(), purewrite(), and purefunc() builtins
functools.cache would immediately benefit in correctness from being able to test for function purity with purefunc(). The need that I’ve discovered for pureread() and purewrite() stems from the experience I gained with the ProxyType. The ProxyType needs to detect specifically what locking is required. If a proxied function is a pure function, it does not need locking. If a proxied function reads external state, it needs a read lock, since there can safely be multiple readers. If a proxied function writes to external state, it must obtain a write lock, which blocks readers. I believe these will be the strongest changes to propose in terms of acceptability.
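The functools.cache point can be shown with current Python. In this small sketch, RATE and scaled are illustrative names; the cached function reads a global, so the cache keeps returning a result computed from a stale value.

```python
import functools

RATE = 2


@functools.cache
def scaled(x):
    # Reads RATE from outside its parameters, so the result
    # depends on more than x.
    return x * RATE


first = scaled(10)     # computed with RATE == 2
RATE = 3
second = scaled(10)    # served from the cache; RATE is not re-read

print(first, second)   # 20 20
```

A purity check such as the proposed purefunc() would give functools.cache a way to refuse, or at least warn about, a function like scaled.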
Conceptual Implementation
This section proposes a way to go about implementing these proposals conceptually. These proposals are not sacred. They are merely a way. They DO NOT relate to merits.
An implementation for __assignment_to_self__
The = operator would have to change from an intrinsic to a magic-method. This would be the same as += redirecting to __iadd__(). The implementation would likely be somewhat invasive to primitive types, but is simple.
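For comparison, this is how the existing += hook behaves today. Tally is a hypothetical example class; the sketch only shows that augmented assignment dispatches to a dunder while plain assignment currently has no equivalent hook.

```python
class Tally:
    """Hypothetical example type that handles += in place."""

    def __init__(self, n=0):
        self.n = n

    def __iadd__(self, other):
        # Called for `t += other`; the returned object is bound back to the name.
        self.n += other
        return self


t = Tally(1)
alias = t

t += 4                        # dispatches to Tally.__iadd__
print(t is alias, t.n)        # True 5

t = 4                         # no hook exists; the name is simply rebound
print(t is alias, alias.n)    # False 5
```

The proposed __assignment_to_self__ would give the last assignment the same kind of dispatch that += already has.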
An implementation for __meta__ dunder and meta() builtin
In order to allow for operating on objects and types with respect to __mro__, an implementation utilizing .meta(*args, **kwargs) on a variable is not viable. This is similar to how we think about and use getattr() and setattr(). To follow with those methods, I propose type.meta(subject, *args, **kwargs) for using alternate implementations (presumably implementations in __mro__) and meta(subject, *args, **kwargs). The former will call the specified type’s __meta__ while the latter uses the subject’s __meta__.
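A minimal sketch of the two call forms follows, written as ordinary functions since neither exists today. The names meta and type_meta, the return value, and the Base and Child classes are all assumptions for illustration; type_meta stands in for the proposed type.meta().

```python
def meta(subject, *args, **kwargs):
    """Hypothetical builtin: call __meta__ as found on the subject's own type."""
    return type(subject).__meta__(subject, *args, **kwargs)


def type_meta(cls, subject, *args, **kwargs):
    """Stand-in for the proposed type.meta(): use a specific class's __meta__."""
    return cls.__meta__(subject, *args, **kwargs)


class Base:
    def __meta__(self, *args, **kwargs):
        return {"described_by": "Base"}


class Child(Base):
    def __meta__(self, *args, **kwargs):
        return {"described_by": "Child"}


obj = Child()
print(meta(obj))               # {'described_by': 'Child'}
print(type_meta(Base, obj))    # {'described_by': 'Base'}
```

Looking __meta__ up on the type rather than the instance mirrors how other dunders are resolved, and the explicit-class form lets a caller reach an implementation further along __mro__.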
An implementation for pureread(), purewrite(), and purefunc()
While expected to be the least controversial, this might also be the largest actual change. It will require expanding both the inspect module and the code object section of the data model. During the parsing of functions, parameters need to be marked as copied. Then, statement by statement, each is marked for reads and writes and for whether the read or write is on a variable which is tagged as copied. Nested calls are recursively assessed on similar criteria. On finishing parsing, the entire function itself is tagged as being pure for reads and pure for writes, or not. To address functions which are only conditionally pure for reads or writes, tracing the possible values of parameters through recursively assessed functions will allow for more precise reporting on purity. This set of changes really works against types which are passed by reference. To address this aspect, because otherwise passing lists or dicts would unexpectedly alter purity, a function qualifier of either pure or impure could be introduced: either it explicitly allows the old behavior of altering state through parameters which are passed by reference, with the default changing to copy-on-write (COW), or vice versa.
My preference, based on experience of what is least surprising, is to change all pass by reference to be COW and introduce the impure keyword to restore the current exact parameter behavior. It is important to note that, even with full COW, objects whose methods fail purity tests will still cause the function they’re used in to fail purity tests if those methods are called.
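The difference between today's parameter behavior and the proposed COW default can be emulated by hand. In this sketch the function names are illustrative, and copy.copy() is an eager, shallow stand-in for copy-on-write; a real implementation would presumably defer the copy until the first write.

```python
import copy


def append_total(values):
    # Current behavior: mutates the list it was handed; the caller sees it.
    values.append(sum(values))
    return values


def append_total_cow(values):
    # Emulates the proposed default by copying before the first write.
    values = copy.copy(values)
    values.append(sum(values))
    return values


data = [1, 2]
append_total(data)
print(data)              # [1, 2, 3]

data = [1, 2]
result = append_total_cow(data)
print(data, result)      # [1, 2] [1, 2, 3]
```

Under the proposal, append_total would need the impure qualifier to keep its current effect on the caller's list.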