Consideration of pureread(), purewrite(), purefunc() builtins
This is one of the broken-out pieces from a previous discussion.
Impetus and Introduction
I tried writing a ProxyType class, along with an implementation using it, whose goal was to give a data structure both an on-disk representation (as .yaml) and an in-memory representation, fully transparently. I found a number of stumbling blocks which I believe can be used to explore ways to improve the language. I’m first going to establish the technical basis for these. Afterwards, I’ll attempt to sufficiently prove their merit by covering the value which they could bring. Lastly, I will propose concepts for how they could be implemented.
Technical Basis
Enabling optimizations by detecting function purity with pureread(), purewrite(), and purefunc() builtins
There has long been a want for a way to detect degrees of function purity. Purity is hard to introduce into the typing system because in many ways it is a characteristic or meta-attribute which doesn’t map cleanly onto types. Rather, by establishing how a function’s execution relates to external state, we can characterize functions. The basic way to do this is to establish whether the function does or does not read from, or write to, external state. pureread() reports True iff the function only reads data passed in through its parameters. purewrite() reports True iff the function cannot alter state external to it (relevant to parameters like lists). purefunc() operates as return pureread(f) and purewrite(f) to reduce boilerplate.
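To make the three categories concrete, here is a minimal sketch. None of these builtins exist today, so pureread(), purewrite(), and purefunc() below are stand-ins that read hand-written declarations (the declare decorator and the dunder attribute names are my own assumptions for illustration); the proposal is that the interpreter would work this out itself.

```python
# Hypothetical stand-ins: purity is declared by hand here, whereas the
# proposal would have the interpreter detect it.
def declare(*, reads_external, writes_external):
    def mark(func):
        func.__reads_external__ = reads_external
        func.__writes_external__ = writes_external
        return func
    return mark

def pureread(func):
    # True iff the function only reads data passed in through its parameters
    return not func.__reads_external__

def purewrite(func):
    # True iff the function cannot alter state external to it
    return not func.__writes_external__

def purefunc(func):
    return pureread(func) and purewrite(func)

counter = {"hits": 0}

@declare(reads_external=False, writes_external=False)
def add(a, b):
    return a + b                # touches nothing but its parameters

@declare(reads_external=True, writes_external=False)
def current_hits():
    return counter["hits"]      # reads module-level state

@declare(reads_external=False, writes_external=True)
def append_zero(items):
    items.append(0)             # alters the caller's list through a parameter

@declare(reads_external=True, writes_external=True)
def bump():
    counter["hits"] += 1        # reads and writes module-level state
```

Under these declarations only add() would satisfy purefunc(); current_hits() fails pureread(), append_zero() fails purewrite(), and bump() fails both.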
Value and Merits
This section attempts to establish sufficient value for each of these changes through high-level use-case examples. It is not meant to be comprehensive and is surely going to be the most critical section for debate. Rather than be thorough to an extreme degree, I intend to expand on this through follow-up discussion. I believe doing so is more efficient and robust.
Use-case example for pureread(), purewrite(), and purefunc() builtins
functools.cache would immediately benefit in correctness by being able to test for function purity with purefunc(). The need that I’ve discovered for pureread() and purewrite() stems from the experience I gained with the ProxyType. The ProxyType needs specific detection for locking needs. If a proxied function is a pure function, it does not need locking. If a proxied function needs to read external state, it needs a read lock, since there can safely be multiple readers. If a proxied function writes to external state, readers must be blocked and it must obtain a write lock. I believe these will be the strongest changes to propose in terms of acceptability.
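The two use cases above can be sketched as follows. This is only an illustration: LockMode, required_lock(), and cache_if_pure() are names I made up, and the purity results are passed in as plain values or as a predicate because the proposed builtins do not exist yet.

```python
import functools
from enum import Enum

class LockMode(Enum):
    NONE = "none"    # pure function: no locking needed
    READ = "read"    # reads external state: shared lock, many readers allowed
    WRITE = "write"  # writes external state: exclusive lock, readers blocked

def required_lock(is_pureread: bool, is_purewrite: bool) -> LockMode:
    """Pick the lock a proxy would need from the two purity results."""
    if not is_purewrite:
        return LockMode.WRITE
    if not is_pureread:
        return LockMode.READ
    return LockMode.NONE

def cache_if_pure(func, is_pure):
    """Apply functools.cache only when the predicate is_pure(func) holds.

    is_pure stands in for the proposed purefunc() builtin.
    """
    return functools.cache(func) if is_pure(func) else func
```

A function that both reads and writes external state falls into the WRITE case, since the exclusive lock already covers its reads.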
Conceptual Implementation
This section proposes a way to go about implementing these proposals conceptually. These proposals are not sacred. They are merely a way.
An implementation for pureread(), purewrite(), and purefunc()
While expected to be the least controversial, this might also be the largest actual change. It will require expanding both the inspect module and the code object section of the data model. During the parsing of a function, its parameters need to be marked as copied. Then, statement by statement, each read and write is marked, along with whether it is on a variable which is tagged as copied or not. Nested calls are recursively assessed on similar criteria. On finishing parsing, the entire function itself is tagged with whether it is pure for reads and whether it is pure for writes. To address functions which are only conditionally pure for reads or writes, tracing the possible parameter values through the recursively assessed functions will allow for more precise reporting on purity. This set of changes really works against types which are passed by reference, because passing lists or dicts would otherwise unexpectedly alter purity. To address this, a function qualifier of either pure or impure could be introduced: either the default changes to COW (copy-on-write) and impure explicitly restores the old behavior of being able to alter state through parameters passed by reference, or vice versa.
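The statement-by-statement marking described above can be roughly approximated today with the ast module. The sketch below is not the proposed implementation: classify() is a hypothetical helper, it works on source text rather than inside the compiler, and it does not follow nested calls, closures, or method calls that mutate their receiver (so items.append(0) slips through).

```python
import ast
import builtins
import textwrap

def classify(source: str) -> tuple[bool, bool]:
    """Return (pure_for_reads, pure_for_writes) for one function's source."""
    func = ast.parse(textwrap.dedent(source)).body[0]
    assert isinstance(func, ast.FunctionDef)

    # Parameters play the role of the variables "marked as copied".
    args = func.args
    params = {a.arg for a in args.posonlyargs + args.args + args.kwonlyargs}
    if args.vararg:
        params.add(args.vararg.arg)
    if args.kwarg:
        params.add(args.kwarg.arg)

    # Names declared global/nonlocal refer to state outside the function.
    declared_external = set()
    for node in ast.walk(func):
        if isinstance(node, (ast.Global, ast.Nonlocal)):
            declared_external.update(node.names)

    # Any other name that is assigned in the body is a local.
    local_names = {
        node.id
        for node in ast.walk(func)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store)
    } - declared_external
    known = params | local_names | set(dir(builtins))

    reads_external = False
    writes_external = False
    for node in ast.walk(func):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load) and node.id not in known:
                reads_external = True
            elif isinstance(node.ctx, (ast.Store, ast.Del)) and node.id in declared_external:
                writes_external = True
        elif isinstance(node, (ast.Attribute, ast.Subscript)):
            # Conservative: any store through an object (x[0] = 1, x.y = 1)
            # is treated as a possible write to external state.
            if isinstance(node.ctx, (ast.Store, ast.Del)):
                writes_external = True
    return (not reads_external, not writes_external)

print(classify("def set_first(items):\n    items[0] = 1\n"))  # (True, False)
```

Note that set_first() is reported as impure for writes only because parameters are shared with the caller today; this is the behavior the pure/impure qualifier is meant to address.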
My preference, based on experience of which behavior is least surprising, is to change all pass by reference to be COW and introduce the impure keyword to restore the current exact parameter behavior. It is important to note that, even with full COW, objects whose methods fail purity tests will still cause a function that calls those methods to fail purity tests.
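To show the visible effect of that preference, here is a small emulation. It is an assumption-laden sketch: copies_arguments is a hypothetical decorator, and it copies eagerly with copy.deepcopy rather than implementing real copy-on-write, so it only imitates the outcome that a callee can no longer alter the caller's objects.

```python
import copy
import functools

def copies_arguments(func):
    """Hypothetical decorator: hand func deep copies of its arguments."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*copy.deepcopy(args), **copy.deepcopy(kwargs))
    return wrapper

def append_zero(items):
    items.append(0)
    return items

# Emulates the proposed default: the callee works on its own copy.
safe_append_zero = copies_arguments(append_zero)

data = [1, 2]
result = safe_append_zero(data)
print(data)    # [1, 2]
print(result)  # [1, 2, 0]
```

Calling the undecorated append_zero(data) corresponds to what the impure keyword would restore: the caller's list itself is modified.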