Static typing: Specify if function is pure

We’re talking about at least two levels of purity:

  • purity as in being pure enough for typing purposes, allowing the use of caches and other globals responsibly. The consenting-adults principle applies: so long as the user doesn’t poke around its internals or monkeypatch it, the function behaves as pure.
  • Purity in its strict provable sense that enables advanced subinterpreter/multithreading/JIT optimizations, in which case if you poke around things that the interpreter considers to be pure, it breaks in a hard way.

This thread is about static typing and we should only be concerned about the former. We have plenty of examples of the former but very few of the latter. Python is a dynamic language, and it currently doesn’t provide many of the guarantees necessary for the latter (is __add__ implemented as no-globals pure? Does it use a cache? Are the descriptors really globals-free?)

For the former a decorator that marks a subtype of Callable (perhaps PureCallable?) would likely suffice. Optimizers should rightly pay no attention to the decorator or any other user-added type markers, especially if the penalty of getting it wrong will be a segfault / hard crash.

1 Like

The first notion of purity you described there isn’t actually useful for anything. A cache is inherently impure. Typing is only useful when it provides static analysis that allows ensuring specific things. A note in the docstring is plenty if you only want a consenting-adults approach.

I’d rather we not add this to typing if it can’t be expressed accurately; it’s going to be a magnet for problems, especially if we’re using a definition that no other language or theory would use here.

6 Likes

I see. Then I don’t think the overall idea is too helpful. What function uses caching and what doesn’t is an implementation decision (at least in the CPython world). If we want a purity marker to enable optimizers to work, it should be internal to the implementation, rather than being public.

It’s not useful for user code either. Following the nogil discussions there’s a conscious guardrail of “don’t trust user code to not crash the interpreter” even if we already have the “consenting adults” philosophy in the community. That guardrail applies here, and we need a way for mislabeled impure user code to at least fail gracefully at runtime if we want to base optimizations on purity. What starts out as a static typing problem quickly becomes a runtime problem. And I’m not sure if runtime analysis of purity is even feasible: Python is a language in which even the simplest dotted attribute lookup can have side effects.

1 Like

I think it’s worth remembering that the OP was interested in static analysis, rather than optimization.
Although “purity” was used as a term, it was imprecise and didn’t match their intent.

I don’t want to shut down discussion of optimization, but I think some of the original questions aren’t addressed.

Preventing Mutation of Inputs

OP showed a case of wanting to forbid a function or any function it calls from appending to a list.

Typing already provides protocols which allow for a limited version of this analysis.
If you use Sequence[int] rather than list[int], you declare that you will not call mutating methods.

The effect of using a restrictive protocol is viral within a codebase. Nothing in a chain of calls can treat a Sequence as a list so long as you avoid using ignores or casts to violate the basic contracts your types declare. And this does not introduce function coloring problems.

The built-in suite of protocols covers many container types, and you can write your own protocols for bespoke cases.
Protocols are just naming conventions, but that’s typically all we really need.
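A minimal sketch of that viral effect, using nothing beyond the stdlib: because the parameter is annotated as Sequence[int], the body may only use read-only Sequence methods, so a stray append would be a type error, while both lists and tuples remain valid arguments.

```python
from collections.abc import Sequence

def total(xs: Sequence[int]) -> int:
    # xs.append(0)  # a type checker rejects this: Sequence has no append
    return sum(xs)

# Both mutable and immutable sequences satisfy the Sequence protocol:
assert total([1, 2, 3]) == 6
assert total((1, 2, 3)) == 6
```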

Preventing Side Effects

Trying to prevent side effects requires a definition of “side effects we care about”. Examples of typically non-semantic side effects are too numerous to count, but include

  • importing a module for the first time writes sys.modules and may create pyc files
  • calling a function repeatedly may change its bytecode as the interpreter adapts
  • memory utilization of the current process changes if ephemeral objects are created and disposed, as does GC info
  • a function with a cache gets faster (!)
  • time passes

Consider whether or not you care about the side effects here:

import logging
log = logging.getLogger(__name__)

def add(x, y):
    if __debug__:
        log.debug("adding")
    return x + y

There’s no question that emitting logs is a side effect. But for many applications it is not useful to consider it one.

This goes to the point raised earlier about cache_info(). Yes, it exists and violates any reasonable definition of function purity. But it easily falls into the category of “side effects which I typically don’t care about”. That might not be a useful definition for tool driven analysis, but it’s a very useful one for reading code. I’m not sure what kinds of formalisms exist for describing such things?

For the sort of work I do, not being able to log would be problematic. So being able to support the notion that “logging is exempt from side effect analysis” is probably necessary for me.
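The cache_info() point can be made concrete with the stdlib alone: the function’s answers never change, but its observable cache state does.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n: int) -> int:
    return n if n < 2 else fib(n - 1) + fib(n - 2)

assert fib(30) == 832040
hits_before = fib.cache_info().hits
assert fib(30) == 832040                     # same answer every time...
assert fib.cache_info().hits > hits_before   # ...but cache state observably changed
```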

Side Effects and Observability

Suppose I write to a module scoped global variable.
Simple example:

# mymod.py
_CACHE = {}

# uses _CACHE
def myfunc(...): ...

Whether or not there are meaningful side effects depends a bit on where you try to observe from. Within mymod.py, there’s a relatively obvious side effect. But from outside of it, if your code’s contract forbids direct use of _CACHE, there isn’t meaningfully any side effect.

On the flip side, code which itself doesn’t have any side effects can observe impact from side effects elsewhere in the program. e.g.,

_mylist = [1]

def get_list():
    return _mylist

def the_sum():
    return sum(_mylist)

Because the same list is always returned, and lists are mutable, the result of the_sum can change between calls.

The code above doesn’t have side effects, but its impurity is highly impactful.

Also, these examples use modules, but the same reasoning holds naturally for other containers; a class or an object is a straightforward analogue.


In summary, purity may be only an element of the original post. OP’s needs/desires are legitimate, but are centered around contracts (Protocols) and mutability.

8 Likes

Working with sequences is an example of how purity interacts with static typing in possibly difficult ways.
Consider the following example:

from collections.abc import Iterable, Sequence
# pure
def count_above_seq(x: Sequence[int], threshold: int) -> int:
    count = 0
    for item in x:
        if item > threshold:
            count += 1
    return count

# possibly impure
def count_above_iter(x: Iterable[int], threshold: int) -> int:
    count = 0
    for item in x:
        if item > threshold:
            count += 1
    return count


seq = (1,2,3,4,5,6,7,8,9,10)

assert count_above_seq(seq, 5) == 5
assert count_above_seq(seq, 5) == 5 # Always returns 5


it = iter(seq)
assert count_above_iter(it, 5) == 5 # this consumes the iterator
assert count_above_iter(it, 5) == 5 # BOOM, this time it is 0

The implementations of the two functions look the same. Yet, depending on how one types it, one is pure (a Sequence’s __iter__ is well-behaved according to the object.__iter__ spec, returning a new iterator each time it is called) and the other is impure (an iterator’s __iter__ returns the iterator itself).

  • Sequence[int] is a subtype of Iterable[int]
  • Not all Sequences are immutable (and they don’t have to be in order for the function to be pure, as long as their __iter__ is well-behaved)
  • Iterables are allowed to be mutable, and iterators must be mutable (they are state machines) for them to work.

The complexity goes on much further. In terms of typing, mutability concerns typically center on the interesting data structures (lists, tuples, sets, etc.), not “disposable” objects like iterators. But in terms of a function’s purity, these incidental objects (iterators, descriptors, etc.) also need to be considered.

And from the CPU too, with all that modern caching / out-of-order magic.

The confusion in this thread is between purity as a function of type, versus purity as a function of implementation. Mutability concerns are about the former, optimization concerns are about the latter. The dynamic nature of Python means one can’t generally avoid accessing globals at all, but not all global accesses are as harmful. The job of static typing is not to constrain what implementations can do, but to provide a contract that adults can consent to responsibly.

What we’ll end up with from this discussion is a typing sense of “purity” that is just enough to indicate some useful property, but one which doesn’t conform to any strict mathematical definition and thus can’t be pinned down beyond our immediate needs. As with many things typing, what seems good enough now might not be good enough later or for everybody, just like the float / complex / int special case.

Overall, in the current state of affairs, I’m opposed to the idea of specifying purity: it is inherently difficult to pin down in reality, and a marker would give a false impression of guarantees.

2 Likes

With all the debates over caches, ironically right now the only reference in the official docs to pure functions is in functools.lru_cache which states:

In general, the LRU cache should only be used when you want to reuse previously computed values. Accordingly, it doesn’t make sense to cache functions with side-effects, functions that need to create distinct mutable objects on each call (such as generators and async functions), or impure functions such as time() or random().

A sense of “purity” that reads as good enough that caching won’t defeat the function’s purpose, however one defines it. The doc then gives an example of caching PEP content from the web :melting_face: This existing reference to “purity” in the docs has nothing to do with OP’s intent of securing against reused mutable objects, nor is this sense of “purity” rigorous enough for JIT optimizations or data sharing; it is only good enough for local performance gains. It is a mere inconvenience if the PEP content changes over time, making the cached web content stale, but a major flaw if fib(900) ever returns a different result, or if used for low-level optimizations within the interpreter.

For this we need a working sense of purity that is actually implementable for Python, across all future implementations. What may be pure in the current implementation of CPython on x64 might not be in a future implementation by a different vendor on a different architecture. There may be a reason that “purity” is currently mentioned only this loosely, instead of ever being defined as part of the language spec.

The OP example of not wanting a list to be mutated is reasonable and is to do with mutating a function parameter. That is something that in principle a type checker could handle. More generally if you do not mutate the list then it should be possible to treat a list[T] parameter as being covariant in T, which would be useful. If the methods on list were annotated as being pure vs impure then a type checker could disallow calling the impure methods for a list that is a parameter.

The point from @sirosen about using Sequence instead of list is a good one but is not always applicable. If you look e.g. here in the sympy code base you can find hundreds of pure functions whose parameters and return types are list[E] but where the lists are treated as immutable:

That code does not have type annotations yet but you can see the start of how it would look in this PR.

The type signatures are like:

def dup_add(f: list[E], g: list[E], K: Domain[E]) -> list[E]:
    ...

One awkwardness that I have found with this is that in some places there is a need to narrow or widen the type parameter from say E to Efield or Epoly etc (the Domain object carries this information at runtime). Any place the parameter needs to change is awkward because the type checker considers list[E] to be invariant in E. This means that you cannot e.g. pass a list[Efield] to a function that expects list[E] even if Efield is a subtype of E.
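A minimal reproduction of that invariance problem (E and Efield here are hypothetical empty stand-in classes, not the real sympy types):

```python
# Hypothetical stand-ins for the domain element types discussed above
class E: ...
class Efield(E): ...

def first(xs: list[E]) -> E:
    # Never mutates xs, yet a type checker still rejects a list[Efield]
    # argument, because list[E] is invariant in E.
    return xs[0]

fields: list[Efield] = [Efield()]
first(fields)  # fine at runtime; flagged by mypy/pyright as an incompatible argument
```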

Now you could change the annotations from list[E] to Sequence[E] but we don’t actually want to allow other sequence types here: passing a tuple would be an error. There is also the possibility that the type E might be a Sequence itself which could lead to confusion somewhere. It has been decided that the best type for this at runtime is list and there is no reason to allow any flexibility about that in this internal code. The generic type E is bounded by a structural type but list here is always just supposed to be the exact nominal type list (not even a subclass). Also with Sequence[T] some operations don’t work:

def f(x: Sequence[int], y: Sequence[int]):
    return x + y # error (e.g. can't add tuple and list)

I want to say that the type is list but it would be nice if a type checker could understand that the lists are not allowed to be mutated unless they are newly created local variables and that list[E] should be treated as being covariant in E because we are not going to mutate the lists. If there were some way to mark these functions as “pure” (or another name) that would make this work better then I would use it.

Perhaps what is really needed here though is just something like ReadOnly[list[E]] rather than pure functions.

5 Likes

I’ll reiterate what I said significantly earlier, as we seem to have landed back at where I expected with people having different effects they would care about.

A pure function in this framework has no effects. Trivially, in Python, most functions are at least capable of the effect of raising an exception (SystemExit, KeyboardInterrupt…)

Caching decorators may care about the effect of non-determinism (which yes, is an effect modelable in a type system)

People care about different kinds of effects for different reasons, and different effects have different consequences as to their potential for “reasonable composition of functions”.


As to the “is Sequence enough?” tangent, the answer is definitively no, because of a combination of other unsoundness in the type system. Seeing that a function takes a Sequence is not enough to tell you it does not mutate MutableSequences.

This is a combination of problems with ABCs, Protocols, and subtyping allowing incompatible variance, and type checkers being lossy with type information in some places. This means it’s possible for a function to do this:

from collections.abc import MutableSequence, Sequence
from typing import TypeVar

T = TypeVar("T")

def sorter(seq: Sequence[T]) -> Sequence[T]:
    if isinstance(seq, list):
        seq.sort()
        return seq
    if isinstance(seq, MutableSequence):
        return sort_nonlist_in_place(seq)  # hypothetical in-place sort helper
    else:
        return tuple(sorted(seq))

And no type checker has any issue with it.

5 Likes

Unless ReadOnly[T] is incompatible with any type that isn’t marked read-only, the same problems that exist for Sequence protocols vs lists happen here. The same objections people have to deep immutability apply, but without even getting the runtime enforcement, and with probably broken static enforcement too.

If we use the sort function above:

def sorter(seq: Sequence[T]) -> Sequence[T]:
    if isinstance(seq, list):
        seq.sort()
        return seq
    if isinstance(seq, MutableSequence):
        return sort_nonlist_in_place(seq)
    else:
        return tuple(sorted(seq))

def x(i_promise_no_mut: Readonly[list[T]]):
    sorter(i_promise_no_mut)  # passes sorter's isinstance check and is sorted in place

We can look at a more pathological case:

def pathological(o: object):
    if isinstance(o, list):
        o.sort()

def x(i_promise_no_mut: Readonly[list[T]]):
    pathological(i_promise_no_mut)

and see that this goes as far as either breaking the fact that everything is assignable to object, or never working.

4 Likes

These examples are going to give me nightmares for weeks! Thanks, I think.

The earlier point about “sometimes you actually want to use list” is valid as well, and familiar. But getting back to list from Sequence and object seems “obvious” now that it’s been pointed out. And the implications are relatively deep – there’s no way to pass around a mutable object and protect yourself with type annotations in the current system.

The protocols like Sequence and Mapping/MutableMapping still do a great job in the 90% case, however, so I’d still encourage that solution for folks who want to have some basic lightweight protection against accidental mutations.

2 Likes

Hmm, other languages (mostly thinking of C++) do have concepts like const/read-only, so I’d use them as inspiration. The main ideas are:

  1. A type needs to label its methods as const or not, to identify which ones are safe to use without mutation. For list I’d expect __getitem__ to be labeled const but __setitem__ not to be.
  2. Const[T] is not compatible with T, as you can do fewer things with it. If A is compatible with B, then Const[A] is compatible with Const[B]. So Const[list] is not compatible with list. Const[list] is also not compatible with object, but it is compatible with Const[object].
  3. Now, how do we deal with narrowing? I’m less sure how other languages handle this, but I’d guess that narrowing should never add a Const modifier and should preserve existing ones. So if you have x: object and do an isinstance check for list, it should narrow to list. If you have Const[object] and do isinstance list, it should narrow to Const[list].
3 Likes

I think we have quite different ideas about where the problems are. On a basic level I don’t see a type checker as giving me a proof of global correctness. The primary benefit of checkers in Python is that they help to avoid basic mistakes so that “most” ways that the code could be wrong are “likely” to be picked up by the checker. I know that in some other languages the proof capability of the type checker is much more powerful but I think we all agree that static typing in Python is never really going to be like that.

Functions like sorter and pathological are not really a concern to me. I know that you don’t intend for them to be realistic but honestly who writes code that takes objects of unknown types and then presumes to be able to mutate a list? Maybe there are basic ways that this can come up in realistic code but this is not a common problem in my experience.

Those examples do demonstrate that type checkers in Python are not really able to provide global guarantees about mutability in the way that e.g. Rust’s type system can. Of course even in Rust you could write your pathological function with unsafe code and that is necessary sometimes so the Rust people even have a name “interior mutability” for this. The fancy name is needed so that people see it as a clever safe use of the type system rather than an unavoidably necessary soundness hole.

Accidentally mutating a list that should not be mutated by mixing up a const vs non-const list is more likely to be a problem in my experience. Needing to convert a list to a const list or vice versa by copying at the boundaries of the code would be reasonable if checkers could track which are const so that the internal code does not need to make unnecessary copies and is prevented from basic mistakes. Preventing most basic mistakes through recursive local analysis can still be very useful even if the checker cannot truly provide global guarantees.

From my perspective a type checker is already unable to guarantee all the necessary invariants without needing anything like the pathological function. The Domain object I referred to can represent different domains whose elements have the same “type” from the perspective of the type system but that are incompatible:

>>> R1 = QQ[x]      # Domain[PolyElement[MPQ]]
>>> R2 = ZZ[x,y]    # Domain[PolyElement[MPZ]]
>>> p1 = R1(x + 1)  # PolyElement[MPQ]
>>> p2 = R2(x - y)  # PolyElement[MPZ]
>>> p1
x + 1
>>> p2
x - y
>>> p1 + p2
...
TypeError: unsupported operand type(s) for +: 'PolyElement' and 'PolyElement'

Note that the exception raised is TypeError even though the types are both PolyElement. There are two different ways that p1 and p2 are incompatible: different domains and different generators. The first of these can be represented in the type system, so e.g. the types are PolyElement[MPZ] and PolyElement[MPQ].

The second incompatibility, that the variables are different (x vs x and y), cannot be fully represented in the type system as far as I can tell[1]. As long as a type checker allows this then it cannot provide the guarantees that are actually needed:

p: list[PolyElement[Eg]] = [p1, p2]

Mixing up p1 and p2 like this is a much more likely problem in reality than the pathological function. For the most part type checkers can pick up on these things but in full generality they cannot give global guarantees that every list has consistent “types” in the way that I want.

Is it a problem that the type checkers cannot guarantee everything? Note firstly that the code I linked to does not have any type annotations. It was written long before annotations were a thing and the people who wrote it were just very careful to ensure that the necessary invariants were maintained. So it is possible to write correct code without having any type checkers at all.

It is still useful I think though to have annotations here and use a type checker which is why I am adding annotations to that code:

  • I think that people who are new to the code would find it easier to understand if it had type annotations (even if the annotations look complicated).
  • Type checkers still pick up on basic mistakes like passing E when you should pass list[E] which are easy mistakes to make.
  • There are problems with that code that are mainly related to the fact that some parts are not valid for arbitrary domains, so it would be useful to track constraints on the type E and the domain type through the call stack (I expect that this will clean up the design and fix some bugs).

Along the same lines, if you could use const then it would make sense to use it everywhere in that code. That would help with picking up basic mistakes even if the type checker cannot give a global proof that no pathological function would mutate any of the lists.

If a type checker could understand that const[list[E]] is covariant in E then it would not reject the parts of the code that correctly pass a const[list[Efield]] as argument to a const[list[E]] parameter. This is most important from my perspective: type checkers can help pick up on basic mistakes but first I need them to be able to understand correctly written code without giving false positives everywhere.


  1. This is analogous to shape typing with N-dimensional arrays. You could distinguish 1 vs 2 variables or maybe Literal['x'] vs Literal['y'] but in most code it would just be tuple[str, ...]. ↩︎

1 Like

This is so evil. I have been promoting the use of Sequence to softly ensure immutability. Is there no natural way for type checkers to forbid this?

I think it would be a good option for type checkers to offer: forbidding narrowing from a covariant type to an invariant one. But I don’t know of any type checkers that offer that option.

1 Like

sorter there was copied from my post above, and it’s a simplified version of something that I know exists in a real codebase serving millions of users daily. Sorting in place when possible, and otherwise sorting externally, is sometimes desirable. The actual annotation on that function in the codebase it lives in is a union of the accepted iterable types, not Sequence, and the name of the function is actually sort_prefer_inplace.

The point with presenting it was that python’s type system is too lax to add this kind of check. We allow loose promotions of types that lose important information, and programs are checked essentially at function boundaries and not for total consistency.

2 Likes

Presumably the sort_prefer_inplace function would have annotations that disallow passing in a const list if that was possible so that the function boundary check could apply. That is assuming of course good will on the part of the authors rather than them intentionally wanting to trick you into letting them mutate your lists.

The fact that Python’s type system is lax is a necessity, and it will never be tightened to a level that would be robust to the sort of examples presented above. Yes, Python’s version of const would be lax, just like the rest of the type system. The position that you seem to take here is that if the type system cannot fully guarantee something then it should not attempt to accommodate it at all.

Where that leaves us though is not being able to annotate correct code because the checkers are strangely strict about the invariance of list[E] despite leaving many holes in other places. What I want to be able to do is just add annotations to correct code that actually is strictly typed but it can’t be done right now without type: ignore because the type system lacks const which is one of the most common typing constructs in other languages.

1 Like

In this particular case, yes. I don’t want people relying on something like this. Putting it in the type system, knowing the type system cannot do the one thing we are asking of it in this case would be a mistake. At least the sharp edges that exist currently aren’t fundamental to why those type expressions are expressible.

This is certainly a place I would appreciate more being done, but I think in python because of the nature of what we already have, this is something more appropriately handled by a thin runtime object proxy (enforcement, accident avoidance, optional) paired with documentation or some other non-type-system means of indicating intent, but with a clear statement that static analysis provided by the type system cannot support it.
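One possible shape for that thin runtime proxy, as a sketch rather than a full design (ReadOnlyList is a hypothetical name): reads are forwarded, and the mutating API simply does not exist on the wrapper.

```python
from collections.abc import Sequence

class ReadOnlyList(Sequence):
    """Thin runtime proxy: reads are forwarded, mutation is blocked."""

    def __init__(self, data: list):
        self._data = data

    def __getitem__(self, i):
        return self._data[i]

    def __len__(self):
        return len(self._data)

xs = ReadOnlyList([3, 1, 2])
assert list(xs) == [3, 1, 2]      # iteration comes free from the Sequence mixin
assert len(xs) == 3
assert not hasattr(xs, "append")  # no append/sort/__setitem__ to call at all
```

This gives the "enforcement, accident avoidance, optional" combination at runtime without needing the type system to express deep immutability.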

2 Likes

I give f a pass here. Its definition doesn’t do anything that smacks of impurity. It’s not the function’s fault that the operational semantics of function evaluation itself can be changed at run time.

The same post that says “Pure function is a flaky concept” admits “I give f a pass here”.

It is an example that the following (in the post above my first) is too much to hope for:

I would have no use for a definition of a “pure function” that didn’t guarantee that every invocation of the function returned the same value.

I think that quote was mine, and I don’t consider an exception to be “returning a value” (after all, you can get a KeyboardInterrupt at any time). So the RecursionError you triggered here doesn’t count (in my view). I do consider f to be pure.

Having said that, a big part of the problem here is that there’s a level of ambiguity, and “I know it when I see it”, to the whole concept. I don’t think there’s a viable definition of “pure” which can be determined mechanically, and I’m not even sure there’s a useful definition that we could get people to agree on. Given that, I don’t think it’s useful to try to incorporate the concept of a “pure function” into the type system.