Cache decorator design

Hi there,
I’m adding decorator support to my cache library Theine. Because Theine only supports string keys, the user must specify how the key is generated from the decorated function's arguments.
My solution is to use a second function to generate the key:

@Cache("tlfu", 1000)
def foo(a:int, b:str, c:Foo) -> Dict:
    ...

@foo.key
def _(a:int, b:str, c:Foo) -> str:
    return f"a:{a}:b:{b}:c:{c.id}"

Pros

  • Type checked. Mypy can check that the key function has the same input signature as the original function and returns a string.
  • The user controls how the key is generated.
  • Reusable. The user knows exactly what the key is and can call the low-level Theine API to get the data.

Cons

  • Two functions are required.

I have already created a draft PR for this solution; take a look if you're interested: cache decorator

Another option is to inline the key function into the first decorator:

@Cache("tlfu", 1000, key=lambda args, kwargs: f"xxx")
def foo(a:int, b:str, c:Foo) -> Dict:
    ...

or

@Cache("tlfu", 1000, key="a:{a}:b:{b}:c:{c.id}")
def foo(a:int, b:str, c:Foo) -> Dict:
    ...

Pros

  • One function only.

Cons

  • No type checking.
  • Not explicit.
  • Not flexible enough, because of the limits of a lambda/format string.

So which one do you prefer? If you have a better idea, please tell me.
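
To make the format-string variant concrete, here is one way it could be resolved. This is a sketch only: `format_key` is a hypothetical helper, not part of Theine, and it assumes the template's field names match the function's parameter names.

```python
import inspect
import typing as ty


def format_key(func: ty.Callable[..., ty.Any], template: str) -> ty.Callable[..., str]:
    """Hypothetical helper: build a key function from a format-string template."""
    sig = inspect.signature(func)

    def key(*args: ty.Any, **kwargs: ty.Any) -> str:
        # Map positional and keyword arguments onto parameter names.
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        # str.format supports attribute access such as {c.id}.
        return template.format(**bound.arguments)

    return key


class Foo:
    def __init__(self, id: int) -> None:
        self.id = id


def foo(a: int, b: str, c: Foo) -> dict:
    return {}


foo_key = format_key(foo, "a:{a}:b:{b}:c:{c.id}")
```

A misspelled field name in the template only fails when the key is built at call time (as a `KeyError`), which is the "no type checking" con in practice.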

One approach could be to write a default key function, so most users don’t need to specify one:

import typing as ty
import inspect

# ParamSpec requires Python 3.10+
P = ty.ParamSpec('P')
R = ty.TypeVar('R')

def default_key(func: ty.Callable[P, R]) -> ty.Callable[P, str]:
    sig = inspect.signature(func)
    def key(*args: P.args, **kwargs: P.kwargs) -> str:
        # Bind as a real call would, so positional, keyword, and
        # defaulted arguments all map onto parameter names.
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        return ":".join("{}:{}".format(param, arg) for param, arg in bound.arguments.items())
    return key

This will give you a key for pretty much any argument whose __str__ doesn’t raise an exception, though not always a pretty one:

>>> def f(x, y):
...     return (x, y)
...
>>> fkey = default_key(f)
>>> class A: ...
>>> fkey(1, A())
'x:1:y:<__main__.A object at 0x7f6203b5bfa0>'

The key function could certainly be improved (for example, by adding type(arg) to the string) to avoid false hits. Combine that with a mechanism to override the key function and to apply it to the inputs, and you get all of the pros of option 1 without hitting the con unless the user has a case specific enough to warrant it.
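
As a rough illustration of the type(arg) idea, the variant below adds each argument's type name to the key. `typed_key` is a hypothetical name, and this is a sketch rather than a collision-proof scheme: values whose string form contains ":" can still clash.

```python
import inspect
import typing as ty


def typed_key(func: ty.Callable[..., ty.Any]) -> ty.Callable[..., str]:
    """Hypothetical variant of a default key that also records argument types."""
    sig = inspect.signature(func)

    def key(*args: ty.Any, **kwargs: ty.Any) -> str:
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        # name:type:value for every parameter, in signature order
        return ":".join(
            f"{name}:{type(value).__name__}:{value}"
            for name, value in bound.arguments.items()
        )

    return key


def f(x, y):
    return (x, y)


fkey = typed_key(f)
```

Without the type name, `f(1, "2")` and `f("1", 2)` would both produce `x:1:y:2`; with it, the two calls get different keys.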

My concern is that if I add this default function, users will blindly assume it works and never use the key decorator. But str or similar approaches may give wrong results if someone overrides __str__. If there were a perfect way to generate a key from function arguments, maybe I could use that as the default.

Actually, I have an idea for a near-perfect function. Python’s lru_cache uses a fancy HashedSeq to build the key automatically. I can create a Dict[HashedSeq, UUID], use the UUID as my string cache key, and look up the UUID from the function arguments on get. The cons are, first, the performance cost and, second, that HashedSeq holds the function arguments, which may consume a lot of memory.
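
A rough sketch of that idea, under some assumptions: `ArgKeyRegistry` is a hypothetical name, and a plain tuple of the arguments stands in for lru_cache's internal key object, so all arguments must be hashable.

```python
import typing as ty
import uuid


class ArgKeyRegistry:
    """Hypothetical helper mapping hashable call arguments to UUID strings."""

    def __init__(self) -> None:
        self._ids: ty.Dict[ty.Hashable, str] = {}

    def key_for(self, *args: ty.Hashable, **kwargs: ty.Hashable) -> str:
        # Simplified stand-in for lru_cache's key: positional arguments plus
        # keyword items sorted by name, so keyword order does not matter.
        lookup = (args, tuple(sorted(kwargs.items())))
        if lookup not in self._ids:
            # First time these arguments are seen: assign a fresh string key.
            self._ids[lookup] = uuid.uuid4().hex
        return self._ids[lookup]


registry = ArgKeyRegistry()
k1 = registry.key_for(1, "a", flag=True)
k2 = registry.key_for(1, "a", flag=True)
k3 = registry.key_for(2, "a", flag=True)
```

Here `k1` and `k2` are the same string and `k3` differs. The registry keeps every argument tuple alive until its entry is removed, which is the memory cost mentioned above, so entries would need to be dropped whenever the cache evicts the matching key.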