Do I need an async cache for a non-async function?

I have a function. It is really an endpoint for a FastAPI application.

import uvicorn
from fastapi import FastAPI

app = FastAPI()

@app.get('/say_hi/{name}')
def say_hi(name: str) -> str:
  return f'Hi, {name}.'

if __name__ == '__main__':
  uvicorn.run(
    'main:app',
    host='0.0.0.0',
    port=45678
  )

I want to cache its return values:

from functools import cache

@cache
@app.get('/say_hi/{name}')
def say_hi(name: str) -> str:
  return f'Hi, {name}.'

The function itself is not defined with async def, but FastAPI takes care of running the application asynchronously.

Question: do I still need to use a cache specifically designed for asynchronous code, such as aiocache or aiomcache, or would I only need that for endpoints defined with async def? (FastAPI allows both.)

Caching the value of a function that’s intended to check whether the server is currently alive defeats the whole point of the function.

Ignore the name of the function; I was too lazy to type and just copied it. It is a function that does some computation, whose return value depends only on its arguments, and which has no side effects.

Oh, OK, no worries. Doesn’t uvicorn work by spawning a new subprocess for each HTTP request? If so, a naive cache won’t be shared between subprocesses, and each one would be garbage collected when the subprocess serving the request terminates.

It’s common for web servers to use Redis or something similar for caching, but maybe that’s overkill for this situation if you don’t need the cache to persist (be saved to disk) between server restarts.
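
For example, a minimal sketch with the redis package (assuming a Redis server on localhost; the key scheme and the one-hour TTL are just illustrative):

import redis

r = redis.Redis(host='localhost', port=6379, decode_responses=True)

def cached_say_hi(name: str) -> str:
  # Hypothetical key scheme; adjust to your own naming convention.
  key = f'say_hi:{name}'
  cached = r.get(key)
  if cached is not None:
    return cached
  result = f'Hi, {name}.'  # stand-in for the real computation
  r.set(key, result, ex=3600)  # expire after an hour so the cache can't grow forever
  return result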

If the return value is deterministic and the same for all clients, it doesn’t matter. But in general it’s non-trivial: what if two cache misses for the same key occur at the same time? Which one wins? So I’d look through the asyncio docs and at popular async libraries on PyPI to see whether such a feature already exists.
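
For instance, one way to make the “which one wins” question moot is to serialize the computation per key, so a second concurrent miss waits for the first instead of recomputing. A rough sketch with plain asyncio (no particular library; all the names are just illustrative):

import asyncio
from collections import defaultdict

_cache: dict[str, str] = {}
_locks: defaultdict[str, asyncio.Lock] = defaultdict(asyncio.Lock)

async def compute(name: str) -> str:
  # Placeholder for the real (expensive) computation.
  await asyncio.sleep(0)
  return f'Hi, {name}.'

async def get_or_compute(name: str) -> str:
  if name in _cache:
    return _cache[name]
  async with _locks[name]:
    # Re-check after acquiring the lock: another task may have
    # filled the entry while we were waiting.
    if name not in _cache:
      _cache[name] = await compute(name)
  return _cache[name]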

My knowledge of it is very limited, but I think each web worker is a separate process, and each request handled by a given worker is run in a thread pool. Each worker will have its own cache, yes, but that cache persists and is available to all requests handled by that worker.

At least in this example I see it being reused:

import uvicorn
from fastapi import FastAPI
from functools import cache

app = FastAPI()

@cache
def f(x: str) -> str:
  return f'Hi, {x}'

@app.get('/say_hi/{name}')
def say_hi(name: str) -> str:
  hits = f.cache_info().hits
  _result = f(name)
  if f.cache_info().hits > hits:
    print(f'New hits = {f.cache_info().hits} > old hits {hits}')
  return _result

if __name__ == '__main__':
  uvicorn.run(
    'main:app',
    host='0.0.0.0',
    port=45678
  )

But since my knowledge is limited, I don’t know whether there are other considerations I’m failing to take into account when multiple requests arrive.

Grand. It’s probably best to use a fixed-size lru_cache to avoid going OOM if you’re serving lots of traffic.
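
For example (the maxsize here is an arbitrary choice):

from functools import lru_cache

@lru_cache(maxsize=1024)  # bound the cache so it can't grow without limit
def f(x: str) -> str:
  return f'Hi, {x}'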

Your best bet is probably to use the dedicated “fastapi-cache” package. It deals with the sync vs. async distinction for you automatically.
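
Roughly how that package (fastapi-cache2 on PyPI, imported as fastapi_cache) is wired up, going by its README; double-check the exact API against the current docs:

import uvicorn
from fastapi import FastAPI
from fastapi_cache import FastAPICache
from fastapi_cache.backends.inmemory import InMemoryBackend
from fastapi_cache.decorator import cache

app = FastAPI()

@app.on_event('startup')
async def startup() -> None:
  # Initialise the cache once at startup; swap InMemoryBackend for a
  # Redis backend if the cache needs to be shared between workers.
  FastAPICache.init(InMemoryBackend())

@app.get('/say_hi/{name}')
@cache(expire=60)  # cache each name for 60 seconds
async def say_hi(name: str) -> str:
  return f'Hi, {name}.'

if __name__ == '__main__':
  uvicorn.run(
    'main:app',
    host='0.0.0.0',
    port=45678
  )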
