I am creating an async FastAPI service that triggers models via POST requests.
Each model has a different domain.
It should support high throughput and low latency.
What is the best way to manage this: a single pool or multiple pools (one per domain)?
What are the pros and cons of each, and when should I choose which approach?
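
For concreteness, the two layouts could be sketched roughly as follows. This assumes the "pools" are thread pools used to run blocking model calls off the event loop; the domain names, pool sizes, and `run_model` function are hypothetical placeholders, and the FastAPI route handlers are left out so the sketch only needs the standard library.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

# Hypothetical domain names and pool sizes, for illustration only.
DOMAINS = ("vision", "nlp")

# Option A: a single pool shared by every domain.
shared_pool = ThreadPoolExecutor(max_workers=8)

# Option B: one pool per domain, each with its own workers.
domain_pools = {d: ThreadPoolExecutor(max_workers=4) for d in DOMAINS}


def run_model(domain: str, payload: str) -> str:
    """Stand-in for a blocking model call (hypothetical)."""
    return f"{domain}:{payload}"


async def predict_shared(domain: str, payload: str) -> str:
    # All domains compete for the same workers.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(shared_pool, run_model, domain, payload)


async def predict_per_domain(domain: str, payload: str) -> str:
    # Each request is routed to the pool that belongs to its domain.
    loop = asyncio.get_running_loop()
    pool = domain_pools[domain]
    return await loop.run_in_executor(pool, run_model, domain, payload)
```

In an async endpoint, either coroutine would be awaited with the domain taken from the request.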
Thanks