One prompt in. The right model out.
npm install adaptive-memory-multi-model-router
npx a3m-router serve
Route simple queries to free tiers (Ollama, Groq) or budget providers ($0.59/1M tokens). Save 62% on API costs.
When a provider fails, automatically retry with the next best option. Circuit breaker pattern keeps your app resilient.
Trigram Jaccard similarity cache eliminates redundant API calls with 95%+ hit rate.
Built-in prompt injection detection, PII redaction, content filtering, and rate limiting.
Free: Ollama, CommandCode, LM Studio
Budget ($0.59-0.60/1M): Groq, Cerebras
Mid ($1.50-2.80/1M): DeepSeek, Mistral, Qwen
Premium ($10-30/1M): OpenAI, Anthropic, Google