🔀 A3M Router

One prompt in. The right model out.

47+ providers. Budget enforcement. Semantic cache. Intelligent failover.

// Your app today: every request goes to GPT-4o
const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Explain quantum entanglement" }]
});
❌ $0.003 per request — no fallback if OpenAI goes down
// One-line change: A3M routes to the cheapest capable model
const response = await fetch("http://localhost:8787/v1/chat/completions", {
  method: "POST", // fetch defaults to GET, which cannot carry a body
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "auto", // ← A3M picks the best model
    messages: [{ role: "user", content: "Explain quantum entanglement" }]
  })
});
✅ Routed to: Groq (FREE)
$0.00 per request — automatic fallback chain if Groq fails

🌏 47+ Providers

Chinese Providers (special handling)
DeepSeek, Kimi, Qwen, Zhipu, Yi, Stepfun, Moonshot, 01.AI, Tencent, Baidu, Alibaba
Global Providers
OpenAI, Anthropic, Google, Mistral, Cohere, Groq (FREE), Perplexity, AWS Bedrock, Azure, Replicate, HuggingFace
✅ A3M routes to the cheapest healthy provider
Health checks + circuit breakers + intelligent fallback
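
As a rough illustration of "cheapest healthy provider" routing, the sketch below drops unhealthy providers and sorts the rest by price, so the first entry is the primary and the remainder form the fallback chain. The `providers` array, its field names, and `rankProviders` are hypothetical and are not A3M's actual API; the prices reuse figures quoted elsewhere in this README.

```javascript
// Hypothetical provider records (illustrative only, not A3M's internal format).
const providers = [
  { name: "gpt-4o-mini", costPerMTokens: 0.15, healthy: true },
  { name: "groq", costPerMTokens: 0, healthy: false }, // pretend Groq is down
  { name: "deepseek", costPerMTokens: 0.14, healthy: true },
];

// Return healthy provider names, cheapest first.
// filter() creates a new array, so sort() does not mutate the input.
function rankProviders(list) {
  return list
    .filter((p) => p.healthy)
    .sort((a, b) => a.costPerMTokens - b.costPerMTokens)
    .map((p) => p.name);
}

const chain = rankProviders(providers);
console.log(chain); // [ 'deepseek', 'gpt-4o-mini' ]
```

With Groq marked unhealthy, DeepSeek becomes the primary and GPT-4o-mini the fallback.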

💰 Save 62% on API Costs

WITHOUT A3M
$0.003 per request (GPT-4o)
$3.41 per 1K requests
WITH A3M
$0.00 per request when routed to Groq
47% → Groq (free)
30% → DeepSeek ($0.14/M tokens)
23% → GPT-4o-mini ($0.15/M tokens)
$1.24 per 1K requests
62% Cost Savings
99.5% Routing Accuracy
<1ms Routing Latency
💰 $2,175 saved per 1M requests
At 1,000 queries/day (365K requests/year): about $794 saved yearly

✨ Features

🧠 Adaptive Memory
Learns from your usage. Updates model quality scores with every request. No retraining.
💾 Semantic Cache
Embedding-based lookup. 30%+ hit rate on repeated queries.
🛡️ Budget Enforcement
Per-user/team caps. Alerts at 50%/80%/100%. No surprises.
🔄 Intelligent Failover
Circuit breaker (3 fails → 60s cooldown). Automatic fallback chains.
⚡ Per-Provider Retry
Custom timeout per provider. Exponential backoff. Handles HTTP 429 (rate limit) responses.
🎯 12-Signal Routing
Scores each prompt on 12 signals, including domain, task, structure, action verb, and multi-step detection.
Zero ML. No GPU required. Starts in <100ms.
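
To make the Semantic Cache feature above more concrete, here is a minimal sketch of an embedding-based lookup using cosine similarity. It assumes the caller already has an embedding vector for each prompt; the `SemanticCache` class, the 0.9 similarity threshold, and the toy 3-dimensional vectors are all illustrative assumptions, not A3M's implementation.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

class SemanticCache {
  constructor(threshold = 0.9) { // hypothetical similarity cutoff
    this.threshold = threshold;
    this.entries = []; // each entry: { embedding, response }
  }

  set(embedding, response) {
    this.entries.push({ embedding, response });
  }

  // Return the most similar cached response at or above the threshold,
  // or null on a cache miss.
  get(embedding) {
    let best = null;
    let bestScore = this.threshold;
    for (const entry of this.entries) {
      const score = cosine(embedding, entry.embedding);
      if (score >= bestScore) {
        best = entry;
        bestScore = score;
      }
    }
    return best ? best.response : null;
  }
}

const cache = new SemanticCache(0.9);
cache.set([1, 0, 0], "Entanglement links the states of two particles.");
console.log(cache.get([0.98, 0.1, 0])); // near-duplicate prompt: cache hit
console.log(cache.get([0, 1, 0]));      // unrelated prompt: null
```

A linear scan is fine for a sketch; a production cache would more likely use an approximate nearest-neighbor index.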

🔄 Intelligent Failover

[Groq] - Attempting connection...
[Groq] - ✗ FAILED - 503 Service Unavailable
[Circuit Breaker] - Tripped after 3 failures
[DeepSeek] - HEALTHY - Switching...
[DeepSeek] - ✅ Response delivered ✓
✅ Your app never knew there was a problem
99.9% uptime — automatic provider health scoring + fallback
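
The failover behavior described here (3 failures, then a 60s cooldown) can be sketched as a small circuit breaker. This is a simplified illustration under stated assumptions: the `CircuitBreaker` class, `pickProvider`, and the injectable `now` clock are hypothetical names, and the sketch omits a half-open probing state that a real implementation might include.

```javascript
class CircuitBreaker {
  constructor({ maxFailures = 3, cooldownMs = 60_000, now = Date.now } = {}) {
    this.maxFailures = maxFailures;
    this.cooldownMs = cooldownMs;
    this.now = now; // injectable clock, useful for tests
    this.failures = 0;
    this.openedAt = null; // timestamp when the breaker tripped, or null if closed
  }

  // True when requests may be sent to the provider.
  canRequest() {
    if (this.openedAt === null) return true;
    if (this.now() - this.openedAt >= this.cooldownMs) {
      this.openedAt = null; // cooldown elapsed: close the breaker again
      this.failures = 0;
      return true;
    }
    return false;
  }

  recordSuccess() {
    this.failures = 0;
  }

  recordFailure() {
    this.failures += 1;
    if (this.failures >= this.maxFailures) this.openedAt = this.now();
  }
}

// Walk the fallback chain and return the first provider whose breaker is closed.
function pickProvider(chain, breakers) {
  return chain.find((name) => breakers[name].canRequest()) ?? null;
}

let clock = 0; // fake clock in milliseconds
const breakers = {
  groq: new CircuitBreaker({ now: () => clock }),
  deepseek: new CircuitBreaker({ now: () => clock }),
};

for (let i = 0; i < 3; i++) breakers.groq.recordFailure(); // three 503s in a row
console.log(pickProvider(["groq", "deepseek"], breakers)); // deepseek

clock += 60_000; // cooldown passes
console.log(pickProvider(["groq", "deepseek"], breakers)); // groq
```

Injecting the clock keeps the example deterministic; in real use the default `Date.now` would apply.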
📊 Provider Health Monitoring
Latency + error rate → health score → automatic routing
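
One way to picture the "latency + error rate → health score" step is the sketch below. The formula, the 1000ms `targetLatencyMs` default, and the function names are assumptions chosen for illustration; the README does not specify how A3M actually weights these inputs.

```javascript
// Hypothetical health score in [0, 1]: success rate scaled by a latency factor.
// The latency factor is 1 at 0ms and 0.5 when latency equals the target.
function healthScore({ errorRate, latencyMs }, targetLatencyMs = 1000) {
  const latencyFactor = 1 / (1 + latencyMs / targetLatencyMs);
  return (1 - errorRate) * latencyFactor;
}

// Rank providers from healthiest to least healthy.
function rankByHealth(stats) {
  return Object.entries(stats)
    .map(([name, s]) => ({ name, score: healthScore(s) }))
    .sort((a, b) => b.score - a.score);
}

const ranked = rankByHealth({
  groq: { errorRate: 0.5, latencyMs: 200 },      // fast but failing half the time
  deepseek: { errorRate: 0.01, latencyMs: 800 }, // slower but reliable
});
console.log(ranked.map((r) => r.name)); // [ 'deepseek', 'groq' ]
```

Under this scoring, a fast provider with a high error rate ranks below a slower but reliable one.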
Get started in 10 seconds
npm install -g adaptive-memory-multi-model-router
# Auto-detects your API keys — zero config
npx a3m-router serve
# Now, in your app, change model: "gpt-4o" to model: "auto"
1-line code change
0 API keys to manage
<100ms startup time
✨ One prompt in. The right model out. ✨