Models
Clew supports 100+ models across 27 providers. Pick the right model for each task — from frontier reasoning to fast local inference.
Model Selection Guide
| Workload | Recommended Models | Context | Notes |
|---|---|---|---|
| Complex architecture & large edits | claude-opus-4-7, claude-sonnet-4-6, gpt-5.5, deepseek-v4-pro | 1M tokens | Best tool calling, strong reasoning, largest context |
| Code review & debugging | claude-sonnet-4-6, gpt-5.5, gemini-3.1-pro | 1M-2M tokens | Good instruction following, reliable output |
| Quick search & summarization | claude-haiku-4-5, deepseek-v4-flash, mistral-small-4, groq/llama-3.3-70b | 128K-200K | Fast, cost-effective, lower latency |
| Offline / air-gapped | ollama/llama4:70b, ollama/llama3.3 | 131K tokens | Fully local, no network needed |
| Cost-sensitive bulk | kilocode/kilo-auto/free, opencode-go/minimax-m2.7, groq/llama-3.1-8b | Varies | Free or minimal cost tiers available |
| Research & reasoning | claude-opus-4-7, deepseek-v4-pro, gemini-3.1-pro | 1M-2M tokens | Deep reasoning, chain-of-thought |
| Vision / multimodal | claude-sonnet-4-6, gpt-5.5, gemini-3.1-pro, deepseek-v4-flash | 1M tokens | Image understanding, document analysis |
Notable Models by Provider
| Provider | Models Available |
|---|---|
| Anthropic | claude-opus-4-7 (1M ctx), claude-sonnet-4-6 (1M ctx, recommended), claude-haiku-4-5 (200K ctx, fast) |
| OpenAI | gpt-5.5 (1M ctx), gpt-5.4 (1M ctx), gpt-5.4-mini (fast) |
| Google | gemini-3.1-pro (2M ctx), gemini-3.1-flash-lite (1M ctx, fast) |
| DeepSeek | deepseek-v4-pro (1M ctx, MoE), deepseek-v4-flash (1M ctx, fast MoE) |
| xAI | grok-4.3 (128K ctx) |
| Mistral | mistral-medium-3.5, mistral-small-4, mistral-large-3 |
| Groq | llama-3.3-70b, llama-3.1-8b-instant (fast) |
| Ollama | llama4:70b (local), llama3.3 (local), qwen3.6-plus (local) |
| GitHub Copilot | claude-opus-4.7, claude-sonnet-4.6, claude-haiku-4.5, gpt-5.5 |
| OpenCode | claude-opus-4-7, claude-sonnet-4-6, gpt-5.5, gemini-3.1-pro, kimi-k2.6 |
| KiloCode | kilo-auto/free, kilo-auto/balanced, kilo-auto/frontier |
| Cohere | command-a-plus (128K ctx) |
| Perplexity | sonar-pro, sonar, sonar-reasoning-pro |
| Together AI | deepseek-v4, Qwen3.6, Llama-4 |
| NVIDIA NIM | deepseek-v4-pro, glm-5.1, nemotron-3 |
| Cerebras | qwen-3-235b (extremely fast) |
| Moonshot (Kimi) | kimi-k2.6 (128K ctx) |
| Zhipu (GLM) | glm-4.7 (128K ctx) |
| Hugging Face | llama-3.3-70b-instruct |
| Poe | claude-3-7-sonnet |
Model Switching
/model # Interactive picker (recent models at top, grouped by provider)
/model list # List all available models
/model claude-sonnet-4-6 # Switch by full ID
/model sonnet # Alias-based switching
--model opus # CLI flag at startup
The picker lists models from every provider in named sections, not just the active provider's models. Recent models appear at the top, alongside the active provider's default. You can select any model from any provider directly.
Model switching is instant — the next conversation turn uses the new model. Previous context is preserved.
Model Aliases
Short aliases are resolved to full model IDs: sonnet, opus, haiku, gpt5, flash, etc.
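As a rough illustration of alias resolution, the sketch below maps a short alias to a full model ID and passes unknown names through unchanged. The alias names come from the list above. The specific IDs they map to, the ALIASES table, and the resolve_model helper are assumptions for illustration, not Clew's actual implementation.

```python
# Hypothetical alias table. The alias names appear in the docs; the
# full IDs they resolve to here are assumed for illustration only.
ALIASES = {
    "sonnet": "claude-sonnet-4-6",
    "opus": "claude-opus-4-7",
    "haiku": "claude-haiku-4-5",
}


def resolve_model(name: str, aliases: dict[str, str] = ALIASES) -> str:
    """Return the full model ID for an alias, or the name itself if no alias matches."""
    cleaned = name.strip()
    # Aliases are matched case-insensitively; full IDs pass through untouched.
    return aliases.get(cleaned.lower(), cleaned)


if __name__ == "__main__":
    print(resolve_model("sonnet"))
    print(resolve_model("claude-opus-4-7"))
```

Passing unknown names through unchanged means a full ID such as claude-opus-4-7 works in the same place as an alias, which matches how /model accepts either form.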
CLI Flags for Models
| Flag | Description |
|---|---|
| --model <name> | Set model at startup (e.g., --model sonnet or --model claude-opus-4-7) |
| --effort <level> | Set reasoning effort: low, medium, high, max |
| --max-turns <N> | Limit agentic turns in non-interactive mode |
| --thinking <mode> | Thinking mode: enabled, adaptive, disabled |
| --fallback-model <model> | Fallback when primary model is overloaded (print mode only) |
| --task-budget <tokens> | API-side task budget in tokens |
| --max-budget-usd <amount> | Maximum spend on API calls (print mode only) |
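When launching Clew from a script, it can help to validate flag values before starting the process. The sketch below builds an argument list from the flags and allowed values in the table above. The build_model_args helper is hypothetical, and the name of the executable is deliberately left out because this page does not state it.

```python
# Allowed values copied from the flag table; everything else is illustrative.
EFFORT_LEVELS = ("low", "medium", "high", "max")
THINKING_MODES = ("enabled", "adaptive", "disabled")


def build_model_args(model=None, effort=None, thinking=None, max_turns=None):
    """Build a list of CLI arguments, rejecting values the table does not list."""
    args = []
    if model is not None:
        args += ["--model", model]
    if effort is not None:
        if effort not in EFFORT_LEVELS:
            raise ValueError(f"effort must be one of {EFFORT_LEVELS}, got {effort!r}")
        args += ["--effort", effort]
    if thinking is not None:
        if thinking not in THINKING_MODES:
            raise ValueError(f"thinking must be one of {THINKING_MODES}, got {thinking!r}")
        args += ["--thinking", thinking]
    if max_turns is not None:
        if int(max_turns) < 1:
            raise ValueError("max_turns must be a positive integer")
        args += ["--max-turns", str(int(max_turns))]
    return args


if __name__ == "__main__":
    print(build_model_args(model="sonnet", effort="high", max_turns=5))
```

The returned list can be appended to whatever command starts Clew in your environment, for example through subprocess.run.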
Model Capabilities
Each model declaration in providers.json includes:
- Model ID and display label
- Context window (maxContext) and max output tokens (maxOutput)
- Tool calling type (native, none)
- Vision support
- Streaming mode (full, partial)
- Reasoning support
- System prompt support
- Tags (for filtering): fast, verified, recommended, latest, free, vision, tools, native, moe, local
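To make the shape of a declaration concrete, the sketch below models the listed fields as a Python dataclass and filters a small catalog by tag. Only maxContext, maxOutput, the enumerated values, and the tag names come from this page. The remaining field names, the ModelDecl class, the filter_by_tags helper, and the max_output numbers are assumptions, so the real providers.json schema may differ.

```python
from dataclasses import dataclass, field


@dataclass
class ModelDecl:
    """Illustrative mirror of one model declaration; field names are assumed."""
    id: str
    label: str
    max_context: int               # corresponds to maxContext
    max_output: int                # corresponds to maxOutput
    tool_calling: str = "native"   # "native" or "none"
    vision: bool = False
    streaming: str = "full"        # "full" or "partial"
    reasoning: bool = False
    system_prompt: bool = True
    tags: list[str] = field(default_factory=list)


def filter_by_tags(models, required):
    """Return the models that carry every tag in `required`."""
    wanted = set(required)
    return [m for m in models if wanted <= set(m.tags)]


# Sample catalog. Context sizes follow the tables on this page;
# the max_output values are placeholders, not published limits.
MODELS = [
    ModelDecl("claude-sonnet-4-6", "Claude Sonnet 4.6", 1_000_000, 8_192,
              vision=True, reasoning=True,
              tags=["recommended", "vision", "tools", "native"]),
    ModelDecl("llama3.3", "Llama 3.3 (Ollama)", 131_000, 4_096,
              tool_calling="none", tags=["local"]),
]

if __name__ == "__main__":
    print([m.id for m in filter_by_tags(MODELS, ["vision", "tools"])])
```

Requiring every requested tag (a subset check) keeps the filter strict: asking for vision and tools returns only models that have both.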