Voice Mode
Use your voice to interact with Clew Code via microphone input and OpenAI Whisper transcription.
Overview
Voice mode captures audio from your microphone, transcribes it using OpenAI's Whisper API, and feeds the transcribed text to the agent as regular input.
Setup
1. Set OpenAI API Key
export OPENAI_API_KEY=sk-...
2. Install SoX (required for recording)
macOS:
brew install sox
Ubuntu/Debian:
sudo apt-get install sox
Windows:
Download from sox.sourceforge.net
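Both prerequisites can be checked before activating voice mode. The following is a minimal sketch, not part of Clew Code: `missingPrerequisites` is a hypothetical helper, and the caller is assumed to have already determined whether SoX is installed (for example by trying to run `sox --version`).

```typescript
// Hypothetical preflight check for the two setup steps above.
// `env` is a plain map of environment variables; `soxInstalled` is
// supplied by the caller, since detecting SoX is platform-specific.
function missingPrerequisites(
  env: Record<string, string | undefined>,
  soxInstalled: boolean,
): string[] {
  const missing: string[] = [];
  const key = env["OPENAI_API_KEY"];
  if (!key || key.trim() === "") {
    missing.push("OPENAI_API_KEY is not set");
  }
  if (!soxInstalled) {
    missing.push("SoX is not installed or not on PATH");
  }
  return missing;
}
```

In a Node.js process, `process.env` could be passed as the first argument; an empty result means both steps are complete.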
Usage
❯ /voice # activate voice mode
❯ /voice status # check voice session status
❯ /voice off # deactivate voice mode
Once voice mode is active, press F2 to start recording and press F2 again to stop.
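The commands and the F2 toggle can be modeled as a small state machine. This is an illustrative sketch only: `parseVoiceCommand` and `VoiceSessionState` are hypothetical names, and the behavior for edge cases (F2 pressed while voice mode is off, `/voice off` issued mid-recording) is an assumption rather than documented Clew Code behavior.

```typescript
type VoiceCommand = "on" | "status" | "off";

// Hypothetical parser for the three /voice forms listed above.
// Returns null for anything that is not a recognized /voice command.
function parseVoiceCommand(input: string): VoiceCommand | null {
  const parts = input.trim().split(/\s+/);
  if (parts[0] !== "/voice" || parts.length > 2) return null;
  if (parts.length === 1) return "on";
  const sub = parts[1];
  if (sub === "status" || sub === "off") return sub;
  return null;
}

// Hypothetical session state: F2 only toggles recording while voice mode is active.
class VoiceSessionState {
  private active = false;
  private recording = false;

  apply(cmd: VoiceCommand): string {
    if (cmd === "on") this.active = true;
    if (cmd === "off") {
      // Assumption: deactivating voice mode also ends any recording in progress.
      this.active = false;
      this.recording = false;
    }
    return this.active ? (this.recording ? "recording" : "idle") : "off";
  }

  pressF2(): "started" | "stopped" | "ignored" {
    if (!this.active) return "ignored";
    this.recording = !this.recording;
    return this.recording ? "started" : "stopped";
  }
}
```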
How It Works
User presses F2
↓
VoiceRecorder.start() → Microphone → PCM audio chunks
↓
VoiceRecorder.stop() → Buffer.concat → WAV conversion
↓
VoiceWhisper.transcribe() → OpenAI Whisper API
↓
Session.handleInput(transcribedText) → Normal command flow
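The "Buffer.concat → WAV conversion" step amounts to joining the recorded PCM chunks and prepending a 44-byte RIFF/WAVE header. The sketch below shows one way to do that; it is not the code in `voiceRecorder.ts`. The function name `pcmToWav` is hypothetical, and the defaults (16-bit little-endian samples, mono, 16 kHz) are assumptions about the recording format. It uses `Uint8Array` in place of Node's `Buffer` so that it has no dependencies.

```typescript
// Hypothetical helper: concatenates raw 16-bit little-endian PCM chunks
// and wraps them in a canonical 44-byte WAV header.
function pcmToWav(
  chunks: Uint8Array[],
  sampleRate: number = 16000, // assumed recording rate
  channels: number = 1,       // assumed mono input
): Uint8Array {
  const bitsPerSample = 16;
  const blockAlign = (channels * bitsPerSample) / 8; // bytes per sample frame
  const dataLength = chunks.reduce((total, chunk) => total + chunk.length, 0);

  const wav = new Uint8Array(44 + dataLength);
  const view = new DataView(wav.buffer);
  const writeTag = (offset: number, tag: string): void => {
    for (let i = 0; i < tag.length; i++) {
      view.setUint8(offset + i, tag.charCodeAt(i));
    }
  };

  writeTag(0, "RIFF");
  view.setUint32(4, 36 + dataLength, true); // file size minus the first 8 bytes
  writeTag(8, "WAVE");
  writeTag(12, "fmt ");
  view.setUint32(16, 16, true);             // fmt chunk size for PCM
  view.setUint16(20, 1, true);              // audio format 1 = uncompressed PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * blockAlign, true); // byte rate
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitsPerSample, true);
  writeTag(36, "data");
  view.setUint32(40, dataLength, true);

  // Copy the PCM chunks in order, directly after the header.
  let offset = 44;
  for (const chunk of chunks) {
    wav.set(chunk, offset);
    offset += chunk.length;
  }
  return wav;
}
```

The resulting bytes can be written to a `.wav` file or sent as the audio payload of a transcription request.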
Key Files
| File | Purpose |
|---|---|
| src/tools/voiceRecorder.ts | Microphone recording via node-record-lpcm16 |
| src/tools/voiceWhisper.ts | OpenAI Whisper API for speech-to-text |
| src/tools/voiceTools.ts | Voice session management (start/stop/transcribe) |