Turn technical reports into audio summaries
FeatureCast is an AI agent system-prompt with simple mcp-server that converts engineering reports and technical documentation into audio segments. Built with Claude but works with any LLM that supports custom agents, it uses an investigative journalism approach to compare what was built against what was specified, then generates scripts optimized for text-to-speech.
Engineering reports are dense and time-consuming. FeatureCast helps by:
- Comparing implementation reports against specifications
- Identifying gaps and questioning unstated assumptions
- Creating audio-friendly scripts in an NPR-style narrative format
- Converting to audio via text-to-speech for hands-free listening
Technical Specification ────────┐
├─► FeatureCast Agent ──► Audio Script ──► TTS ──► Audio File
Engineering Report ─────────────┘
The agent acts as a technical correspondent, analyzing documentation with professional skepticism. It doesn't just summarize—it investigates what's missing, questions claims, and explains why technical decisions matter.
- Docker - For running the TTS server
- Node.js 16+ - For the MCP server
- ffmpeg - For audio processing
- Claude Desktop or similar - For running the agent
The project includes:
- FeatureCast Agent (
feature-caster-agent/): The investigative journalist AI that generates scripts - MCP Server (
featurecast-mcp/): Tool for converting scripts to audio files - TTS Server (
docker-compose.yml): Wyoming-Piper for text-to-speech (easily swappable with ElevenLabs or other TTS services)
-
Start the TTS server:
docker-compose up -d
This runs Wyoming-Piper on port 5000.
-
Set up the MCP server:
cd featurecast-mcp npm install npm run buildSee
featurecast-mcp/README.mdfor detailed MCP configuration with Claude Desktop. -
Configure environment: Create
.envinfeaturecast-mcp/:BASE_PROJECT_PATH=/path/to/your/project TTS_SERVER_URL=http://localhost:5000/api/text-to-speech TTS_VOICE=en_US-hfc_female-medium TTS_TIMEOUT_MS=180000 # Adjust for transcript length # Audio padding (Piper TTS starts abruptly) AUDIO_PREROLL_MS=750 AUDIO_POSTROLL_MS=1000
-
Configure Claude Desktop: Add to
~/Library/Application Support/Claude/claude_desktop_config.json:{ "mcpServers": { "featurecast": { "command": "node", "args": ["/path/to/featurecast-mcp/dist/server.js"], "env": { "TTS_SERVER_URL": "http://localhost:5000/api/text-to-speech", "BASE_PROJECT_PATH": "/path/to/your/project" } } } }Then restart Claude Desktop.
Here's a typical end-to-end flow:
-
You have technical docs:
requirements.md- What the feature should doengineer-report.md- What got built
-
Feed them to the FeatureCast agent:
"I have a PRP spec and an engineer's implementation report. Can you analyze these and create an audio script comparing what was supposed to be built vs what actually got built?" -
Agent generates a script: The agent produces something like:
Engineering Report. Today we're examining the authentication system implementation... The specification called for OAuth2 with refresh tokens, but our investigation reveals the engineer opted for simpler JWT tokens... Let's explore why this matters... -
Convert to audio via MCP: The MCP tool
generate_audio_casttakes the script and:- Sends it to the TTS server
- Adds pre/post-roll silence with ffmpeg
- Saves both the script and audio files
-
Listen to your technical update: You get a
.wavfile that sounds like an NPR segment about your code.
The example-output/ directory contains:
- Sample script files (
.md) - Narrative text optimized for speech - Audio files (
.wav) - The final TTS output
The agent approaches documentation like an NPR correspondent:
- Compares specs against implementation
- Questions missing information
- Explains technical implications
- Maintains professional skepticism
Scripts are written specifically for listening:
- Continuous narrative flow (no bullets or headers)
- Natural speech transitions
- Spelled-out acronyms on first use
- Pacing markers with ellipses
The agent improves through INSIGHTS.md, building a library of:
- Effective narrative techniques
- Proven analogies and explanations
- Communication patterns that work
- Anti-patterns to avoid
├── docker-compose.yml # TTS server configuration
├── example-output/ # Sample scripts and audio
├── feature-caster-agent/
│ ├── system-prompt.md # Agent instructions
│ └── INSIGHTS.md # Learning library
├── featurecast-mcp/ # MCP server for audio generation
│ ├── src/ # TypeScript source
│ ├── README.md # MCP setup guide
│ └── package.json
└── images/ # Documentation assets
- Sprint report reviews during commutes
- Technical documentation for visual fatigue
- Architecture decision records as audio
- Code review summaries for team updates
- Onboarding materials in audio format
Audio documentation offers some advantages:
- Listen while doing other tasks
- Different cognitive processing than reading
- Good for complex technical narratives
- Useful when screen time needs limiting
- TTS: Wyoming-Piper server (Docker) - swap with ElevenLabs API for higher quality
- MCP: Model Context Protocol for Claude Desktop integration
- Audio Processing: ffmpeg for pre/post-roll silence
- Input: Markdown/text technical documentation
- Output: WAV audio files with narrative scripts
- Agent Compatibility: Built with Claude, works with GPT-4, Gemini, or any LLM that supports custom instructions
The project evolves through usage:
- Share narrative techniques that work well
- Contribute example outputs
- Suggest improvements to the investigative approach
- Help refine the audio optimization
MIT License
Making technical documentation listenable, one report at a time.
