PowerShell modules for text-to-speech (TTS) and speech-to-text (STT) across multiple providers.
| Module | TTS | STT | Requires |
|---|---|---|---|
| Speech.Windows | Offline SAPI | Offline SAPI | Windows 10/11 |
| Speech.Azure | 400+ neural voices | Real-time streaming | Azure Speech key |
| Speech.OpenAI | 11 multilingual voices | Whisper (batch) | OpenAI API key |
| Speech.Google | Standard/WaveNet/Neural2 | Batch | Google Cloud credential JSON |
| Speech.Core | — | — | (shared config, microphone, output device) |
| Cmdlet | Windows | Linux/macOS |
|---|---|---|
| Out-*Speech (all providers) | Yes | Yes |
| Read-AzureSpeech | Yes | Yes |
| Read-GoogleSpeech | Yes | Yes |
| Read-WindowsSpeech | Yes | No (SAPI) |
| Read-OpenAISpeech | Yes | No (NAudio WinMM) |
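Given the platform support above, a script can pick a provider at run time. A minimal sketch, assuming PowerShell 7 (where the automatic `$IsWindows` variable exists) and an already-configured Azure key:

```powershell
# Prefer the offline Windows provider when available; otherwise fall back
# to Azure, which also works on Linux/macOS (requires a configured key).
if ($IsWindows) {
    Out-WindowsSpeech "Hello"
} else {
    Out-AzureSpeech "Hello"
}
```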
```powershell
# Windows — no setup needed
Out-WindowsSpeech "Hello, world!"

# Azure
Set-AzureSpeechConfig -Key "your-key" -Region "eastus"
Out-AzureSpeech "Hello" -Language en-US

# OpenAI
Set-OpenAISpeechConfig -Key "sk-..."
Out-OpenAISpeech "Hello" -Voice nova

# Google
Set-GoogleSpeechConfig -Credential "path/to/key.json"
Out-GoogleSpeech "Hello"

# Speech recognition (all providers)
$text = Read-WindowsSpeech
$text = Read-AzureSpeech -Language ja-JP
$text = Read-OpenAISpeech -Language ja
$text = Read-GoogleSpeech -Language ja-JP
```

Install the modules:

```powershell
Install-PSResource Speech
```

With PowerShell.MCP, AI can configure everything for you:
```powershell
Install-PSResource PowerShell.MCP
claude mcp add PowerShell -s user -- "$(Get-MCPProxyPath)"
```

Then just ask:
- "Install the Az module and help me create an Azure Speech resource."
- "Help me set up OpenAI Speech. I don't have an API key yet."
- "Guide me through setting up Google Cloud Speech."
- "Say 'Hello world' using Windows Speech."
Windows SAPI works offline with zero configuration — the quickest way to get started.
Settings are stored in ~/Documents/PowerShell/Modules/Speech/SpeechConfig.json. API keys are masked when displayed.
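The modules define the file's actual schema; purely as an illustration (every key name below is an assumption, not documented), a saved configuration might look roughly like:

```json
{
  "Language": "en-US",
  "OutputDevice": "Speakers (Realtek)",
  "Microphone": "Headset Microphone",
  "Azure": {
    "Key": "your-key",
    "Region": "eastus",
    "Voice": "ja-JP-NanamiNeural"
  }
}
```

Inspect the real file via `Get-SpeechConfig -Path` rather than relying on this shape.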
```powershell
Get-SpeechConfig        # View all settings
Get-SpeechConfig -Path  # Get config file path
```

## Provider setup
**Azure**

```powershell
# Get key: Azure Portal > Create "Speech" resource > Keys and Endpoint
# Free tier (F0): 0.5M chars TTS + 5h STT / month
Set-AzureSpeechConfig -Key "your-key" -Region "eastus"
Get-AzureSpeech -Locale ja
Set-AzureSpeechConfig -Voice "ja-JP-NanamiNeural"
```

**OpenAI**

```powershell
# Get key: https://platform.openai.com/api-keys
Set-OpenAISpeechConfig -Key "sk-..."
Set-OpenAISpeechConfig -Voice nova -Model tts-1
```

**Google**

```powershell
# Get credential: Google Cloud Console > IAM > Service Accounts > Create key (JSON)
Set-GoogleSpeechConfig -Credential "C:\path\to\service-account.json"
Get-GoogleSpeech -Language ja-JP
Set-GoogleSpeechConfig -Voice "ja-JP-Neural2-B"
```

**Windows**

```powershell
# No API key needed. Add voices: Settings > Time & language > Speech
Get-WindowsSpeech
Set-WindowsSpeechConfig -Voice "Microsoft Haruka Desktop"
```

## Common options
All Out-*Speech cmdlets accept pipeline input and share these patterns:
```powershell
# Pipeline
"Line 1", "Line 2" | Out-AzureSpeech

# Output device selection (Tab completion available)
Out-AzureSpeech "Hello" -OutputDevice "Speakers (Realtek)"
Set-SpeechConfig -OutputDevice "Speakers (Realtek)"   # persist

# Microphone selection
Read-AzureSpeech -Microphone "Headset Microphone"
Set-SpeechConfig -Microphone "Headset Microphone"     # persist

# Parameters take priority over saved config, for all settings
Out-AzureSpeech "Hello" -Key "temp-key" -Region "westus"   # one-time override
```

With PowerShell.MCP configured, AI can speak and listen through your speakers and microphone:
- "Let's have a voice conversation in English."
- "When I type 't', start listening and respond by voice."
- "Find me a good English voice and play a sample."
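The same conversation pattern can also be scripted directly. A minimal echo-loop sketch, assuming a configured Azure key (the stop phrase and the exact punctuation of recognized text are assumptions):

```powershell
# Listen, then speak the recognized text back, until the user says "stop".
while ($true) {
    $text = Read-AzureSpeech -Language en-US      # blocks until speech is recognized
    if ([string]::IsNullOrWhiteSpace($text) -or $text -match 'stop') { break }
    Out-AzureSpeech $text                         # echo it back through the speakers
}
```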
Any MCP-compatible client that supports PowerShell.MCP can use Speech modules:
- Claude Code (CLI)
- Claude Desktop
- GitHub Copilot (VS Code)
- Any other MCP-compatible client
Each provider has 4 cmdlets following a consistent pattern:
| Verb | Purpose | Example |
|---|---|---|
| Out-*Speech | Text-to-speech | Out-AzureSpeech "Hello" |
| Read-*Speech | Speech-to-text | $text = Read-AzureSpeech |
| Get-*Speech | List voices | Get-AzureSpeech -Locale ja |
| Set-*SpeechConfig | Configure provider | Set-AzureSpeechConfig -Voice "..." |
Plus shared cmdlets in Speech.Core: Get-SpeechConfig, Set-SpeechConfig, Get-Microphone, Test-Microphone.
Use Get-Help <cmdlet> -Full for detailed documentation.
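As a quick sanity check, the shared Speech.Core cmdlets can be combined like this (the -Rate and -Volume values are arbitrary examples, not recommended defaults):

```powershell
# Verify audio devices, then persist common defaults shared by all providers.
Get-Microphone                        # enumerate input devices
Test-Microphone                       # sample the current input level
Set-SpeechConfig -Rate 0 -Volume 80   # saved to SpeechConfig.json
```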
## All 20 cmdlets
### Speech.Core — Shared configuration and audio devices

- Get-SpeechConfig — Display current configuration (-Path for file location)
- Set-SpeechConfig — Set common settings: -Rate, -Volume, -Language, -Microphone, -OutputDevice
- Get-Microphone — List audio input devices
- Test-Microphone — Test microphone input level

### Speech.Azure — Azure Cognitive Services

- Out-AzureSpeech — TTS with SSML prosody (-Rate, -Volume, -Pitch, -Language, -Voice)
- Read-AzureSpeech — Real-time streaming STT (-Language, -Detailed)
- Get-AzureSpeech — List 400+ neural voices (-Locale to filter)
- Set-AzureSpeechConfig — Set -Key, -Region, -Voice, -Pitch

### Speech.OpenAI — OpenAI Audio API

- Out-OpenAISpeech — TTS with 11 voices (-Voice, -Model, -Speed)
- Read-OpenAISpeech — Whisper batch STT (-Language, -Model)
- Get-OpenAISpeech — List available voices
- Set-OpenAISpeechConfig — Set -Key, -Voice, -Model, -STTModel

### Speech.Google — Google Cloud Speech

- Out-GoogleSpeech — TTS with Standard/WaveNet/Neural2 (-Voice, -Language, -Speed)
- Read-GoogleSpeech — Batch STT (-Language)
- Get-GoogleSpeech — List available voices (-Language to filter)
- Set-GoogleSpeechConfig — Set -Voice, -Credential

### Speech.Windows — Windows SAPI

- Out-WindowsSpeech — Offline TTS (-Voice, -Rate, -Volume)
- Read-WindowsSpeech — Offline STT (-Language, -Confidence, -Detailed)
- Get-WindowsSpeech — List installed SAPI voices (-Culture to filter)
- Set-WindowsSpeechConfig — Set -Voice
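For example, picking an installed SAPI voice from the list and persisting it might look like this sketch (the .Name property on the returned voice objects is an assumption about their shape):

```powershell
# Choose the first installed Japanese SAPI voice, persist it, and test it.
$voice = Get-WindowsSpeech -Culture ja-JP | Select-Object -First 1
Set-WindowsSpeechConfig -Voice $voice.Name
Out-WindowsSpeech "こんにちは"
```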
Most parameters support Tab or Ctrl+Space completion. Voice and language lists are fetched from each provider's API and cached for the session.
| Cmdlet | Tab-completable parameters |
|---|---|
| Out-WindowsSpeech | -Voice, -OutputDevice |
| Out-AzureSpeech | -Language, -Voice, -OutputDevice |
| Out-OpenAISpeech | -Model, -Voice, -OutputDevice |
| Out-GoogleSpeech | -Language, -Voice, -OutputDevice |
| Read-WindowsSpeech | -Culture, -Microphone |
| Read-AzureSpeech | -Language, -Microphone |
| Read-OpenAISpeech | -Language, -Model, -Microphone |
| Read-GoogleSpeech | -Language, -Microphone |
| Get-WindowsSpeech | -Culture |
| Get-AzureSpeech | -Locale |
| Get-GoogleSpeech | -Language |
| Set-*SpeechConfig | -Voice, -Microphone, -OutputDevice |
```powershell
# Language narrows the voice list
Out-AzureSpeech "Hello" -Language <Tab> -Voice <Tab>
# → en-US-JennyNeural, en-US-GuyNeural, ...

Out-OpenAISpeech "Hello" -Voice <Tab>
# → alloy, ash, ballad, coral, echo, fable, nova, onyx, sage, shimmer, verse

Read-AzureSpeech -Language <Tab>
# → en-US, ja-JP, zh-CN, ...

Read-OpenAISpeech -Microphone <Tab>
# → Headset Microphone, Microphone Array, ...
```

## Common issues
**"key not configured" / "credential not configured"**

Run the provider's Set-*SpeechConfig cmdlet. See Get-Help Set-AzureSpeechConfig -Full.

**No microphone input**

```powershell
Get-Microphone   # List devices
Test-Microphone  # Check input level (> 30 = OK)
```

**Windows STT not recognizing a language**

Install the language pack: Settings > Time & language > Language & region > Add language > "Speech" feature.
Third-party libraries: NAudio (MIT), Azure Speech SDK (MIT).