Audio to speaker-attributed transcript — extract who said what, when.
uv pip install -e ".[all]"

# Full pipeline: transcribe + diarize
voxtract process audio.m4a --json
# Transcribe only (no speaker diarization)
voxtract transcribe audio.m4a --json
# With context hints for better accuracy
voxtract process audio.m4a --context "교합력 센서, 울산대학교" --json

Environment variables (prefix `VOXTRACT_`):

| Variable | Default | Description |
|---|---|---|
| `VOXTRACT_DEVICE` | `auto` | `auto`, `cuda`, `cuda:0`, `cuda:1`, `cpu` |
| `VOXTRACT_STT_CONTEXT` | `""` | Contextual hints for ASR |
| `VOXTRACT_CHUNK_MINUTES` | `25` | Audio chunk size for long files |
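
The variables above could be set in the shell before invoking the CLI. The following is a minimal sketch, not a documented recipe: the values (`cpu`, `10`) are illustrative, and it assumes that `VOXTRACT_STT_CONTEXT` plays the same role as the `--context` flag, which this README does not confirm. The `command -v` guard is only there so the snippet is harmless on a machine where `voxtract` is not installed.

```shell
# Override the defaults for the current shell session (illustrative values).
export VOXTRACT_DEVICE="cpu"            # force CPU instead of auto-detection
export VOXTRACT_CHUNK_MINUTES="10"      # smaller chunks than the default of 25
# Assumption: this variable supplies the same hints as --context.
export VOXTRACT_STT_CONTEXT="교합력 센서, 울산대학교"

# Run the full pipeline only if the CLI is available on PATH.
if command -v voxtract >/dev/null 2>&1; then
  voxtract process audio.m4a --json
fi
```

Exported variables persist for the rest of the session; to limit an override to one run, the standard shell form `VOXTRACT_DEVICE=cpu voxtract process audio.m4a --json` sets it for that command only.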