kerby2000/explanation

ElevenLabs Russian Voice Note Batch Generator

Small Python project that batch-generates Russian "объяснительные" (formal explanatory notes) as voice notes using the ElevenLabs HTTP API.

  • Input: scripts/samples.json or scripts/samples.txt
  • Output: out/*.mp3
  • Optional conversion: out_opus/*.ogg (Opus via ffmpeg)

Requirements

  • Python 3.9+
  • ElevenLabs API key
  • Internet access
  • Optional: ffmpeg for Opus conversion

Setup

Windows (PowerShell)

python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
Copy-Item .env.example .env

Then open .env and set ELEVENLABS_API_KEY.

macOS / Linux

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env

Then edit .env and set ELEVENLABS_API_KEY.

List Available Voices

python list_voices.py

This prints each voice name and voice_id so you can pick a suitable male voice.

Find Russian Male Voices (Voice Library)

Your account list may show mostly default US/UK voices. To search the shared Voice Library:

python list_voices.py --shared --language ru --gender male --page-size 20

This prints both voice_id and owner (public_owner_id).

Add one shared voice to your account:

python list_voices.py --add-owner-id <PUBLIC_OWNER_ID> --add-voice-id <VOICE_ID> --new-name "RU Male 1"

Then verify it appears in your account:

python list_voices.py

Important:

  • Adding shared voices via API requires an API key with voices_write permission.
  • Using Voice Library voices via the API requires a paid ElevenLabs plan (the free plan returns HTTP 402 for library voices).
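
The search and add steps above can be sketched as plain HTTP request builders. This is only an illustration: the endpoint paths and the `new_name` body field follow the public ElevenLabs API as I understand it, and the helper names (`shared_voices_url`, `add_shared_voice_request`, `explain_status`) are hypothetical, not taken from list_voices.py.

```python
from urllib.parse import quote, urlencode

API_BASE = "https://api.elevenlabs.io/v1"  # assumed public API base URL


def shared_voices_url(language="ru", gender="male", page_size=20):
    """Build the Voice Library search URL (mirrors --shared --language --gender --page-size)."""
    query = urlencode({"language": language, "gender": gender, "page_size": page_size})
    return f"{API_BASE}/shared-voices?{query}"


def add_shared_voice_request(public_owner_id, voice_id, new_name):
    """Return (url, json_body) for adding a shared voice to the account (sent as a POST)."""
    url = (
        f"{API_BASE}/voices/add/"
        f"{quote(public_owner_id, safe='')}/{quote(voice_id, safe='')}"
    )
    return url, {"new_name": new_name}


def explain_status(status):
    """Translate the failure mode documented above into a readable hint."""
    if status == 402:
        return "Voice Library voices need a paid plan (HTTP 402)."
    return f"HTTP {status}"


if __name__ == "__main__":
    print(shared_voices_url())
    print(add_shared_voice_request("OWNER", "VOICE", "RU Male 1"))
```

Both requests would carry the API key in the `xi-api-key` header; the add request additionally needs the voices_write permission noted above.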

Generate Audio (Batch)

python tts_batch.py --voice-id <VOICE_ID>

If ELEVENLABS_VOICE_ID is set in .env, --voice-id is optional.
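
A minimal sketch of that fallback, assuming the command-line flag takes precedence over the environment value. The function name `resolve_voice_id` is hypothetical and the real tts_batch.py may resolve the value differently.

```python
import os
from typing import Mapping, Optional


def resolve_voice_id(cli_value: Optional[str], env: Optional[Mapping[str, str]] = None) -> str:
    """Prefer --voice-id; otherwise fall back to ELEVENLABS_VOICE_ID."""
    env = os.environ if env is None else env
    voice_id = (cli_value or env.get("ELEVENLABS_VOICE_ID", "")).strip()
    if not voice_id:
        # Neither source provided a voice, so stop with a clear message.
        raise SystemExit("Pass --voice-id or set ELEVENLABS_VOICE_ID in .env")
    return voice_id


if __name__ == "__main__":
    print(resolve_voice_id(None, {"ELEVENLABS_VOICE_ID": "example-voice"}))
```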

Useful options:

python tts_batch.py \
  --in scripts/samples.json \
  --out out \
  --format mp3_44100_128 \
  --model eleven_multilingual_v2 \
  --stability 0.5 \
  --similarity 0.8 \
  --style 0.0 \
  --speaker-boost \
  --sleep-ms 0
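
For orientation, here is a rough sketch of how these options could map onto a single ElevenLabs text-to-speech request. The mapping of flags to `voice_settings` fields is an assumption based on the public API, and `build_tts_request` is a hypothetical helper, not the script's actual code.

```python
import json
from urllib.parse import quote, urlencode

API_BASE = "https://api.elevenlabs.io/v1"  # assumed public API base URL


def build_tts_request(api_key, voice_id, text, *, output_format="mp3_44100_128",
                      model="eleven_multilingual_v2", stability=0.5,
                      similarity=0.8, style=0.0, speaker_boost=True):
    """Return (url, headers, body) for one text-to-speech POST request."""
    url = (
        f"{API_BASE}/text-to-speech/{quote(voice_id, safe='')}?"
        + urlencode({"output_format": output_format})  # --format
    )
    headers = {"xi-api-key": api_key, "Content-Type": "application/json"}
    payload = {
        "text": text,
        "model_id": model,                       # --model
        "voice_settings": {
            "stability": stability,              # --stability
            "similarity_boost": similarity,      # --similarity
            "style": style,                      # --style
            "use_speaker_boost": speaker_boost,  # --speaker-boost
        },
    }
    # ensure_ascii=False keeps Cyrillic text readable in the encoded body.
    return url, headers, json.dumps(payload, ensure_ascii=False).encode("utf-8")


if __name__ == "__main__":
    url, headers, body = build_tts_request("KEY", "VOICE", "Здравствуйте")
    print(url)
    print(body.decode("utf-8"))
```

The response body is the audio itself, which a batch script would write to a file under the --out directory, pausing --sleep-ms between requests.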

Notes:

  • --in supports:
    • JSON: {"scripts": ["...", "..."]}
    • TXT: blocks separated by blank lines
  • Filenames are generated as: {index:02d}_{voice}_{slug}.mp3
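
The input formats and the filename pattern can be illustrated roughly as follows. The helper names, the 1-based index, and the slug rule (keep Unicode letters and digits, cap at 40 characters) are assumptions for this sketch; the actual slug logic in tts_batch.py may differ.

```python
import json
import re
from pathlib import Path


def parse_txt_blocks(text):
    """Split plain text into scripts separated by one or more blank lines."""
    normalized = text.replace("\r\n", "\n").strip()
    blocks = re.split(r"\n\s*\n", normalized)
    return [block.strip() for block in blocks if block.strip()]


def load_scripts(path):
    """Load scripts from {"scripts": [...]} JSON or from blank-line-separated TXT."""
    path = Path(path)
    raw = path.read_text(encoding="utf-8")
    if path.suffix.lower() == ".json":
        return list(json.loads(raw)["scripts"])
    return parse_txt_blocks(raw)


def make_filename(index, voice, text, max_slug=40):
    """Build {index:02d}_{voice}_{slug}.mp3 (slug rule is hypothetical)."""
    slug = re.sub(r"[^\w]+", "_", text.lower())[:max_slug].strip("_") or "untitled"
    return f"{index:02d}_{voice}_{slug}.mp3"


if __name__ == "__main__":
    for i, script in enumerate(parse_txt_blocks("Первый текст\n\nВторой текст"), start=1):
        print(make_filename(i, "liam", script))
```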

Optional: Convert MP3 to WhatsApp-like OGG Opus

python convert_to_opus.py

This scans out/*.mp3 and creates out_opus/*.ogg via:

ffmpeg -y -i input.mp3 -c:a libopus -b:a 24k -vbr on output.ogg

If ffmpeg is missing, the script prints installation guidance and exits.
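
A minimal sketch of the same conversion loop, assuming the default out and out_opus directories. The function names are hypothetical; only the ffmpeg arguments are taken from the command shown above.

```python
import shutil
import subprocess
from pathlib import Path


def opus_command(src, dst, bitrate="24k"):
    """Build the ffmpeg argument list for one MP3 -> OGG Opus conversion."""
    return ["ffmpeg", "-y", "-i", str(src),
            "-c:a", "libopus", "-b:a", bitrate, "-vbr", "on", str(dst)]


def convert_all(in_dir="out", out_dir="out_opus"):
    """Convert every MP3 in in_dir; exit with guidance if ffmpeg is missing."""
    if shutil.which("ffmpeg") is None:
        raise SystemExit("ffmpeg not found on PATH; install it and try again.")
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    converted = []
    for mp3 in sorted(Path(in_dir).glob("*.mp3")):
        ogg = out_path / (mp3.stem + ".ogg")
        # check=True raises CalledProcessError if ffmpeg reports a failure.
        subprocess.run(opus_command(mp3, ogg), check=True)
        converted.append(ogg)
    return converted


if __name__ == "__main__":
    for path in convert_all():
        print(path)
```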

Web Mockup (Paper + Voice Waveform)

A static webpage is included at web/index.html that mimics the original picture style and renders a downsampled waveform from an MP3.
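
To illustrate what "downsampled waveform" means here, the sketch below reduces decoded samples to one peak value per display bar. It is a Python illustration of the general idea only; the page itself presumably does this in browser JavaScript, and `downsample_peaks` is a hypothetical name.

```python
def downsample_peaks(samples, buckets):
    """Reduce samples to `buckets` values, each the peak |amplitude| of its slice."""
    if buckets <= 0:
        raise ValueError("buckets must be positive")
    n = len(samples)
    if n == 0:
        return [0.0] * buckets
    peaks = []
    for i in range(buckets):
        start = i * n // buckets
        # Guarantee a non-empty slice even when there are fewer samples than buckets.
        end = max(start + 1, (i + 1) * n // buckets)
        peaks.append(max(abs(s) for s in samples[start:end]))
    return peaks


if __name__ == "__main__":
    print(downsample_peaks([0.1, -0.5, 0.2, 0.9, -0.3, 0.0], 3))  # [0.5, 0.9, 0.3]
```

Taking the peak rather than the mean keeps short loud moments visible, which suits a voice-note style bar display.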

Default audio source:

  • out/compare_liam_best/01_liam_script.mp3

Run a local server from project root:

python -m http.server 8000

Open:

  • http://localhost:8000/web/

Optional: use a different MP3 file via query parameter:

  • http://localhost:8000/web/?audio=../out/free_chris/01_script.mp3

Match Phrase To Target Waveform Pattern

Use the helper script to generate and score candidate texts against a target pattern:

  • quiet intro: ~2-3s
  • loud explanation: ~10-12s
  • short bridge/pause phrase: ~2s
  • calmer ending/apology: ~3-4s
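
One way to picture scoring against this pattern is sketched below. Everything in it is an assumption made for illustration: the segment durations are midpoints of the ranges above, the relative loudness levels are invented, and the distance measure (mean absolute difference between peak-normalized envelopes) is not necessarily what match_waveform_phrase.py uses.

```python
# (label, seconds, relative loudness 0..1); levels are illustrative guesses.
TARGET_SEGMENTS = [
    ("quiet intro", 2.5, 0.3),
    ("loud explanation", 11.0, 1.0),
    ("short bridge/pause", 2.0, 0.2),
    ("calmer ending/apology", 3.5, 0.5),
]


def target_envelope(segments, step=0.5):
    """Expand segments into one loudness value per `step` seconds."""
    envelope = []
    for _label, seconds, level in segments:
        envelope.extend([level] * round(seconds / step))
    return envelope


def resample(values, length):
    """Nearest-neighbour resampling so envelopes of different lengths compare."""
    n = len(values)
    return [values[min(n - 1, i * n // length)] for i in range(length)]


def pattern_distance(candidate, target):
    """Mean absolute difference after peak normalization; lower is a better match."""
    if not candidate or not target:
        raise ValueError("envelopes must be non-empty")
    peak = max(candidate) or 1.0
    scaled = resample([v / peak for v in candidate], len(target))
    return sum(abs(a - b) for a, b in zip(scaled, target)) / len(target)


if __name__ == "__main__":
    target = target_envelope(TARGET_SEGMENTS)
    flat = [0.6] * 40  # a monotone candidate should score worse than the target itself
    print(pattern_distance(target, target), pattern_distance(flat, target))
```

Candidates would be ranked by ascending distance, with the candidate envelope computed from each generated MP3 (for example as per-window RMS loudness).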

Run:

python match_waveform_phrase.py --voice-id TX3LPaxmHKxFdv7VOQHJ --in scripts/waveform_candidates.txt --out out/wave_match --target-seconds 20

Outputs:

  • Generated candidates: out/wave_match/*.mp3
  • Ranking report: out/wave_match/match_report.json
  • Best text: scripts/one_example_best.txt

Ethical Use

Do not imitate real people without permission. Prefer generic/synthetic voices for parody, fiction, and harmless experimentation.
