Add Text-to-Speech (TTS) pipeline by yangshuwe1 · Pull Request #106 · CAHLR/OATutor

yangshuwe1 · 2026-03-10T22:59:08Z

Overview

This PR adds a full TTS pipeline that converts math content to audio, letting students listen to problem bodies, steps, and hints read aloud.

Changes

New backend pipeline (src/math-to-speech/):

autoTTSProcessor.js — main script that scans all content and generates pacedSpeech fields in hint/step/problem JSON files. Supports incremental updates (skips unchanged content via hash), --force, --dry-run, and per-type filtering (--types hints,steps,problems)
hintProcessor.js — handles LaTeX → MathML (Python/SRE) → readable speech text conversion, with a process pool (4 Python + 8 SRE workers) for fast parallel processing (~39s for 54k hints)
latexToMathML.py, sreNode.js — persistent worker processes for LaTeX conversion
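
The incremental behavior described above (skip unchanged content via hash, with --force and --dry-run) could look roughly like the sketch below. This is not the code in autoTTSProcessor.js: the `pacedSpeechHash` field, the `text` field, and the `toSpeech` callback are hypothetical names used only for illustration, and SHA-256 is an assumed choice of hash.

```javascript
// Hypothetical sketch of hash-based incremental skipping; not the PR's actual code.
const crypto = require("crypto");

// Fingerprint the source text so that any content change alters the hash.
function contentHash(text) {
  return crypto.createHash("sha256").update(text, "utf8").digest("hex");
}

// An item needs work if forced, or if its stored hash no longer matches its text.
// `item.text` and `item.pacedSpeechHash` are assumed field names.
function needsProcessing(item, { force = false } = {}) {
  if (force) return true;
  return item.pacedSpeechHash !== contentHash(item.text);
}

// `toSpeech` stands in for the real LaTeX-to-speech conversion.
function processItems(items, toSpeech, { force = false, dryRun = false } = {}) {
  let processed = 0;
  let skipped = 0;
  for (const item of items) {
    if (!needsProcessing(item, { force })) {
      skipped += 1;
      continue;
    }
    processed += 1;
    if (dryRun) continue; // count only, leave the item untouched
    item.pacedSpeech = toSpeech(item.text);
    item.pacedSpeechHash = contentHash(item.text);
  }
  return { processed, skipped };
}
```

Storing the hash next to the generated field keeps the skip decision local to each item, so a second run over unchanged content does no conversion work.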

New frontend components:

TTSPlayer — fetches audio from Lambda and handles play/pause/replay/segment chaining
TTSButtons — play/pause/replay button group
latexToReadable.js — lightweight LaTeX → readable text fallback when no pacedSpeech is available
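
The fallback described above (use pacedSpeech when present, otherwise convert the raw LaTeX) might be sketched as follows. The converter here is a deliberately tiny stand-in, not the contents of latexToReadable.js, and it assumes pacedSpeech is a plain string and that the raw content lives in a hypothetical `text` field.

```javascript
// Hypothetical, minimal LaTeX-to-text fallback; covers only a few constructs.
function simpleLatexToReadable(latex) {
  return latex
    .replace(/\\frac\{([^{}]*)\}\{([^{}]*)\}/g, "$1 over $2") // \frac{a}{b}
    .replace(/\\sqrt\{([^{}]*)\}/g, "square root of $1")      // \sqrt{x}
    .replace(/\^\{?2\}?/g, " squared")                        // ^2 or ^{2}
    .replace(/\\times/g, " times ")                           // \times
    .replace(/[$\\{}]/g, "")                                  // leftover delimiters
    .replace(/\s+/g, " ")
    .trim();
}

// Prefer pre-generated speech text; fall back to the lightweight converter.
// `item.text` is an assumed field name; pacedSpeech is assumed to be a string.
function speechTextFor(item) {
  if (typeof item.pacedSpeech === "string" && item.pacedSpeech.trim() !== "") {
    return item.pacedSpeech;
  }
  return simpleLatexToReadable(item.text);
}
```

With this split, content that has not yet been run through the backend pipeline still produces something speakable, at lower quality than the SRE-generated text.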

Modified frontend (Problem.js, ProblemCard.js, HintSystem.js):

TTS buttons integrated into problem title, step title, and each hint
All gated behind enableTTS prop

Config (config.js):

Added TTS_API_URL read from REACT_APP_TTS_AWS_ENDPOINT environment variable (mirrors how DYNAMIC_HINT_URL works for the LLM agent)
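
As a rough illustration of the config entry, the sketch below reads the endpoint from the environment and treats a missing or blank value as "TTS backend not configured". Only the names TTS_API_URL and REACT_APP_TTS_AWS_ENDPOINT come from this PR; the helper `readTtsApiUrl` and the null convention are assumptions.

```javascript
// Hypothetical helper: returns the configured endpoint, or null when unset or blank.
function readTtsApiUrl(env) {
  const url = env.REACT_APP_TTS_AWS_ENDPOINT;
  return typeof url === "string" && url.trim() !== "" ? url.trim() : null;
}

// With the variable unset, TTS_API_URL is null and callers can skip any network request.
const TTS_API_URL = readTtsApiUrl(process.env);
```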

How to enable

Add "allowTTS": true to a lesson in coursePlans.json, following the same pattern as allowDynamicHint. TTS is off by default for every lesson, so this PR makes no change to the current website.
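
A minimal sketch of the opt-in, written as a JavaScript object rather than the real file: the course and lesson fields other than allowTTS are placeholders, and `isTTSEnabled` is a hypothetical helper showing how a missing flag can be treated as off.

```javascript
// Assumed shape of coursePlans.json entries; only "allowTTS" comes from this PR.
const coursePlans = [
  {
    courseName: "Example Course", // placeholder
    lessons: [
      { id: "lesson-a", name: "Lesson A", allowTTS: true }, // opted in
      { id: "lesson-b", name: "Lesson B" },                 // flag absent: TTS stays off
    ],
  },
];

// Hypothetical helper: TTS is on only when the flag is explicitly true.
function isTTSEnabled(lesson) {
  return lesson.allowTTS === true;
}

const enabledLessonIds = coursePlans
  .flatMap((course) => course.lessons)
  .filter(isTTSEnabled)
  .map((lesson) => lesson.id);
```

Checking for a strict `true` means lessons without the key behave as before, which matches the off-by-default intent.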

Current impact

None on existing pages. No course currently enables TTS in its configuration. The content submodule is untouched: pacedSpeech fields don't exist yet, so all lessons behave exactly as before. The Lambda env var is also not configured in production yet, so no AWS calls will be made.

To generate speech text locally:

npm run process-tts          # incremental
npm run process-tts:force    # regenerate everything

This can be integrated into the incremental content update GitHub Actions workflow in the future.

ready for pr

6712fd0

yangshuwe1 self-assigned this Mar 10, 2026

yangshuwe1 wants to merge 1 commit into main from shuwei-tts
