An AI-powered subtitle translation and localization platform designed for professional workflows.
SubtiTool prioritizes precision, speed, and contextual awareness through a core-centric interface.
- Introduction
- Key Features
- System Architecture
- Technical Stack
- Getting Started
- Usage Guide
- AI Localization Engine
- Project Portability (.stproj)
- Keyboard Shortcuts
- Media Handling
- API Reference
- Project Structure
- Contributing
- License
SubtiTool is a specialized environment for subtitle translation that bridges the gap between raw machine translation and professional human refinement. It uses Large Language Models (LLMs), such as Google's Gemini models accessed through the Gemini API, to maintain narrative flow across segments, and it provides a high-performance virtualized editor for very large subtitle files.
- Semantic Context Awareness: Batch processing with overlapping segments ensures consistent terminology and tone throughout the project.
- Multi-Engine Support: Integration with Gemini Pro, Google Translate, and LibreTranslate.
- Resilient Pipeline: Per-line retries with exponential backoff and automatic failover mechanisms.
- Smart Retranslation: Individual segment re-processing with custom hints or glossary enforcement.
- Virtualized Rendering: Support for thousands of subtitle rows without performance degradation using a virtual scrolling architecture.
- Smart Timing Hub: Floating action bar for real-time playhead "punch-in" to set start and end timecodes with frame accuracy.
- Auto-Scroll Synchronization: Intelligent editor positioning that follows video playback to keep the active row centered.
- AI Snippet Refinement: Direct integration to shorten or rephrase specific text selections via AI prompts tailored for Netflix-style CPS (Characters Per Second) standards.
- ACID Compliant Storage: Persistent project management using SQLite.
- Idempotent Operations: Guarded state transitions to prevent duplicate actions or data race conditions.
- Offline Resilience: Foreground auto-save with background sync status indicators.
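The "Resilient Pipeline" behaviour listed above can be illustrated with a minimal sketch. It is not SubtiTool's actual implementation: the function name `translate_with_retry`, its parameters, and the idea of passing engines as plain callables are assumptions made for illustration only.

```python
import random
import time
from typing import Callable, Optional, Sequence


def translate_with_retry(
    text: str,
    engines: Sequence[Callable[[str], str]],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Try each engine in order, retrying each one with exponential backoff.

    Hypothetical helper: the engine callables and default values are
    assumptions, not SubtiTool's real configuration.
    """
    last_error: Optional[Exception] = None
    for engine in engines:  # failover: move to the next engine when one is exhausted
        for attempt in range(max_attempts):
            try:
                return engine(text)
            except Exception as exc:  # a real pipeline would catch narrower errors
                last_error = exc
                if attempt < max_attempts - 1:
                    # Delay doubles on each retry; a little jitter avoids synchronized retries.
                    delay = base_delay * (2 ** attempt)
                    sleep(delay + random.uniform(0, delay / 10))
    raise RuntimeError("all translation engines failed") from last_error
```

Passing `sleep` as a parameter keeps the sketch easy to test without real waiting; a production pipeline would more likely use an async-aware delay.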
- Frontend: React 19, Zustand (State Management), Wavesurfer.js (Waveform Visualization), Vanilla CSS (Custom Design System).
- Backend: Python 3.9+, FastAPI, SQLAlchemy, BackgroundTasks for asynchronous processing.
- Media: FFmpeg integration for high-performance 480p video proxy generation.
- Database: SQLite for lightweight, zero-configuration persistence.
- Node.js version 18 or higher.
- Python version 3.9 or higher.
- FFmpeg installed on the system path (required for video proxy features).
- Gemini API Key (recommended for advanced AI features).
- Clone the repository:

      git clone https://github.com/awpetrik/SubtiTool.git
      cd SubtiTool

- Configure the Backend:

      cd backend
      python -m venv .venv
      source .venv/bin/activate  # On Windows: .venv\Scripts\activate
      pip install -r requirements.txt
      uvicorn main:app --reload --port 8000

- Configure the Frontend:

      cd ../frontend
      npm install
      npm run dev
- Navigate to the Dashboard.
- Upload a `.srt` file.
- Select the source and target languages.
- Input a project description to provide context for the AI engine (e.g., "Genre: Horror, Tone: Informal").
- Click "Create Project" to initiate the background translation process.
- Monitor live progress via the completion bar.
- Segments will transition from `pending` to `ai_done` as they are processed.
- Use the Filter sidebar to focus on specific states like `flagged` or `in_review`.
- Double-click any translation cell to enter Edit Mode.
- Right-click a segment to access the Context Menu for quick actions.
- Use the Smart Timing Hub (appears at the bottom when a video is loaded) to sync timecodes with the video playhead.
- Highlight a specific word or phrase in the translation cell.
- A floating AI menu will appear.
- Select Shorten to reduce text length while maintaining meaning.
- Select Rephrase to improve natural flow based on the project context.
SubtiTool utilizes Google's Gemini API models with a specialized orchestration layer to achieve human-like localization accuracy. Unlike standard machine translation, our engine operates through several key technical layers:
The engine receives high-level metadata before it processes any segments: the movie title, genre, character descriptions, and target audience tone. This allows the AI to distinguish between informal and formal address (e.g., Indonesian lo/gue vs saya/anda) and to maintain era-appropriate vocabulary.
To prevent "translation amnesia," where pronouns or subplots are forgotten between rows, SubtiTool processes subtitles in batches of 50 lines with a 5-line context overlap. This ensures the model has a "short-term memory" of the previous dialogue exchange.
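One way to read the batching scheme above is that each batch of 50 lines is sent together with the last 5 lines of the previous batch as read-only context. The sketch below illustrates that interpretation; the function name `iter_batches` and the (context, batch) return shape are assumptions, not SubtiTool's actual code.

```python
from typing import Iterator, List, Tuple


def iter_batches(
    lines: List[str], batch_size: int = 50, overlap: int = 5
) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield (context, batch) pairs.

    `context` holds up to `overlap` lines that precede the batch, so the model
    can see the previous dialogue exchange; only `batch` is meant to be translated.
    Hypothetical helper for illustration.
    """
    for start in range(0, len(lines), batch_size):
        context = lines[max(0, start - overlap):start]
        yield context, lines[start:start + batch_size]


# Usage: 120 subtitle lines produce three batches (50, 50, 20 lines).
subtitle_lines = [f"line {i}" for i in range(120)]
for context, batch in iter_batches(subtitle_lines):
    print(len(context), len(batch))
```

Keeping the context separate from the batch means the overlapping lines are never translated twice; they only inform the model's choices.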
User-defined glossary entries are injected into the system prompt with a "Mandatory Enforcement" instruction. This instruction takes precedence over the model's default word choices, so proprietary names, brand terms, and specific localization choices remain consistent across the entire project.
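As a rough illustration of glossary injection, the sketch below assembles a system prompt from a project description and a glossary mapping. The prompt wording, the function name `build_system_prompt`, and the `source => target` line format are all assumptions; the real prompt used by SubtiTool is not shown in this document.

```python
from typing import Dict


def build_system_prompt(description: str, glossary: Dict[str, str]) -> str:
    """Compose a system prompt with an enforced glossary section.

    Hypothetical wording: only the "Mandatory Enforcement" label comes from
    the description above; everything else is illustrative.
    """
    parts = [
        "You are a subtitle localizer.",
        f"Project context: {description}",
    ]
    if glossary:
        parts.append(
            "Mandatory Enforcement: always translate the following terms exactly as listed."
        )
        parts.extend(f"- {source} => {target}" for source, target in glossary.items())
    return "\n".join(parts)


print(build_system_prompt("Genre: Horror, Tone: Informal", {"Captain": "Kapten"}))
```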
The engine is dynamically instructed to maintain a maximum of 17 Characters Per Second (CPS). If a translation is linguistically correct but too long for the segment's duration, the AI will automatically rephrase or condense the text into a more readable version without losing the core meaning.
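The CPS check itself is simple arithmetic: characters divided by the segment duration in seconds. The sketch below assumes that line breaks are not counted and that spaces are; counting conventions vary between style guides, and the function names are hypothetical.

```python
def chars_per_second(text: str, start_ms: int, end_ms: int) -> float:
    """Return the reading speed of a segment in characters per second.

    Assumption: line breaks are ignored, every other character counts.
    """
    duration_s = (end_ms - start_ms) / 1000
    if duration_s <= 0:
        raise ValueError("segment must have a positive duration")
    visible = text.replace("\n", "")
    return len(visible) / duration_s


def exceeds_cps(text: str, start_ms: int, end_ms: int, max_cps: float = 17.0) -> bool:
    """True when the segment reads faster than the configured limit."""
    return chars_per_second(text, start_ms, end_ms) > max_cps


# A 40-character line shown for 2 seconds reads at 20 CPS, above the 17 CPS limit.
print(exceeds_cps("a" * 40, 0, 2000))
```

A segment flagged by a check like this would be a candidate for the automatic condensing step described above.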
SubtiTool uses a custom .stproj format (JSON-based) to ensure your internal project state—including translation status, flags, and glossary entries—is fully portable across different installations or backups.
Unlike standard .srt files which only contain timecodes and text, an .stproj file encapsulates:
- Project Metadata: Title, source/target languages, and creation timestamps.
- Glossary: All project-specific terminology and translation notes.
- Extended Row Data: Current translation status (`ai_done`, `flagged`, etc.), CPA/CPS calculations, and review flags.
- Session State: Your last active row, currently applied filters, and bookmarked segments.
While .srt is the final export format for players, .stproj should be used for saving work-in-progress. It allows you to move your project to another computer or restore it after a database reset without losing your organizational progress.
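Because the format is JSON-based, a project file can be read and written with standard tooling. The sketch below shows what such a file might look like and how it round-trips through Python's `json` module. Every key name here is a guess made for illustration; the actual `.stproj` schema is not documented in this README.

```python
import json
from pathlib import Path

# Hypothetical layout: key names are illustrative, not SubtiTool's real schema.
example_project = {
    "metadata": {
        "title": "Demo Project",
        "source_lang": "en",
        "target_lang": "id",
        "created_at": "2024-01-01T00:00:00Z",
    },
    "glossary": [{"source": "Captain", "target": "Kapten", "note": ""}],
    "rows": [
        {
            "index": 1,
            "start": "00:00:01,000",
            "end": "00:00:03,000",
            "original": "Hello.",
            "translation": "Halo.",
            "status": "ai_done",
            "flagged": False,
        }
    ],
    "session": {"active_row": 1, "filters": ["flagged"], "bookmarks": []},
}


def save_stproj(path: Path, project: dict) -> None:
    """Write the project as UTF-8 JSON, keeping non-ASCII text readable."""
    path.write_text(json.dumps(project, ensure_ascii=False, indent=2), encoding="utf-8")


def load_stproj(path: Path) -> dict:
    """Read a project file back into a dictionary."""
    return json.loads(path.read_text(encoding="utf-8"))
```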
Press ? inside the editor for the full interactive shortcut reference.
| Category | Key | Action |
|---|---|---|
| Navigation | J / K | Move Active Row |
| Navigation | G G / G | Jump to Start / End |
| Playback | Space | Play / Pause Video |
| Playback | [ / ] | Set Start / End Timecode (Active Row) |
| Editing | Enter | Save and Move to Next |
| Editing | Tab | Save and Edit Next |
| Actions | A | Approve Segment |
| Actions | X | Toggle Selection Mode |
SubtiTool generates specialized 480p H.264 video proxies for smooth playback during editing.
- Large video files will prompt a conversion request.
- Proxy files are stored in `backend/temp_proxies` and served as static assets.
- The proxy cache can be cleared manually; otherwise it is cleaned up according to server-side retention policies.
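For reference, a 480p H.264 proxy can be produced with a fairly standard FFmpeg invocation. The sketch below only builds the argument list; the quality settings (`-preset veryfast`, `-crf 28`), the AAC audio track, and the `_480p.mp4` naming are assumptions rather than SubtiTool's actual parameters.

```python
from pathlib import Path
from typing import List


def build_proxy_command(
    source: Path, proxy_dir: Path = Path("backend/temp_proxies")
) -> List[str]:
    """Build an FFmpeg command that transcodes `source` to a 480p H.264 proxy.

    Hypothetical helper: encoder settings and output naming are illustrative.
    """
    target = proxy_dir / f"{source.stem}_480p.mp4"
    return [
        "ffmpeg", "-y",             # overwrite an existing proxy without prompting
        "-i", str(source),
        "-vf", "scale=-2:480",      # 480 px high; width keeps aspect ratio and stays even
        "-c:v", "libx264",
        "-preset", "veryfast",
        "-crf", "28",
        "-c:a", "aac",
        "-movflags", "+faststart",  # move metadata to the front for faster playback start
        str(target),
    ]


print(" ".join(build_proxy_command(Path("movie.mkv"))))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where FFmpeg is on the system path.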
SubtiTool provides a RESTful API for project management and background translation coordination.
- GET `/api/projects`: List all saved projects with completion statistics.
- POST `/api/projects`: Create a manual project entry (non-translation).
- GET `/api/projects/{id}`: Retrieve full project details, including all segments and glossary.
- PATCH `/api/projects/{id}/segments/{seg_id}`: Update a specific segment's translation, status, or flag notes.
- DELETE `/api/projects/{id}`: Permanently remove a project and all associated data.
- POST `/api/translate`: Initiate a background translation job. Requires `multipart/form-data` including the `.srt` file and project metadata.
- GET `/api/translate/{project_id}/progress`: A Server-Sent Events (SSE) endpoint to stream real-time progress updates for ongoing jobs.
- POST `/api/translate/{project_id}/refine`: AI-powered snippet refinement for selected text (Shorten/Rephrase).
- POST `/api/translate/{project_id}/retranslate/{seg_id}`: Trigger an engine retranslation for a single segment.
- POST `/api/export/{project_id}`: Generate and download the finished subtitle file. Supports SRT format with optional original/translation layout.
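The progress endpoint uses Server-Sent Events, a line-based text protocol in which each event is a group of `data:` lines terminated by a blank line. The sketch below is a minimal client-side parser for that framing. It makes no assumption about what SubtiTool puts inside each event; the JSON payload in the usage example is purely hypothetical.

```python
from typing import Iterable, Iterator, List


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event from a stream of text lines.

    Follows the basic SSE framing rules: `data:` lines accumulate, a blank
    line dispatches the event, and comment lines starting with ":" are skipped.
    Other fields (event:, id:, retry:) are ignored in this sketch.
    """
    buffer: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[5:]
            if value.startswith(" "):
                value = value[1:]  # a single leading space after the colon is dropped
            buffer.append(value)


# Usage with a hypothetical payload; the real event body may look different.
stream = ['data: {"done": 10, "total": 50}\n', "\n", ": keep-alive\n", "\n"]
for payload in iter_sse_data(stream):
    print(payload)
```

To consume the live endpoint, the same function can be fed decoded lines from an HTTP response that is read as a stream.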
SubtiTool/
├── backend/
│ ├── main.py # Application Entry Point
│ ├── routers/ # API Endpoints (Project, Translate, Proxy)
│ ├── services/ # Business Logic (Engines, Parsers, FFmpeg)
│ └── database.db # Persistence Layer
├── frontend/
│ ├── src/
│ │ ├── pages/ # View Logic (Upload, Editor)
│ │ ├── components/ # Reusable UI Elements
│ │ └── store/ # Zustand Global State
│ └── package.json # Frontend Dependencies
└── README.md # Documentation
Please review the issue tracker before submitting pull requests. Ensure all code follows the SOLID and DRY principles already established in the codebase.
This project is licensed under the GNU Affero General Public License.