Enhanced RVC Training System with 20 optimizers, full TypeScript WebUI, Python FastAPI backend, WebSocket real-time monitoring, and Google Colab support.
Colab · Install · WebUI · Optimizers · Workflow
Built on PolTrain · Side project of RVC Starter
```
┌────────────────────────────────────────┐
│          TypeScript Frontend           │
│         Next.js 16 · Port 3000         │
│  Dashboard · Config · Monitor · Guide  │
├────────────────────────────────────────┤
│            WebSocket Bridge            │
│            Bun · Port 3003             │
│ REST Proxy + WS Relay + Auto-Reconnect │
├────────────────────────────────────────┤
│             Python Backend             │
│          FastAPI · Port 7861           │
│  Training · GPU · System · WebSocket   │
└────────────────────────────────────────┘
```
| Layer | Technology | Port | Purpose |
| --- | --- | --- | --- |
| Frontend | Next.js 16, React 19, TypeScript 5, Tailwind CSS 4, shadcn/ui | 3000 | WebUI with 4 tabs |
| Bridge | Bun, native WebSocket + ws library | 3003 | REST proxy + WebSocket relay |
| Backend | Python 3.8+, FastAPI, Uvicorn, PyTorch 2.0+ | 7861 | Training pipeline + GPU management |
Open colab_webui.ipynb in Google Colab and run all cells. It automatically handles everything:
| Step | What happens |
| --- | --- |
| GPU Check | Detects GPU name, VRAM, and temperature |
| Install Deps | Installs PyTorch (CUDA), FastAPI, Node.js, and all dependencies |
| Download Models | Fetches RMVPE + ContentVec pre-trained models |
| Upload Dataset | Connects Google Drive or uploads audio files directly |
| Start Backend | Launches the Python FastAPI server on port 7861 |
| Build Frontend | Installs npm packages and starts Next.js on port 3000 |
| ngrok Tunnels | Creates public URLs for remote access from any device |
The entire process is idempotent — safe to re-run any cell. Works with Colab's free T4 GPU (16GB VRAM).
- Use the T4 GPU (free) for models up to batch size 8
- Set the optimizer to AdamW or Ranger for best results on the T4
- Use Adafactor if you hit OOM errors (lowest VRAM usage)
- Connect Google Drive for persistent storage across sessions
- 300 epochs takes roughly 1-2 hours on a T4, depending on dataset size
- Python 3.8+
- Node.js 18+ / Bun 1.0+
- PyTorch 2.0+ with CUDA support (optional; CPU/MPS also works)
- GPU: NVIDIA with CUDA 11.7+ (optional)
- RAM: 8GB+ recommended
Step 1: Clone & Install Python Dependencies
```bash
git clone https://github.com/BF667-IDLE/VCTrain.git
cd VCTrain

# Install Python ML dependencies
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt

# Install Python backend dependencies
pip install fastapi "uvicorn[standard]" websockets
```
Step 2: Install Frontend Dependencies
```bash
# Using bun (recommended)
bun install

# Or using npm
npm install
```
Step 3: Install WebSocket Bridge
```bash
cd mini-services/ws-bridge
bun install
cd ../..
```
Step 4: Start All Services
Open three terminal windows:
```bash
# Terminal 1 — Python Backend (port 7861)
python -m webui.server

# Terminal 2 — WebSocket Bridge (port 3003)
cd mini-services/ws-bridge && bun run dev

# Terminal 3 — Next.js Frontend (port 3000)
bun run dev
```
Navigate to http://localhost:3000 in your browser.
Note: The WebUI works in demo mode even without the Python backend running. When the backend is offline, it shows mock data with a "Backend Not Connected" banner. Start the backend to switch to live training mode.
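If you are wiring your own tooling to the backend, a quick reachability probe against the documented GET /api/health endpoint is enough to decide between live and demo mode. A minimal sketch (the timeout value here is an arbitrary choice, not from the project):

```python
import urllib.error
import urllib.request

def backend_online(base_url: str = "http://localhost:7861", timeout: float = 2.0) -> bool:
    """Return True if the FastAPI backend answers GET /api/health."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused or timed out: backend offline, WebUI falls back to demo mode
        return False
```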
- Real-time experiment list fetched from the filesystem
- Active training jobs with live status (running / completed / failed)
- GPU monitoring — GPU name, memory usage, CUDA version
- Connection indicator — green dot when backend is online, gray when offline
- Quick action buttons — New Training, Compare Models, GPU Monitor
Complete form matching all train.py CLI arguments:

- Experiment directory, model name, total epochs, save interval, batch size
- Sample rate (32k / 40k / 48k), vocoder (HiFi-GAN / MRF / RefineGAN)
- Selection from all 20 optimizers with a live info panel
- Pretrained model paths, GPU device IDs, save-to-ZIP toggle
- Live CLI command preview — see the exact command that will run, with a copy button
- Starts real training via the FastAPI backend when connected
- Shows the job ID on success and auto-switches to the Monitor tab
- Real-time WebSocket metrics — losses, mel similarity, gradient norms, learning rate
- 4 interactive Recharts charts:
  - Loss Curves (discriminator, generator, mel, KL)
  - Mel Spectrogram Similarity (%) over epochs
  - Gradient Norms (generator vs discriminator)
  - Learning Rate Schedule with cosine decay visualization
- Scrollable training log viewer — see raw training output in real time
- Demo mode — shows realistic mock data when the backend is offline
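The Learning Rate Schedule chart plots cosine decay. The curve itself is simple to reproduce; the sketch below assumes a hypothetical base LR and no warmup, so the exact schedule in train.py may differ:

```python
import math

def cosine_lr(epoch: int, total_epochs: int,
              base_lr: float = 2e-4, min_lr: float = 0.0) -> float:
    """Cosine-decayed learning rate: starts at base_lr, ends at min_lr."""
    progress = min(max(epoch / total_epochs, 0.0), 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```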
- 6 quick recommendation cards: Best Overall, Fastest, Memory Efficient, Zero LR Tuning, Maximum Quality, Large Batch
- All 20 optimizers organized by category with expandable detail cards
- Star ratings for Speed, Quality, Memory Efficiency, and Stability
- Click any optimizer card to see full details: description, recommended LR range, key feature, best use case
```
VCTrain/
├── rvc/                          # Core training code (Python)
│   ├── train/
│   │   ├── train.py              # Main training script (20 optimizers)
│   │   ├── utils/
│   │   │   ├── optimizers/       # 20 optimizer implementations
│   │   │   │   ├── Adam.py
│   │   │   │   ├── AdamW.py
│   │   │   │   ├── AdamP.py
│   │   │   │   ├── AdaBelief.py
│   │   │   │   ├── AdaBeliefV2.py
│   │   │   │   ├── Adafactor.py
│   │   │   │   ├── AMSGrad.py
│   │   │   │   ├── Apollo.py
│   │   │   │   ├── CAME.py
│   │   │   │   ├── DAdaptAdam.py
│   │   │   │   ├── LAMB.py
│   │   │   │   ├── Lion.py
│   │   │   │   ├── Lookahead.py
│   │   │   │   ├── NovoGrad.py
│   │   │   │   ├── Prodigy.py
│   │   │   │   ├── RAdam.py
│   │   │   │   ├── Ranger.py
│   │   │   │   ├── SignSGD.py
│   │   │   │   ├── SGD.py
│   │   │   │   └── Sophia.py
│   │   │   ├── train_utils.py
│   │   │   └── data_utils.py
│   │   ├── preprocess/           # Audio preprocessing
│   │   ├── losses.py             # GAN loss functions
│   │   ├── mel_processing.py     # Mel spectrogram processing
│   │   └── visualization.py      # TensorBoard logging
│   ├── lib/                      # Model architectures
│   │   ├── algorithm/            # Synthesizer, discriminator, generator
│   │   └── configs/              # Sample rate configs (32k/40k/48k)
│   └── configs/                  # JSON config templates
│
├── webui/                        # Python Backend (FastAPI)
│   ├── __init__.py
│   ├── server.py                 # FastAPI server (port 7861)
│   └── requirements.txt          # fastapi, uvicorn, websockets
│
├── src/                          # TypeScript Frontend (Next.js)
│   ├── app/
│   │   ├── page.tsx              # Main page with tab navigation
│   │   ├── layout.tsx            # Root layout with QueryProvider
│   │   ├── globals.css           # Theme colors (amber/orange)
│   │   └── api/training/route.ts # CLI command generator API
│   ├── components/
│   │   ├── vctrain/              # Main tab components
│   │   │   ├── dashboard-tab.tsx
│   │   │   ├── training-config-tab.tsx
│   │   │   ├── training-monitor-tab.tsx
│   │   │   └── optimizer-guide-tab.tsx
│   │   └── ui/                   # shadcn/ui components
│   ├── lib/
│   │   ├── api.ts                # REST client + WebSocket hook
│   │   ├── store.ts              # Zustand state management
│   │   ├── query-provider.tsx    # React Query configuration
│   │   └── training-data.ts      # Optimizer definitions + mock data
│   └── types/
│       └── vctrain.ts            # TypeScript interfaces
│
├── mini-services/                # WebSocket Bridge
│   └── ws-bridge/
│       ├── index.ts              # Bridge server (port 3003)
│       ├── package.json
│       └── tsconfig.json
│
├── colab.ipynb                   # Original Colab notebook (CLI)
├── colab_webui.ipynb             # New Colab notebook (WebUI)
├── package.json                  # Frontend dependencies
├── requirements.txt              # Python ML dependencies
└── download_files.py             # Pre-trained model downloader
```
```bash
# Default training with AdamW
python rvc/train/train.py \
  --experiment_dir "experiments" \
  --model_name "my_voice" \
  --optimizer "AdamW" \
  --total_epoch 300 \
  --batch_size 8 \
  --sample_rate 48000 \
  --gpus "0"

# With Ranger for best generalization
python rvc/train/train.py \
  --experiment_dir "experiments" \
  --model_name "my_voice" \
  --optimizer "Ranger" \
  --total_epoch 300 \
  --batch_size 8

# With Prodigy (no LR tuning needed!)
python rvc/train/train.py \
  --experiment_dir "experiments" \
  --model_name "my_voice" \
  --optimizer "Prodigy" \
  --total_epoch 300

# Memory-efficient with Adafactor
python rvc/train/train.py \
  --experiment_dir "experiments" \
  --model_name "my_voice" \
  --optimizer "Adafactor" \
  --total_epoch 300

# Multi-GPU training (GPUs 0 and 1)
python rvc/train/train.py \
  --experiment_dir "experiments" \
  --model_name "my_voice" \
  --optimizer "Sophia" \
  --gpus "0-1"
```
1. Open http://localhost:3000 (or the Colab ngrok URL)
2. Go to the Training Config tab
3. Fill in the model name and adjust parameters
4. Select your preferred optimizer from the dropdown
5. Click Start Training — it switches to the Monitor tab automatically
6. Watch the real-time charts and logs update live
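The same job can be started without the WebUI by POSTing to the backend. A sketch of building that request follows; the JSON field names mirror the train.py flags and are assumptions, not the verified request schema:

```python
import json
from urllib import request

def build_start_request(model_name: str, optimizer: str = "AdamW",
                        total_epoch: int = 300, batch_size: int = 8) -> request.Request:
    """Build a POST /api/training/start request (send it with urlopen)."""
    payload = {
        "model_name": model_name,   # field names assumed from the CLI flags
        "optimizer": optimizer,
        "total_epoch": total_epoch,
        "batch_size": batch_size,
    }
    return request.Request(
        "http://localhost:7861/api/training/start",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```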
| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /api/health | Backend health check |
| POST | /api/training/start | Start a new training job |
| GET | /api/training/status | Get all job statuses |
| POST | /api/training/stop/{job_id} | Stop a running job |
| DELETE | /api/training/job/{job_id} | Delete a job record |
| GET | /api/experiments | List filesystem experiments |
| GET | /api/system/info | GPU and system info |
| GET | /api/optimizers | List available optimizers |
| WS | /ws/training/{job_id} | Real-time training metrics stream |
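A client of the /ws/training/{job_id} stream just decodes JSON frames into metric updates. The field names below are hypothetical illustrations — check the backend's actual payload before relying on them:

```python
import json

def parse_metrics(frame: str) -> dict:
    """Decode one WebSocket frame into a flat metrics dict (assumed fields)."""
    msg = json.loads(frame)
    return {
        "epoch": int(msg.get("epoch", 0)),
        "loss_gen": float(msg.get("loss_gen", 0.0)),
        "loss_disc": float(msg.get("loss_disc", 0.0)),
        "lr": float(msg.get("lr", 0.0)),
    }
```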
🎯 20 Optimizers with Gradient Centralization
All custom optimizers support:

- torch._foreach acceleration for fast vectorized operations
- Optional Gradient Centralization (GC) for improved GAN training stability
- Decoupled weight decay following Loshchilov & Hutter (2019)
- Both single-tensor and foreach step implementations
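Gradient Centralization itself is a one-line transform: subtract the per-filter mean from any gradient of rank greater than 1. A numpy sketch of the idea (the in-repo versions operate on torch tensors via torch._foreach):

```python
import numpy as np

def centralize_gradient(grad: np.ndarray) -> np.ndarray:
    """Subtract the mean over all axes except the first (output) axis.
    Biases and other 1-D parameters are left untouched."""
    if grad.ndim <= 1:
        return grad
    axes = tuple(range(1, grad.ndim))
    return grad - grad.mean(axis=axes, keepdims=True)
```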
| Optimizer | Description | LR Range | Key Feature |
| --- | --- | --- | --- |
| AdamW | Adam + decoupled weight decay + GC | 1e-4 to 3e-4 | Custom impl with GC |
| Adam | Classic adaptive optimizer + GC | 1e-4 to 3e-4 | Fast convergence |
| AMSGrad | Adam with max variance tracking + GC | 1e-4 to 3e-4 | Prevents oscillations |
| RAdam | Rectified Adam + GC | 1e-4 to 3e-4 | Stable early training |
| AdaBelief | Belief-based adaptive LR + GC | 1e-4 to 3e-4 | Better generalization |
| AdaBeliefV2 | AdaBelief + AMSGrad | 1e-4 to 3e-4 | Very stable, long training |
| Adafactor | Factored moments, memory-efficient | Auto (relative step) | Lowest VRAM usage |
| NovoGrad | Normalized gradient, per-layer LR | 1e-4 to 3e-4 | Naturally per-layer adaptive |
| LAMB | Layer-wise Adaptive Moments | 1e-4 to 3e-4 | Large-batch training |
| DAdaptAdam | D-Adaptation for automatic LR | Auto (set lr=1.0) | No LR tuning needed |
| Optimizer | Description | LR Range | Key Feature |
| --- | --- | --- | --- |
| Lion | Evolved Sign Momentum + GC | 1e-5 to 5e-5 | Only stores momentum |
| SignSGD | Sign of momentum + GC | 1e-5 to 5e-5 | Ultra memory-efficient |
| Optimizer | Description | LR Range | Key Feature |
| --- | --- | --- | --- |
| Sophia | Second-order clipping (Sophia-G) + GC | 5e-5 to 2e-4 | Curvature-aware |
| CAME | Clipped Absolute Moment Estimation + GC | 5e-4 to 1e-3 | Dual variance estimates |
| Apollo | Curvature-aware near-optimal + GC | 1e-3 to 1e-2 | Approx. second-order |
| Optimizer | Description | LR Range | Key Feature |
| --- | --- | --- | --- |
| AdamP | Adam with perturbation projection + GC | 1e-4 to 3e-4 | Anti-filter-noise |
| Ranger | RAdam + Lookahead + GC | 1e-4 to 3e-4 | Best generalization |
| SGD | Nesterov momentum + GC | 1e-3 to 1e-2 | Strong regularization |
| Lookahead | Wrapper for any base optimizer | N/A | Enhances any optimizer |
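The Lookahead wrapper is simple enough to sketch: after every k fast-optimizer steps, the slow weights move a fraction alpha toward the fast weights, and the fast weights are reset to them. An illustrative numpy version (not the repo's implementation):

```python
import numpy as np

class Lookahead:
    """Lookahead update rule (Zhang et al., 2019), weights-only sketch."""

    def __init__(self, params: np.ndarray, k: int = 5, alpha: float = 0.5):
        self.slow = params.copy()
        self.k, self.alpha, self.step_count = k, alpha, 0

    def after_fast_step(self, fast: np.ndarray) -> np.ndarray:
        """Call after each inner-optimizer step; returns the (possibly synced) weights."""
        self.step_count += 1
        if self.step_count % self.k == 0:
            self.slow += self.alpha * (fast - self.slow)  # slow chases fast
            fast = self.slow.copy()                       # fast resets to slow
        return fast
```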
| Optimizer | Description | LR Range | Key Feature |
| --- | --- | --- | --- |
| Prodigy | Automatic LR via D-Adaptation + GC | Auto (set lr=1.0) | Zero-tuning |
| DAdaptAdam | D-Adaptation for Adam | Auto (set lr=1.0) | Self-adjusting |
| Use Case | Optimizer | Why |
| --- | --- | --- |
| Default / General | AdamW or Ranger | Best overall for RVC |
| Low VRAM | Adafactor | Factored moments, least memory |
| Best Quality | Sophia or CAME | Fast convergence, stable |
| No LR Tuning | Prodigy or DAdaptAdam | Auto-finds optimal LR |
| Large Batch | LAMB | Trust ratio prevents divergence |
| Fast Training | Lion or SignSGD | Minimal memory, fast per-step |
| GAN Stability | Ranger or AdamW + GC | Lookahead + GC |
| Quick Test | SGD Nesterov | Simple, strong regularization |
| Optimizer | Speed | Quality | Memory | Stability |
| --- | --- | --- | --- | --- |
| AdamW | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐ |
| Adam | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐ |
| AMSGrad | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐ |
| RAdam | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐ |
| Ranger | ⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
| AdaBelief | ⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐ |
| AdaBeliefV2 | ⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐ |
| Adafactor | ⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ |
| Apollo | ⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ |
| CAME | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐⭐ |
| DAdaptAdam | ⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐ |
| LAMB | ⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐ |
| NovoGrad | ⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ |
| Prodigy | ⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐ |
| Lion | ⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ |
| SignSGD | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ |
| Sophia | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐ |
1. Prepare Data — Collect clean audio files (WAV, 32kHz+). A minimum of 10 minutes of speech is recommended.
2. Preprocess — Slice audio, extract features, and build the filelist, via the command line or the WebUI.
3. Configure — Set parameters in the Training Config tab. Choose from 20 optimizers; set epochs, batch size, sample rate, and vocoder.
4. Train — Click Start Training. The backend launches training as a subprocess and streams metrics via WebSocket.
5. Monitor — Watch real-time loss curves, mel similarity, gradient norms, and learning rate in the Monitor tab.
6. Export — Download the trained model weights for inference with RVC.
- Use clean audio without background noise
- Minimum 10 minutes of speech recommended
- Consistent volume levels across samples
- Remove silence and breaths for best results
- Start with 100 epochs for quick testing
- Use 300+ epochs for production quality
- Monitor mel similarity (target: 70%+)
- Save checkpoints regularly (every 25 epochs by default)
| VRAM | Batch Size | Recommended Optimizer |
| --- | --- | --- |
| 4 GB | 2-4 | Adafactor or SignSGD |
| 8 GB | 4-8 | AdamW or Lion |
| 12 GB | 8-16 | Any optimizer |
| 16+ GB | 16-32 | Sophia or CAME |
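The table above reduces to a small lookup if you want to pick a safe starting point programmatically. This is a hypothetical helper using the lower bound of each recommended range, not something shipped with the project:

```python
def suggest_batch_size(vram_gb: float) -> int:
    """Conservative batch size per the VRAM table (lower bound of each range)."""
    if vram_gb < 8:
        return 2
    if vram_gb < 12:
        return 4
    if vram_gb < 16:
        return 8
    return 16
```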
- Prodigy / DAdaptAdam: set lr=1.0; the optimizer auto-adjusts
- Lion / SignSGD: use a lower LR than Adam (typically 10x lower)
- Sophia: an update period of 2-3 steps works best
- Ranger: a good default choice, no tuning needed
- Adafactor: uses relative_step=True for automatic LR
- CAME: a higher LR (10x base) works best due to clipping
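Adafactor's relative_step mode replaces a fixed LR with a schedule derived from the step count and parameter scale. The standard formula from Shazeer & Stern (2018), which the repo's Adafactor may follow (sketch, not the verified implementation):

```python
import math

def adafactor_relative_lr(step: int, param_rms: float, eps2: float = 1e-3) -> float:
    """Relative step size: rho_t = min(1e-2, 1/sqrt(t)), scaled by max(eps2, RMS(param))."""
    rho = min(1e-2, 1.0 / math.sqrt(step))
    return max(eps2, param_rms) * rho
```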
| Component | Technology |
| --- | --- |
| Frontend Framework | Next.js 16, React 19, TypeScript 5 |
| Styling | Tailwind CSS 4, shadcn/ui (New York) |
| Charts | Recharts |
| Animations | Framer Motion |
| State Management | Zustand, React Query (TanStack Query) |
| WebSocket Bridge | Bun, native WebSocket + ws library |
| Python Backend | FastAPI, Uvicorn, PyTorch 2.0+ |
| ML Training | PyTorch DDP, TensorBoard |
- PolTrain — Base project
- RVC — Voice conversion technology
- PyTorch — Deep learning framework
- Next.js — React framework
- FastAPI — Python web framework
- AdamW (Loshchilov & Hutter, 2019)
- Lion (Chen et al., 2023)
- Sophia (Liu et al., 2023)
- RAdam (Liu et al., 2020)
- Ranger (Less Wright, 2020)
- AdaBelief (Zhuang et al., 2020)
- Lookahead (Zhang et al., 2019)
- Prodigy (Mishchenko & Defazio, 2023)
- D-Adaptation (Defazio & Mishchenko, 2023)
- CAME (Luo et al., 2023)
- Apollo (Shi et al., 2022)
- LAMB (You et al., 2019)
- NovoGrad (Ginsburg et al., 2019)
- SignSGD (Bernstein et al., 2018)
Same license as the original PolTrain project.
Happy Training! 🎤