A serverless platform for fine-tuning Large Language Models using LoRA/QLoRA on Google Cloud Run with GPU acceleration.
- Serverless GPU Training: Fine-tune Gemma models with Cloud Run Jobs + L4 GPUs
- QLoRA Optimization: 4-bit quantization reduces memory usage by 75%
- REST API: Easy-to-use API for job submission and monitoring
- Automatic Scaling: Pay only for compute time, scale to zero when idle
- Model Storage: Trained adapters saved to Cloud Storage
Training Infrastructure:
- GPU: NVIDIA L4 (24GB VRAM)
- CPU: 8 vCPUs
- Memory: 32GB RAM
- Region: europe-west1
ML Stack:
- Framework: PyTorch 2.1.2
- Fine-tuning: PEFT (LoRA)
- Quantization: bitsandbytes (4-bit)
- Base Model: google/gemma-2b-it
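The quantization and adapter settings in this stack boil down to two configuration objects. The sketch below shows plausible values (rank 16 matches the `lora_rank` in the example training request), but specifics such as `lora_alpha` and `target_modules` are illustrative assumptions, not necessarily the exact settings in `train_lora.py`:

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization via bitsandbytes (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# LoRA adapter configuration; target_modules are assumed to be the
# attention projections, a common choice for Gemma-style models.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```

`bnb_config` is passed to the model loader and `lora_config` to PEFT's `get_peft_model`, so only the small adapter matrices are trained while the 4-bit base weights stay frozen.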
Performance:
- Training Time: ~2 minutes (100 samples, 1 epoch)
- Cost: ~$0.67/hour of GPU usage
- Memory Efficiency: 75% reduction with QLoRA
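The 75% figure follows directly from the storage cost of the weights: fp16 uses 16 bits per parameter, while 4-bit quantization uses 4. A back-of-the-envelope check, treating the base model as a nominal 2B parameters and ignoring quantization constants and activations:

```python
# Rough weight-memory estimate for a 2B-parameter model.
n_params = 2_000_000_000

fp16_bytes = n_params * 2    # 16 bits = 2 bytes per parameter
nf4_bytes = n_params * 0.5   # 4 bits = 0.5 bytes per parameter

reduction = 1 - nf4_bytes / fp16_bytes
print(f"fp16:      {fp16_bytes / 1e9:.1f} GB")  # 4.0 GB
print(f"4-bit:     {nf4_bytes / 1e9:.1f} GB")   # 1.0 GB
print(f"reduction: {reduction:.0%}")            # 75%
```

In practice QLoRA adds small overheads (quantization constants, LoRA adapters, optimizer state for the adapters), but the headline reduction on the frozen base weights is exactly 75%.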
- `GET /` — Service information and available endpoints
- `POST /train` — Submit a new training job, with a JSON body such as:

```json
{
  "model_name": "google/gemma-2b-it",
  "lora_rank": 16,
  "num_epochs": 1,
  "learning_rate": 0.0002
}
```

- `GET /status/EXECUTION_ID` — Check training job status
- `GET /models` — List all trained models in Cloud Storage
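The endpoints above can be driven from Python as well as curl. The sketch below uses only the standard library; the API URL is a placeholder for your own deployment, and the `execution_id` field in the response is an assumption about the API's reply shape:

```python
import json
import urllib.request

API_URL = "https://YOUR-API-URL"  # placeholder: your deployed API service URL

def post_json(url: str, payload: dict) -> dict:
    """POST a JSON body and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Training job parameters, matching the request body shown above.
job_request = {
    "model_name": "google/gemma-2b-it",
    "lora_rank": 16,
    "num_epochs": 1,
    "learning_rate": 0.0002,
}

# Uncomment against a live deployment:
# job = post_json(f"{API_URL}/train", job_request)
# with urllib.request.urlopen(f"{API_URL}/status/{job['execution_id']}") as r:
#     status = json.loads(r.read())
# with urllib.request.urlopen(f"{API_URL}/models") as r:
#     models = json.loads(r.read())
```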
Interactive web interface built with Streamlit for:
- Submitting training jobs with custom parameters
- Real-time job status monitoring
- Viewing trained models and their files
- 📊 Platform statistics
- 🎯 Visual job submission form
- 📈 Job status tracking with metrics
- 📦 Model browser with file listings
Services Deployed: 3
- Streamlit Dashboard (UI)
- Flask API (Backend)
- Training Job (GPU Worker)
- Cost per Training Run: ~$1
- Training Time: 2-3 minutes
- Model Size: 35 MB (LoRA adapters only)
- Memory Savings: 75% (QLoRA optimization)
1. Submit Training Job:

```bash
curl -X POST https://YOUR-API-URL/train \
  -H "Content-Type: application/json" \
  -d '{"model_name": "google/gemma-2b-it", "lora_rank": 16}'
```

2. Check Status:

```bash
curl https://YOUR-API-URL/status/EXECUTION_ID
```

3. List Models:

```bash
curl https://YOUR-API-URL/models
```

```
lora-platform/
├── training/
│   ├── Dockerfile        # GPU training container
│   ├── train_lora.py     # LoRA fine-tuning script
│   ├── requirements.txt  # Python dependencies
│   └── cloudbuild.yaml   # Build configuration
├── frontend/
│   ├── Dockerfile        # API service container
│   ├── app.py            # Flask REST API
│   └── requirements.txt  # API dependencies
└── README.md
```

- Category: GPU
- Technologies: Cloud Run Jobs, NVIDIA L4 GPU, LoRA, QLoRA, Gemma
- Innovation: Production-ready serverless LLM fine-tuning with cost optimization
- Training (per run): $0.50 - $1.00
- API Service (idle): $0.00 (scales to zero)
- Storage: ~$0.01/month per model
- Total for hackathon: <$10
- ✅ GPU-accelerated training on Cloud Run Jobs
- ✅ 75% memory reduction with QLoRA
- ✅ RESTful API for job management
- ✅ Automatic model storage to Cloud Storage
- ✅ Production-ready error handling
- ✅ Scales to zero when idle
MIT License - Built for Google Cloud Run Hackathon 2025
