State-of-the-art forensic analysis for video content verification
VeriStream is an advanced video forensic analysis platform that assesses video authenticity across eight distinct AI-powered analysis vectors. It combines computer vision, signal processing, biometric analysis, and semantic reasoning to detect deepfakes, splicing, and other manipulations.
In an era of AI-generated content and sophisticated video manipulation, verifying media authenticity is critical:
- Misinformation Prevention: Combat fake news and deepfakes in journalism and social media
- Legal Evidence: Verify video evidence integrity for court proceedings
- Content Authentication: Validate user-generated content for platforms
- Corporate Security: Detect manipulated videos in corporate communications
- Digital Forensics: Investigate potential video tampering in forensic cases
✅ Deepfake Detection - ViT-based AI model
✅ Editorial Integrity - Cut detection & AV sync analysis
✅ Semantic Accuracy - Whisper transcription + CLIP + LLM reasoning
✅ Provenance Verification - Metadata analysis & C2PA validation
✅ Biological Consistency - Blink rate & rPPG heart rate analysis
✅ Physical Laws - RAFT optical flow & lighting analysis
✅ Acoustic Physics - YAMNet environmental sound classification
✅ Adversarial Defense - JPEG artifacts & frequency analysis
Technology: Vision Transformer (ViT) deep learning model
What it detects: AI-generated faces, synthetic artifacts, GAN fingerprints
Use Cases:
- Detecting face-swap deepfakes in political videos
- Identifying AI-generated synthetic media
- Validating authentic source footage
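A ViT classifier does not look at the frame as a whole: it first slices each frame into fixed-size patches and flattens every patch into a token vector, then attends over those tokens. The sketch below shows only that first tokenization stage (the trained classifier itself is not reproduced here):

```python
import numpy as np

# Sketch of the first stage of a Vision Transformer: a frame is split into
# fixed-size patches, each flattened into a token vector. The trained deepfake
# classifier (not shown) attends over these tokens to spot synthetic artifacts.
def image_to_patches(frame: np.ndarray, patch: int = 16) -> np.ndarray:
    h, w, c = frame.shape
    assert h % patch == 0 and w % patch == 0, "frame must tile evenly"
    rows, cols = h // patch, w // patch
    tokens = (frame.reshape(rows, patch, cols, patch, c)
                   .transpose(0, 2, 1, 3, 4)
                   .reshape(rows * cols, patch * patch * c))
    return tokens  # one row per patch token

frame = np.zeros((224, 224, 3), dtype=np.uint8)
tokens = image_to_patches(frame)
print(tokens.shape)  # (196, 768) -- 14x14 patches of 16x16x3 values
```

The 224x224 input and 16-pixel patch size match the standard ViT-Base configuration; the checkpoint VeriStream actually loads may differ.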
Technology: PySceneDetect, MediaPipe Face Mesh, Audio-Video Correlation
What it detects: Scene cuts, splicing, audio-video desynchronization
Use Cases:
- Identifying edited interview clips taken out of context
- Detecting spliced footage in news reports
- Verifying continuous recording in surveillance videos
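The core idea behind content-based cut detection is simple: consecutive frames within a shot differ only slightly, while a cut produces a large jump. This toy detector flags frame pairs whose mean pixel difference exceeds a threshold; PySceneDetect's ContentDetector refines the same idea with HSV statistics and adaptive thresholds:

```python
import numpy as np

# Toy content-based cut detector: flag frame pairs whose mean absolute
# pixel difference exceeds a threshold. This is the core idea behind
# PySceneDetect's ContentDetector, which uses smarter HSV-based statistics.
def detect_cuts(frames: list[np.ndarray], threshold: float = 30.0) -> list[int]:
    cuts = []
    for i in range(1, len(frames)):
        diff = np.abs(frames[i].astype(float) - frames[i - 1].astype(float)).mean()
        if diff > threshold:
            cuts.append(i)  # a cut occurs at frame i
    return cuts

dark = [np.full((8, 8), 10.0)] * 3
bright = [np.full((8, 8), 200.0)] * 3
print(detect_cuts(dark + bright))  # [3]: one cut, at the dark-to-bright jump
```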
Technology: Whisper (speech-to-text), CLIP (vision-language), Ollama LLM
What it detects: Audio-visual mismatches, narrative inconsistencies
Use Cases:
- Detecting dubbed or voice-swapped content
- Identifying mismatched audio overlays
- Validating that visuals match spoken content
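CLIP makes this check possible by embedding frames and transcript sentences into a shared vector space, where cosine similarity measures whether the visuals match the spoken narrative. The miniature below uses made-up 3-d vectors purely for illustration; real CLIP embeddings are 512-d or larger:

```python
import math

# CLIP-style semantic check in miniature: cosine similarity between a frame
# embedding and candidate transcript embeddings. The vectors here are
# fabricated for illustration only.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

frame_emb = [0.9, 0.1, 0.3]          # hypothetical frame embedding
matching_text = [0.8, 0.2, 0.4]      # transcript that describes the frame
mismatched_text = [-0.7, 0.6, -0.2]  # unrelated narration

print(cosine(frame_emb, matching_text) > cosine(frame_emb, mismatched_text))  # True
```

A low frame-to-transcript similarity across many frames is evidence of dubbed or voice-swapped audio.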
Technology: FFprobe, ExifTool, C2PA Cryptographic Verification
What it detects: Missing metadata, timestamp tampering, broken chain of custody
Use Cases:
- Verifying camera origin and capture device
- Detecting re-encoded or re-compressed videos
- Validating Content Credentials (C2PA) for professional content
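As an illustration of the kind of rule this vector applies, the sketch below runs two checks over an ffprobe-style metadata dict: a missing creation timestamp and FFmpeg's muxer signature in the encoder tag (camera originals normally carry a vendor encoder string). The exact tag names a given camera writes vary, so treat these keys as assumptions:

```python
# Illustrative provenance checks over an ffprobe-style format-tags dict.
# Which tags a real camera writes varies by vendor -- the exact keys and
# thresholds here are assumptions for the sketch.
def provenance_flags(fmt_tags: dict) -> list[str]:
    flags = []
    if "creation_time" not in fmt_tags:
        flags.append("missing creation_time (possible strip/re-encode)")
    encoder = fmt_tags.get("encoder", "")
    if "lavf" in encoder.lower():  # FFmpeg's libavformat muxer signature
        flags.append("re-muxed with FFmpeg (not camera-original)")
    return flags

# A file re-muxed by FFmpeg with its timestamps stripped trips both checks:
print(provenance_flags({"encoder": "Lavf58.76.100"}))
```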
Technology: MediaPipe landmarks, Eye Aspect Ratio (EAR), CHROM rPPG
What it detects: Unnatural blink rates, synthetic pulse patterns
Use Cases:
- Detecting face-replacement deepfakes
- Identifying AI-generated human faces
- Validating physiological realism
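The Eye Aspect Ratio is the standard blink signal (Soukupova & Cech, 2016): from the six eye landmarks p1..p6, EAR = (|p2-p6| + |p3-p5|) / (2|p1-p4|). It stays roughly constant while the eye is open and collapses during a blink; deepfakes often blink too rarely or too regularly. A minimal version with synthetic landmark coordinates:

```python
import math

# Eye Aspect Ratio from the six eye landmarks p1..p6 (Soukupova & Cech, 2016):
#   EAR = (|p2 - p6| + |p3 - p5|) / (2 * |p1 - p4|)
# The two vertical distances shrink when the eyelid closes, so EAR collapses
# during a blink. Landmark coordinates below are synthetic examples.
def ear(p1, p2, p3, p4, p5, p6):
    d = math.dist
    return (d(p2, p6) + d(p3, p5)) / (2.0 * d(p1, p4))

open_eye = [(0, 0), (1, 2), (2, 2), (3, 0), (2, -2), (1, -2)]
closed_eye = [(0, 0), (1, 0.1), (2, 0.1), (3, 0), (2, -0.1), (1, -0.1)]
print(ear(*open_eye) > ear(*closed_eye))  # True: EAR collapses when the eye closes
```

In practice the landmarks come from MediaPipe Face Mesh, and a time series of EAR values is thresholded to count blinks per minute.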
Technology: RAFT Optical Flow, Depth Estimation, Lighting Analysis
What it detects: Impossible physics, lighting inconsistencies, motion artifacts
Use Cases:
- Detecting object insertion/removal
- Identifying unrealistic motion patterns
- Verifying consistent environmental lighting
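RAFT estimates dense per-pixel flow, which is beyond a short sketch, but the underlying motion-consistency idea can be shown with phase correlation: the normalized FFT cross-power spectrum of two frames peaks at their relative displacement, recovering a single global translation:

```python
import numpy as np

# Simplified global-motion estimate via phase correlation: the normalized
# cross-power spectrum of two frames peaks at their relative shift. RAFT
# produces dense per-pixel flow; this toy recovers one global translation,
# enough to illustrate motion-consistency checks.
def phase_correlation(f1: np.ndarray, f2: np.ndarray) -> tuple[int, int]:
    F1, F2 = np.fft.fft2(f1), np.fft.fft2(f2)
    cross = F1 * np.conj(F2)
    cross /= np.abs(cross) + 1e-12          # keep phase, discard magnitude
    peak = np.abs(np.fft.ifft2(cross))      # impulse at the displacement
    dy, dx = np.unravel_index(np.argmax(peak), peak.shape)
    return int(dy), int(dx)

rng = np.random.default_rng(0)
frame = rng.random((64, 64))
shifted = np.roll(frame, shift=(5, 3), axis=(0, 1))
print(phase_correlation(shifted, frame))  # (5, 3): the shift between frames
```

An inserted object whose local motion disagrees with the estimated global motion is exactly the kind of anomaly this vector looks for.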
Technology: YAMNet (environmental sound classification), Spectral Analysis
What it detects: Background sound inconsistencies, audio splicing
Use Cases:
- Detecting acoustic environment changes (indoor/outdoor jumps)
- Identifying replaced audio segments
- Validating continuous acoustic signatures
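YAMNet classifies the ambient soundscape per segment; a much simpler stand-in for the same splice-detection idea is to track per-second RMS energy and flag abrupt jumps, as when indoor audio is cut against outdoor audio. The sketch below uses a synthetic tone in place of a real decoded audio track:

```python
import numpy as np

# Simplified acoustic-splice probe: compute per-second RMS energy and flag
# boundaries where it jumps by more than a given ratio. YAMNet does the real
# work with sound-class labels; this stand-in only catches loudness jumps.
def rms_jumps(samples: np.ndarray, sr: int, ratio: float = 3.0) -> list[int]:
    seconds = len(samples) // sr
    rms = [np.sqrt(np.mean(samples[i * sr:(i + 1) * sr] ** 2))
           for i in range(seconds)]
    return [i for i in range(1, seconds)
            if max(rms[i], rms[i - 1]) / (min(rms[i], rms[i - 1]) + 1e-12) > ratio]

sr = 1000  # toy sample rate; real audio would use e.g. 16 kHz
quiet = 0.01 * np.sin(2 * np.pi * 440 * np.arange(2 * sr) / sr)
loud = 0.5 * np.sin(2 * np.pi * 440 * np.arange(2 * sr) / sr)
print(rms_jumps(np.concatenate([quiet, loud]), sr))  # [2]: jump at second 2
```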
Technology: JPEG artifact analysis, FFT frequency domain, Re-compression tests
What it detects: Anti-forensic tampering attempts
Use Cases:
- Detecting adversarial noise injection
- Identifying compression-based obfuscation
- Uncovering sophisticated tampering attempts
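Natural images concentrate spectral energy at low frequencies, while injected adversarial noise inflates the high-frequency band. The sketch below measures the fraction of energy outside a low-pass disc in the 2-D FFT; the 0.25 cutoff is an illustrative choice, not VeriStream's actual threshold:

```python
import numpy as np

# Frequency-domain tamper probe: fraction of spectral energy outside a
# low-pass disc. Natural images are low-frequency-heavy; injected noise
# pushes energy outward. The cutoff of 0.25 is an illustrative choice.
def high_freq_ratio(img: np.ndarray, cutoff: float = 0.25) -> float:
    spec = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot(yy - h / 2, xx - w / 2)           # distance from DC
    low = spec[r <= cutoff * min(h, w)].sum()
    return float(1.0 - low / spec.sum())

rng = np.random.default_rng(1)
ax = np.linspace(0, 1, 64, endpoint=False)
x, y = np.meshgrid(ax, ax)
smooth = np.sin(2 * np.pi * x) + np.cos(2 * np.pi * y)  # low-frequency content
noisy = smooth + 0.5 * rng.standard_normal((64, 64))     # injected noise
print(high_freq_ratio(smooth) < high_freq_ratio(noisy))  # True
```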
- Docker (v20.10+)
- Docker Compose (v2.0+)
- 8GB+ RAM recommended for model loading
- (Optional) NVIDIA GPU with Docker GPU support for Ollama LLM
```bash
git clone https://github.com/yourusername/VeriStream.git
cd VeriStream

# Build and start all services
docker-compose up --build -d

# Check service status
docker-compose ps

# View logs
docker-compose logs -f
```

- Frontend: http://localhost
- Backend API: http://localhost:8000
- API Docs: http://localhost:8000/docs
- Open http://localhost in your browser
- Click "Upload Video" or drag & drop
- Wait for analysis (typically 40-80 seconds)
- View detailed forensic report and download verified video
The application consists of three services:

Backend
- Port: 8000
- Tech: FastAPI, PyTorch, MediaPipe, Whisper, CLIP
- Volumes:
  - backend_temp - temporary processing files
  - backend_weights - downloaded model weights
  - backend_database - analysis history database

Frontend
- Port: 80
- Tech: React 18, Vite, TailwindCSS
- Features: Real-time progress, responsive UI, PDF export

Ollama LLM (optional)
- Port: 11434
- Requirements: NVIDIA GPU recommended
- Enable: Uncomment the ollama service in docker-compose.yml
Create a .env file in the root directory (optional):

```bash
# Ollama LLM endpoint (if running separately)
OLLAMA_BASE_URL=http://ollama:11434

# Backend development mode
RELOAD=true
```

For local development with hot-reload:
Backend:

```bash
cd veristream_backend
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt
uvicorn main:app --reload --host 0.0.0.0 --port 8000
```

Frontend:

```bash
cd veristream_frontend
npm install
npm run dev
```

┌──────────────────────────────────────────────────────────────────────┐
│ VeriStream Platform │
├──────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ ┌───────────────┐ ┌─────────────┐ │
│ │ Frontend │────────▶│ Backend │◀──────▶│ Models │ │
│ │ React + Nginx│ │ FastAPI │ │ Weights │ │
│ │ Port 80 │ │ Port 8000 │ │ Cache │ │
│ └──────────────┘ └───────┬───────┘ └─────────────┘ │
│ │ │
│ ┌────────────────────────────────┴────────────────────────────-┐ │
│ │ 8 Forensic Analysis Vectors │ │
│ ├──────────────────────────────────────────────────────────────┤ │
│ │ A. Deepfake Detection (ViT Model) │ │
│ │ B. Editorial Integrity (PySceneDetect, AV Sync) │ │
│ │ C. Semantic Accuracy (Whisper, CLIP, Ollama) │ │
│ │ D. Provenance & Metadata (C2PA Verification) │ │
│ │ E. Biological Consistency (rPPG, Blink Analysis) │ │
│ │ F. Physical Laws (RAFT Optical Flow) │ │
│ │ G. Acoustic Physics (YAMNet) │ │
│ │ H. Adversarial Defense (FFT, JPEG Artifacts) │ │
│ └──────────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ Persistent Volumes │ │
│ ├──────────────────────────────────────────────────────────────┤ │
│ │ • backend_temp (Temporary processing files) │ │
│ │ • backend_weights (Downloaded model cache) │ │
│ │ • backend_database (Analysis history database) │ │
│ └──────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────────┘
- Backend Flow - Step-by-step execution flow
- Forensic Logic - Detailed scoring algorithm
- API Documentation - Interactive Swagger UI
```bash
# Check logs
docker-compose logs backend
docker-compose logs frontend

# Rebuild from scratch
docker-compose down -v
docker-compose up --build
```

Increase Docker memory allocation:
- Docker Desktop: Settings → Resources → Memory (set to 8GB+)
Models are downloaded on first use. Ensure:
- Internet connection is active
- HuggingFace is accessible
- Sufficient disk space (5GB+ for models)
```bash
# Install and pull model
docker exec -it veristream-ollama ollama pull llama3.1:8b

# Verify
curl http://localhost:11434/api/tags
```

- Verify the backend is healthy: docker-compose ps
- Check backend logs: docker-compose logs backend
- Ensure the nginx proxy is configured correctly
| Module | Typical Duration | Parallelizable |
|---|---|---|
| Deepfake Detection | 2-5s | ✅ |
| Editorial Integrity | 5-10s | Partial |
| Semantic Accuracy | 10-20s | ✅ |
| Provenance | 1-2s | ✅ |
| Biological | 8-15s | ✅ |
| Physical Laws | 5-10s | ✅ |
| Acoustic Physics | 8-12s | ✅ |
| Adversarial Defense | 3-5s | ✅ |
Total: ~40-80 seconds for a 30-second video (sequential)
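The quoted total can be sanity-checked by summing the per-module ranges from the table above:

```python
# Back-of-the-envelope check of the quoted total: summing the per-module
# ranges from the performance table reproduces the ~40-80 s sequential
# estimate for a 30-second video.
modules = {
    "Deepfake Detection": (2, 5),
    "Editorial Integrity": (5, 10),
    "Semantic Accuracy": (10, 20),
    "Provenance": (1, 2),
    "Biological": (8, 15),
    "Physical Laws": (5, 10),
    "Acoustic Physics": (8, 12),
    "Adversarial Defense": (3, 5),
}
seq_min = sum(lo for lo, hi in modules.values())
seq_max = sum(hi for lo, hi in modules.values())
print(f"sequential: {seq_min}-{seq_max} s")  # sequential: 42-79 s
```

Running the parallelizable modules concurrently would cut the wall-clock time well below the sequential sum, bounded by the slowest module.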
Made with ❤️ for video authenticity verification
⭐ Star us on GitHub if this project helped you!



