A screen capture and translation tool built with React, Vite, and Tailwind CSS v4. Uses Windows OCR for text extraction and Ollama for translation.
```
┌─────────────┐      ┌──────────────┐      ┌─────────────┐
│   Browser   │─────▶│   Express    │─────▶│   Windows   │
│  (React UI) │◀─────│    Server    │◀─────│   OCR API   │
└─────────────┘      └──────────────┘      └─────────────┘
                            │
                            ▼
                     ┌──────────────┐
                     │    Ollama    │
                     │ (Translation)│
                     └──────────────┘
```
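The request flow above can be sketched as a small pipeline. The helper below is a simplified, synchronous sketch (the real server calls are asynchronous, and `ocr`/`translate` are hypothetical stand-ins for the PowerShell OCR script and the Ollama client):

```javascript
// Simplified sketch of the server-side pipeline behind
// /api/extract-and-translate. `ocr` and `translate` are injected so the
// OCR engine and translation backend stay swappable.
function extractAndTranslate(image, { ocr, translate }) {
  const extractedText = ocr(image);             // Windows OCR step
  const translation = translate(extractedText); // Ollama step
  return { extractedText, translation };
}

// Example with stub implementations:
const result = extractAndTranslate("fake-image-data", {
  ocr: () => "Hallo Welt",
  translate: (text) => (text === "Hallo Welt" ? "Hello World" : text),
});
// result: { extractedText: "Hallo Welt", translation: "Hello World" }
```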
- 🖥️ Screen Sharing - Share your screen and capture frames
- 🔤 Windows OCR - Extract text using native Windows OCR (requires Windows 10/11)
- 🌐 Translation - Translate extracted text using local Ollama models
- 🌙 Dark Mode - Toggle between light and dark themes
- ⌨️ Keyboard Shortcuts - Quick actions with hotkeys
- 📸 Capture Gallery - View all captured frames with translations
- Node.js 18+
- Windows 10/11 with OCR language pack installed
- Ollama running locally with a text model (e.g., `gemma3:4b`)
- Open Windows Settings → Time & Language → Language & Region
- Add a language (e.g., German, French)
- Click the language → Language options → Install "Optical Character Recognition"
```bash
# Clone the repository
git clone https://github.com/yourusername/screen-translator.git
cd screen-translator

# Install dependencies
npm install
```

Start both servers in development mode:

```bash
npm run dev
```

This starts:
- Express server on http://localhost:3001
- Vite dev server on http://localhost:5173
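If you are wondering how one command starts both processes: a plausible `scripts` layout is shown below. This is an assumption for illustration (using `concurrently`), not verified against this repo's `package.json`:

```json
{
  "scripts": {
    "server": "node server/index.js",
    "client": "vite",
    "dev": "concurrently \"npm run server\" \"npm run client\"",
    "build": "vite build"
  }
}
```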
Terminal 1 - Server:

```bash
npm run server
```

Terminal 2 - Client:

```bash
npm run client
```

Make sure Ollama is running locally on port 11434. Install a text model:

```bash
ollama pull gemma3:4b
```

Or use a dedicated translation model:

```bash
ollama pull translate-gemma
```

The Express server provides these endpoints:
| Endpoint | Method | Description |
|---|---|---|
| `/api/health` | GET | Health check |
| `/api/ocr` | POST | Extract text from an image using Windows OCR |
| `/api/translate` | POST | Translate text using Ollama |
| `/api/extract-and-translate` | POST | OCR + translate in one call |
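From JavaScript, the request body uses the same fields shown in the curl example below. A small hypothetical helper for building it (the helper name is not part of the project):

```javascript
// Hypothetical helper: builds the JSON body expected by
// POST /api/extract-and-translate (field names taken from the curl example).
function buildExtractAndTranslateBody(imageDataUrl, sourceLang, targetLang) {
  return JSON.stringify({
    image: imageDataUrl, // data URL of the captured frame
    sourceLang,          // language to OCR, e.g. "German"
    targetLang,          // language to translate into, e.g. "English"
  });
}

const body = buildExtractAndTranslateBody(
  "data:image/png;base64,iVBORw0...",
  "German",
  "English"
);
```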
```bash
curl -X POST http://localhost:3001/api/extract-and-translate \
  -H "Content-Type: application/json" \
  -d '{
    "image": "data:image/png;base64,iVBORw0...",
    "sourceLang": "German",
    "targetLang": "English"
  }'
```

Response:

```json
{
  "extractedText": "Hallo Welt",
  "translation": "Hello World",
  "rawOutput": "Hello World"
}
```

| Shortcut | Action |
|---|---|
| `Shift + Enter` | Start/stop screen sharing |
| `Alt + S` | Swap source/target languages |
| `Ctrl + L` | Clear all captures |
| `Alt + C` | Copy selected translation |
| `Ctrl + Alt + S` | Capture current frame |
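One way these shortcuts could be dispatched client-side is a small matcher over keyboard events. This is a sketch with hypothetical action names; the project's actual hook may be wired differently:

```javascript
// Hypothetical shortcut dispatcher mirroring the table above.
// Takes a KeyboardEvent-like object and returns an action name or null.
// Note: Ctrl+Alt+S must be checked before Alt+S, or the more specific
// combination would never match.
function matchShortcut(e) {
  if (e.shiftKey && e.key === "Enter") return "toggle-sharing";
  if (e.ctrlKey && e.altKey && e.key.toLowerCase() === "s") return "capture-frame";
  if (e.altKey && e.key.toLowerCase() === "s") return "swap-languages";
  if (e.ctrlKey && e.key.toLowerCase() === "l") return "clear-captures";
  if (e.altKey && e.key.toLowerCase() === "c") return "copy-translation";
  return null;
}
```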
```bash
npm run build
```

The built files will be in the `dist/` directory.
```
screen-translator/
├── server/              # Express server
│   ├── index.js         # Main server file
│   ├── ocr.ps1          # Windows OCR PowerShell script
│   └── WindowsOcr.cs    # C# OCR reference
├── src/                 # React frontend
│   ├── components/      # React components
│   ├── hooks/           # Custom hooks
│   └── utils/           # Utilities
├── package.json
├── vite.config.js
└── README.md
```
- Frontend: React, Vite, Tailwind CSS v4
- Backend: Express.js, Node.js
- OCR: Windows.Media.Ocr (Windows 10/11 native API)
- Translation: Ollama (local AI)
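Under the hood, translation requests to Ollama typically go to its `/api/generate` endpoint. A sketch of how the server might build that request (the prompt wording and default model here are assumptions, not the project's actual prompt):

```javascript
// Hypothetical builder for an Ollama /api/generate request body.
function buildOllamaRequest(text, sourceLang, targetLang, model = "gemma3:4b") {
  return {
    model,
    prompt:
      `Translate the following text from ${sourceLang} to ${targetLang}. ` +
      `Reply with only the translation:\n\n${text}`,
    stream: false, // ask Ollama for a single JSON response, not a stream
  };
}

const req = buildOllamaRequest("Hallo Welt", "German", "English");
```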
Install an OCR language pack in Windows Settings:
- Settings → Time & Language → Language & Region
- Add the source language you want to OCR
- Install "Optical Character Recognition" for that language
Make sure Ollama is running:
```bash
ollama serve
```

Or start Ollama from the application menu.
Run PowerShell as Administrator and set the execution policy:

```powershell
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
```

MIT