CodeTrans — AI-Powered Code Translation

An AI-powered full-stack application that translates source code between programming languages. Paste code (or upload a PDF), pick your source and target languages, and get idiomatic translated output in seconds — powered by any OpenAI-compatible LLM endpoint or a locally running Ollama model.
CodeTrans demonstrates how code-specialized large language models can be used to translate source code between programming languages. It supports six languages — Java, C, C++, Python, Rust, and Go — and works with any OpenAI-compatible inference endpoint or a locally running Ollama instance.
This makes CodeTrans suitable for:
- Enterprise deployments — connect to a GenAI Gateway or any managed LLM API
- Air-gapped environments — run fully offline with Ollama and a locally hosted model
- Local experimentation — quick setup on a laptop with GPU-accelerated inference
- Hardware benchmarking — measure SLM throughput on Apple Silicon, CUDA, or Intel Gaudi hardware
- The user pastes code or uploads a PDF in the browser.
- The React frontend sends the source code and language selection to the FastAPI backend.
- If a PDF was uploaded, a text extraction service pulls the code out of the document.
- The backend constructs a structured prompt and calls the configured LLM endpoint (remote API or local Ollama).
- The LLM returns the translated code, which is displayed in the output panel.
- The user copies the result with one click.
All inference logic is abstracted behind a single INFERENCE_PROVIDER environment variable — switching between providers requires only a .env change and a container restart.
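The switch described above can be sketched as a small config factory. This is a minimal illustration under assumed names — `build_client_config` is not a function in the repo; the real dispatch lives in `services/api_client.py`:

```python
import os

# Minimal sketch of the INFERENCE_PROVIDER switch. All names here are
# illustrative; the actual logic lives in services/api_client.py.
def build_client_config() -> dict:
    provider = os.getenv("INFERENCE_PROVIDER", "remote")
    endpoint = os.getenv("INFERENCE_API_ENDPOINT", "")
    if provider == "ollama":
        # Ollama needs no token and is called via its chat-completions path.
        return {"base_url": endpoint, "api_key": None, "mode": "chat"}
    # Any OpenAI-compatible remote endpoint uses bearer-token auth
    # and the text-completions path.
    return {
        "base_url": endpoint,
        "api_key": os.getenv("INFERENCE_API_TOKEN"),
        "mode": "completions",
    }
```

Because only environment variables feed this decision, flipping providers really is just a `.env` edit plus a container restart.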
The application follows a modular two-service architecture with a React frontend and a FastAPI backend. The backend handles all inference orchestration, PDF extraction, and optional LLM observability tracing. The inference layer is fully pluggable — any OpenAI-compatible remote endpoint or a locally running Ollama instance can be used without any code changes.
```mermaid
graph TB
    subgraph "User Interface (port 3000)"
        A[React Frontend]
        A1[Code Input]
        A2[PDF Upload]
        A3[Language Selection]
    end
    subgraph "FastAPI Backend (port 5001)"
        B[API Server]
        C[PDF Service]
        D[API Client]
    end
    subgraph "Inference - Option A: Remote"
        E[OpenAI / Groq / OpenRouter<br/>Enterprise Gateway]
    end
    subgraph "Inference - Option B: Local"
        F[Ollama on Host<br/>host.docker.internal:11434]
    end
    A1 --> B
    A2 --> B
    A3 --> B
    B --> C
    C -->|Extracted Code| B
    B --> D
    D -->|INFERENCE_PROVIDER=remote| E
    D -->|INFERENCE_PROVIDER=ollama| F
    E -->|Translated Code| D
    F -->|Translated Code| D
    D --> B
    B --> A
    style A fill:#e1f5ff,color:#000
    style B fill:#fff4e1,color:#000
    style E fill:#e1ffe1,color:#000
    style F fill:#f3e5f5,color:#000
```
Frontend (React + Vite)
- Side-by-side code editor with language pill selectors for source and target
- PDF drag-and-drop upload that populates the source panel automatically
- Real-time character counter and live status indicator
- Dark mode (default) with `localStorage` persistence and flash prevention
- One-click copy of translated output
- Nginx serves the production build and proxies all `/api/` requests to the backend
Backend Services
- API Server (`server.py`): FastAPI application with CORS middleware, request validation, and routing
- API Client (`services/api_client.py`): handles both inference paths — text completions for remote endpoints and chat completions for Ollama — with token-based auth support
- PDF Service (`services/pdf_service.py`): extracts code from uploaded PDF files using pattern recognition
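As a rough illustration of the pattern-recognition step the PDF service performs, a line-level heuristic might look like the sketch below. The regex and function name are assumptions for illustration, not the actual rules in `services/pdf_service.py`:

```python
import re

# Crude code-vs-prose classifier: keep lines that start like code in one of
# the supported languages. Purely illustrative heuristics.
CODE_HINTS = re.compile(
    r"^\s*(def |class |import |from |return |#include|public |fn |func |package |\{|\})"
)

def extract_code_lines(page_text: str) -> str:
    """Keep lines that look like code; drop prose paragraphs."""
    kept = [line for line in page_text.splitlines() if CODE_HINTS.search(line)]
    return "\n".join(kept)
```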
External Integration
- Remote inference: Any OpenAI-compatible API (OpenAI, Groq, OpenRouter, GenAI Gateway)
- Local inference: Ollama running natively on the host machine, accessed from the container via `host.docker.internal:11434`
| Service | Container | Host Port | Description |
|---|---|---|---|
| `transpiler-api` | `transpiler-api` | 5001 | FastAPI backend — input validation, PDF extraction, inference orchestration |
| `transpiler-ui` | `transpiler-ui` | 3000 | React frontend — served by Nginx, proxies `/api/` to the backend |
Ollama is intentionally not a Docker service. On macOS (Apple Silicon), running Ollama in Docker bypasses Metal (MPS) GPU acceleration, resulting in CPU-only inference. Ollama must run natively on the host so the backend container can reach it via `host.docker.internal:11434`.
- User enters code or uploads a PDF in the web UI.
- The backend validates the input; PDF text is extracted if needed.
- The backend calls the configured inference endpoint (remote API or Ollama).
- The model returns translated code, which is displayed in the right panel.
- User copies the result with one click.
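The round trip above can also be driven without the UI. Below is a hypothetical request body — the route and field names are assumptions; the authoritative schema is the FastAPI-generated Swagger page at http://localhost:5001/docs:

```python
import json

# Hypothetical payload for the translate endpoint; field names are
# illustrative and may differ from the real Pydantic schemas in api/models.py.
payload = {
    "source_language": "python",
    "target_language": "go",
    "code": "def add(a, b):\n    return a + b",
}

# POST this as JSON to e.g. http://localhost:5001/api/translate
# with Content-Type: application/json.
body = json.dumps(payload).encode("utf-8")
```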
Before you begin, ensure you have the following installed and configured:
- Docker and Docker Compose (v2)
- An inference endpoint — one of:
- A remote OpenAI-compatible API key (OpenAI, Groq, OpenRouter, or enterprise gateway)
- Ollama installed natively on the host machine
```shell
docker --version
docker compose version
docker ps
```

```shell
git clone https://github.com/cld2labs/CodeTrans.git
cd CodeTrans
cp .env.example .env
```

Open `.env` and set `INFERENCE_PROVIDER` plus the corresponding variables for your chosen provider. See LLM Provider Configuration for per-provider instructions.
```shell
# Standard (attached)
docker compose up --build

# Detached (background)
docker compose up -d --build
```

Once containers are running:
- Frontend UI: http://localhost:3000
- Backend API: http://localhost:5001
- API Docs (Swagger): http://localhost:5001/docs
```shell
# Health check
curl http://localhost:5001/health

# View running containers
docker compose ps
```

View logs:

```shell
# All services
docker compose logs -f

# Backend only
docker compose logs -f transpiler-api

# Frontend only
docker compose logs -f transpiler-ui
```

Stop the stack:

```shell
docker compose down
```

Run the backend and frontend directly on the host without Docker.
Backend (Python / FastAPI)

```shell
cd api
python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -r requirements.txt
cp ../.env.example ../.env  # configure your .env at the repo root
uvicorn server:app --reload --port 5001
```

Frontend (Node / Vite)
```shell
cd ui
npm install
npm run dev
```

The Vite dev server proxies `/api/` to http://localhost:5001. Open http://localhost:5173.
```
CodeTrans/
├── api/                      # FastAPI backend
│   ├── config.py             # All environment-driven settings
│   ├── models.py             # Pydantic request/response schemas
│   ├── server.py             # FastAPI app, routes, and middleware
│   ├── services/
│   │   ├── api_client.py     # LLM inference client (remote + Ollama)
│   │   └── pdf_service.py    # PDF text and code extraction
│   ├── Dockerfile
│   └── requirements.txt
├── ui/                       # React frontend
│   ├── src/
│   │   ├── App.jsx
│   │   ├── components/
│   │   │   ├── CodeTranslator.jsx  # Main editor panel
│   │   │   ├── Header.jsx
│   │   │   ├── PDFUploader.jsx
│   │   │   └── StatusBar.jsx
│   │   └── main.jsx
│   ├── Dockerfile
│   └── vite.config.js
├── docs/
│   └── assets/               # Documentation images
├── docker-compose.yaml       # Main orchestration file
├── .env.example              # Environment variable reference
└── README.md
```
Translate code:
- Open the application at http://localhost:3000.
- Select the source language using the pill buttons at the top-left.
- Select the target language using the pill buttons at the top-right.
- Paste or type your code in the left panel.
- Click Translate Code.
- View the result in the right panel and click Copy to copy it to the clipboard.
Upload a PDF:
- Scroll to the Upload PDF section below the code panels.
- Drag and drop a PDF file, or click to browse.
- Code is extracted automatically and placed in the source panel.
- Select your languages and translate as normal.
Dark mode:
The app defaults to dark mode. Click the theme toggle in the header to switch to light mode. Your preference is saved in localStorage.
- Use the largest model your hardware can sustain. `codellama:34b` produces the best translation quality; `codellama:7b` is faster and good for benchmarking.
- Lower `LLM_TEMPERATURE` (e.g., `0.1`) for more deterministic, literal translations. Raise it slightly (e.g., `0.3`–`0.5`) if you want more idiomatic rewrites.
- Keep inputs under `MAX_CODE_LENGTH`. Shorter, focused snippets translate more accurately than entire files. Split large files by class or function.
- On Apple Silicon, always run Ollama natively — never inside Docker. The MPS (Metal) GPU backend delivers 5–10x the throughput of CPU-only inference.
- On Linux with an NVIDIA GPU, set `CUDA_VISIBLE_DEVICES` before starting Ollama to target a specific GPU.
- For enterprise remote APIs, choose a model with a large context window (≥16k tokens) to avoid truncation on longer inputs.
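The "split large files by class or function" tip can be automated for Python inputs with the standard-library `ast` module. The function below is a sketch of ours, not part of CodeTrans:

```python
import ast

# Split a Python source file at top-level functions and classes so each
# chunk stays well under MAX_CODE_LENGTH when submitted separately.
def split_by_top_level_defs(source: str) -> list[str]:
    tree = ast.parse(source)
    return [
        ast.get_source_segment(source, node)
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]
```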
The table below compares inference performance across different providers, deployment modes, and hardware profiles using a standardized code-translation workload (averaged over 3 runs).
| Provider | Model | Deployment | Context Window | Avg Input Tokens | Avg Output Tokens | Avg Tokens / Request | P50 Latency (ms) | P95 Latency (ms) | Throughput (req/s) | Hardware |
|---|---|---|---|---|---|---|---|---|---|---|
| Ollama | `qwen3:4b-instruct` | Local | 8K | 218 | 210.3 | 428.3 | 10,361 | 10,521 | 0.1186 | Apple Silicon (Metal), MacBook Pro M4 |
| vLLM | `Qwen3-4B-Instruct-2507` | Local | 4K | 218 | 211.3 | 429.3 | 11,965 | 18,806 | 0.0706 | Apple Silicon (Metal), MacBook Pro M4 |
| Intel OPEA EI | `Qwen/Qwen3-4B-Instruct-2507` | Enterprise (on-prem) | 8.1K | 218 | 211.7 | 429.7 | 12,732 | 13,277 | 0.1036 | CPU-only (Xeon) |
| OpenAI (Cloud) | `gpt-4o-mini` | API (cloud) | 128K | 216.7 | 204.7 | 421.3 | 4,563 | 6,969 | 0.2126 | N/A |
Notes:
- Context Window for Ollama (8K) and vLLM (4K) reflects the `LLM_MAX_TOKENS` / `--max-model-len` used during benchmarking, not the model's native 262K context. vLLM shares its 4K context between input and output tokens.
- All benchmarks use the same CodeTrans translation prompt and identical inputs (3 runs: small python→java, medium python→rust, large python→go). Token counts may vary slightly per run due to non-deterministic model output.
- Ollama on Apple Silicon uses Metal (MPS) GPU acceleration — running it inside Docker would fall back to CPU-only inference. The `qwen3:4b-instruct` tag must be used (not `qwen3:4b`) to disable the default thinking mode.
- vLLM on Apple Silicon uses vllm-metal — the standard `pip install vllm` does not support macOS.
- Intel OPEA Enterprise Inference runs on Intel Xeon CPUs without GPU acceleration.
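For reference, the P50/P95 latency columns above can be reproduced from raw per-request timings with a simple nearest-rank percentile. This is a generic sketch, not the exact script used for the benchmark:

```python
# Nearest-rank percentile over raw latency samples (milliseconds).
def percentile(samples: list[float], p: float) -> float:
    ordered = sorted(samples)
    k = round(p / 100 * (len(ordered) - 1))
    return ordered[max(0, min(len(ordered) - 1, k))]

# Example with made-up timings; a real run collects one value per request.
latencies_ms = [9800, 10361, 10450, 10521, 11002]
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
```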
A 4-billion-parameter open-weight code model from Alibaba's Qwen team (July 2025 release), designed for on-prem and edge deployment.
| Attribute | Details |
|---|---|
| Parameters | 4.0B total (3.6B non-embedding) |
| Architecture | Transformer with Grouped Query Attention (GQA) — 36 layers, 32 Q-heads / 8 KV-heads |
| Context Window | 262,144 tokens (256K) native |
| Reasoning Mode | Non-thinking only (Instruct-2507 variant). Separate Thinking-2507 variant available with always-on chain-of-thought |
| Tool / Function Calling | Supported; MCP (Model Context Protocol) compatible |
| Structured Output | JSON-structured responses supported |
| Multilingual | 100+ languages and dialects |
| Code Benchmarks | MultiPL-E: 76.8%, LiveCodeBench v6: 35.1%, BFCL-v3 (tool use): 61.9 |
| Quantization Formats | GGUF (Q4_K_M ~2.5 GB, Q8_0 ~4.3 GB), AWQ (int4), GPTQ (int4), MLX (4-bit ~2.3 GB) |
| Inference Runtimes | Ollama, vLLM, llama.cpp, LMStudio, SGLang, KTransformers |
| Fine-Tuning | Full fine-tuning and adapter-based (LoRA); 5,000+ community adapters on HuggingFace |
| License | Apache 2.0 |
| Deployment | Local, on-prem, air-gapped, cloud — full data sovereignty |
OpenAI's cost-efficient multimodal model, accessible exclusively via cloud API.
| Attribute | Details |
|---|---|
| Parameters | Not publicly disclosed |
| Architecture | Multimodal Transformer (text + image input, text output) |
| Context Window | 128,000 tokens input / 16,384 tokens max output |
| Reasoning Mode | Standard inference (no explicit chain-of-thought toggle) |
| Tool / Function Calling | Supported; parallel function calling |
| Structured Output | JSON mode and strict JSON schema adherence supported |
| Multilingual | Broad multilingual support |
| Code Benchmarks | MMMLU: ~87%, strong HumanEval and MBPP scores |
| Pricing | $0.15 / 1M input tokens, $0.60 / 1M output tokens (Batch API: 50% discount) |
| Fine-Tuning | Supervised fine-tuning via OpenAI API |
| License | Proprietary (OpenAI Terms of Use) |
| Deployment | Cloud-only — OpenAI API or Azure OpenAI Service. No self-hosted or on-prem option |
| Knowledge Cutoff | October 2023 |
| Capability | Qwen3-4B-Instruct-2507 | GPT-4o-mini |
|---|---|---|
| Code translation | Yes | Yes |
| Function / tool calling | Yes | Yes |
| JSON structured output | Yes | Yes |
| On-prem / air-gapped deployment | Yes | No |
| Data sovereignty | Full (weights run locally) | No (data sent to cloud API) |
| Open weights | Yes (Apache 2.0) | No (proprietary) |
| Custom fine-tuning | Full fine-tuning + LoRA adapters | Supervised fine-tuning (API only) |
| Quantization for edge devices | GGUF / AWQ / GPTQ / MLX | N/A |
| Multimodal (image input) | No | Yes |
| Native context window | 256K | 128K |
Both models support code translation, function calling, and JSON-structured output. However, only Qwen3-4B offers open weights, data sovereignty, and local deployment flexibility — making it suitable for air-gapped, regulated, or cost-sensitive environments. GPT-4o-mini offers lower latency and higher throughput via OpenAI's cloud infrastructure, with added multimodal capabilities.
All providers are configured via the `.env` file. Set `INFERENCE_PROVIDER=remote` for any cloud or API-based provider, and `INFERENCE_PROVIDER=ollama` for local inference.
```shell
INFERENCE_PROVIDER=remote
INFERENCE_API_ENDPOINT=https://api.openai.com
INFERENCE_API_TOKEN=sk-...
INFERENCE_MODEL_NAME=gpt-4o
```

Recommended models: `gpt-4o`, `gpt-4o-mini`, `gpt-4-turbo`.
Groq provides OpenAI-compatible endpoints with extremely fast inference (LPU hardware).
```shell
INFERENCE_PROVIDER=remote
INFERENCE_API_ENDPOINT=https://api.groq.com/openai
INFERENCE_API_TOKEN=gsk_...
INFERENCE_MODEL_NAME=llama3-70b-8192
```

Recommended models: `llama3-70b-8192`, `mixtral-8x7b-32768`, `llama-3.1-8b-instant`.
Runs inference locally on the host machine with full GPU acceleration.
- Install Ollama: https://ollama.com/download
- Pull a model:
```shell
# Production — best translation quality (~20 GB)
ollama pull codellama:34b

# Testing / SLM benchmarking (~4 GB, fast)
ollama pull codellama:7b

# Other strong code models
ollama pull deepseek-coder:6.7b
ollama pull qwen2.5-coder:7b
ollama pull codellama:13b
```

- Confirm Ollama is running:

```shell
curl http://localhost:11434/api/tags
```

- Configure `.env`:

```shell
INFERENCE_PROVIDER=ollama
INFERENCE_API_ENDPOINT=http://host.docker.internal:11434
INFERENCE_MODEL_NAME=codellama:7b
# INFERENCE_API_TOKEN is not required for Ollama
```

OpenRouter provides a unified API across hundreds of models from different providers.
```shell
INFERENCE_PROVIDER=remote
INFERENCE_API_ENDPOINT=https://openrouter.ai/api
INFERENCE_API_TOKEN=sk-or-...
INFERENCE_MODEL_NAME=meta-llama/llama-3.1-70b-instruct
```

Recommended models: `meta-llama/llama-3.1-70b-instruct`, `deepseek/deepseek-coder`, `qwen/qwen-2.5-coder-32b-instruct`.
Any enterprise gateway that exposes an OpenAI-compatible /v1/completions or /v1/chat/completions endpoint works without code changes.
GenAI Gateway (LiteLLM-backed):
```shell
INFERENCE_PROVIDER=remote
INFERENCE_API_ENDPOINT=https://genai-gateway.example.com
INFERENCE_API_TOKEN=your-litellm-master-key
INFERENCE_MODEL_NAME=codellama/CodeLlama-34b-Instruct-hf
```

If the endpoint uses a private domain mapped in /etc/hosts, also set:

```shell
LOCAL_URL_ENDPOINT=your-private-domain.internal
```

To switch providers:

- Edit `.env` with the new provider's values.
- Restart the backend container: `docker compose restart transpiler-api`

No rebuild is needed — all settings are injected at runtime via environment variables.
All variables are defined in .env (copied from .env.example). The backend reads them at startup via python-dotenv.
| Variable | Description | Default | Type |
|---|---|---|---|
| `INFERENCE_PROVIDER` | `remote` for any OpenAI-compatible API; `ollama` for local inference | `remote` | string |
| `INFERENCE_API_ENDPOINT` | Base URL of the inference service (no `/v1` suffix) | — | string |
| `INFERENCE_API_TOKEN` | Bearer token / API key. Not required for Ollama | — | string |
| `INFERENCE_MODEL_NAME` | Model identifier passed to the API | `codellama/CodeLlama-34b-Instruct-hf` | string |
| Variable | Description | Default | Type |
|---|---|---|---|
| `LLM_TEMPERATURE` | Sampling temperature. Lower = more deterministic output (0.0–2.0) | `0.2` | float |
| `LLM_MAX_TOKENS` | Maximum tokens in the translated output | `4096` | integer |
| `MAX_CODE_LENGTH` | Maximum input code length in characters | `4000` | integer |
| Variable | Description | Default | Type |
|---|---|---|---|
| `MAX_FILE_SIZE` | Maximum PDF upload size in bytes (default: 10 MB) | `10485760` | integer |
| Variable | Description | Default | Type |
|---|---|---|---|
| `CORS_ALLOW_ORIGINS` | Allowed CORS origins (comma-separated or `*`). Restrict in production | `["*"]` | string |
| Variable | Description | Default | Type |
|---|---|---|---|
| `BACKEND_PORT` | Port the FastAPI server listens on | `5001` | integer |
| `LOCAL_URL_ENDPOINT` | Private domain in /etc/hosts the container must resolve. Leave as `not-needed` if not applicable | `not-needed` | string |
| `VERIFY_SSL` | Set `false` only for environments with self-signed certificates | `true` | boolean |
- Framework: FastAPI (Python 3.11+) with Uvicorn ASGI server
- LLM Integration: `openai` Python SDK — works with any OpenAI-compatible endpoint (remote or Ollama)
- Local Inference: Ollama — runs natively on the host with full Metal (MPS) or CUDA GPU acceleration
- PDF Processing: PyMuPDF (`fitz`) for text and code extraction from uploaded documents
- Config Management: `python-dotenv` for environment variable injection at startup
- Data Validation: Pydantic v2 for request/response schema enforcement
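A minimal Pydantic v2 schema in the spirit of `api/models.py` — the field names here are assumptions for illustration, not the repo's actual schema:

```python
from pydantic import BaseModel, field_validator

class TranslateRequest(BaseModel):
    """Illustrative request schema; the real one lives in api/models.py."""
    source_language: str
    target_language: str
    code: str

    @field_validator("code")
    @classmethod
    def code_not_empty(cls, value: str) -> str:
        # Reject whitespace-only submissions before they reach the LLM.
        if not value.strip():
            raise ValueError("code must not be empty")
        return value
```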
- Framework: React 18 with Vite (fast HMR and production bundler)
- Styling: Tailwind CSS v3 with a custom `surface-*` dark-mode color palette
- Production Server: Nginx — serves the built assets and proxies `/api/` to the backend container
- UI Features: language pill selectors, side-by-side code editor, drag-and-drop PDF upload, real-time character counter, one-click copy, dark/light theme toggle
For common issues and solutions, see TROUBLESHOOTING.md.
Issue: Backend returns 500 on translate
```shell
# Check backend logs for error details
docker compose logs transpiler-api

# Verify the inference endpoint and token are set correctly
grep INFERENCE .env
```

- Confirm `INFERENCE_API_ENDPOINT` is reachable from your machine.
- Verify `INFERENCE_API_TOKEN` is valid and has the correct permissions.
Issue: Ollama connection refused
```shell
# Confirm Ollama is running on the host
curl http://localhost:11434/api/tags

# If not running, start it
ollama serve
```

Issue: Ollama is slow / appears to be CPU-only
- Ensure Ollama is running natively on the host, not inside Docker.
- On macOS, verify the Ollama app is using MPS in Activity Monitor (GPU History).
- See the Ollama section for correct setup.
Issue: SSL certificate errors
```shell
# In .env
VERIFY_SSL=false

# Restart the backend
docker compose restart transpiler-api
```

Issue: PDF upload fails or returns no code
- Max file size: 10 MB (`MAX_FILE_SIZE`)
- Supported format: PDF only (text-based; scanned image PDFs are not supported)
- Ensure the file is not corrupted or password-protected
Issue: Frontend cannot connect to API
```shell
# Verify both containers are running
docker compose ps

# Check CORS settings
grep CORS .env
```

Ensure `CORS_ALLOW_ORIGINS` includes the frontend origin (e.g., http://localhost:3000).
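One common pitfall is how the comma-separated `CORS_ALLOW_ORIGINS` value becomes the origin list the middleware expects. A plausible parsing sketch — the helper name is ours, not necessarily how `config.py` does it:

```python
import os

def parse_origins(raw: str) -> list[str]:
    """Turn 'a, b, c' or '*' into the list FastAPI's CORSMiddleware expects."""
    if raw.strip() == "*":
        return ["*"]
    return [origin.strip() for origin in raw.split(",") if origin.strip()]

# Read the variable with a permissive default, as the backend's defaults suggest.
origins = parse_origins(os.getenv("CORS_ALLOW_ORIGINS", "*"))
```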
Issue: Private domain not resolving inside container
Set LOCAL_URL_ENDPOINT=your-private-domain.internal in .env — this adds the host-gateway mapping for the container.
Enable verbose logging for deeper inspection:
```shell
# Not a built-in env var — increase the FastAPI log level via Uvicorn.
# Edit the docker-compose.yaml command, or run locally:
uvicorn server:app --reload --port 5001 --log-level debug
```

Or view real-time container logs:

```shell
docker compose logs -f transpiler-api
```

This project is licensed under the terms in the LICENSE file — see that file for details.
CodeTrans is provided as-is for demonstration and educational purposes. While we strive for accuracy:
- Translated code should be reviewed by a qualified engineer before use in production systems
- Do not rely solely on AI-generated translations without testing and validation
- Do not submit confidential or proprietary code to third-party API providers without reviewing their data handling policies
- The quality of translation depends on the underlying model and may vary across language pairs and code complexity
For full disclaimer details, see DISCLAIMER.md.
