🚀 Define Workflows in Natural Language — Your Zero-Code AI File Expert
📄 Full-Format Docs
🧠 Natural-Language SOPs
🔌 One-Click Tools
🛡️ Rock-Solid Stability
⚡ Token-Smart
English | 中文
Pixelle Studio is an open-source AI Agent workspace built for expert-level file processing. Simply describe your workflow in natural language to turn daily task SOPs into AI-executable Skills — no complex variable passing or coding knowledge required. Plug in external tools via MCP, and let the Agent reliably generate PDFs, PPTs, Excel files, and more — with three-layer failover for rock-solid stability and progressive loading to save tokens.
💡 Not just a chatbot — teach AI your workflow in plain language, zero-code your way to an expert file processing agent.
- 📄 Full-Format Docs: PDF, Excel, PPT, Word, Markdown, HTML
- 🧠 Natural-Language SOPs: describe workflows in plain language, zero-code
- 🔌 One-Click Tools: plug in search, maps, video & more via MCP
- 🛡️ Rock-Solid Stability: three-layer failover + WebSocket heartbeat
- ⚡ Token-Smart: progressive skill loading saves 90%+ tokens
- 🖥️ Built-in PTY terminal with smart security filtering
| Feature | Traditional AI Chat Tools | Pixelle Studio |
|---|---|---|
| Document Generation (PDF/PPT/Excel) | ❌ Text-only output | ✅ Generate files with live preview |
| Code Execution | ❌ None, or plugin-dependent | ✅ Built-in persistent terminal, multi-step |
| Custom Skills | ❌ Not supported | ✅ Define Skills in natural language, zero-code |
| External Tools (MCP) | ❌ Closed ecosystem | ✅ Open protocol, plug & play |
| Context Management | ❌ Passive truncation | ✅ Smart compression with rich history reconstruction |
| Multi-Model Failover | ❌ Single model | ✅ Three-layer failover + WebSocket heartbeat |
| Token Consumption | 🔴 Full context loading | 🟢 Progressive on-demand loading |
💬 "I'd like to drive from Seattle Airport to Mount Rainier, could you please help me with my itinerary? I'm leaving tomorrow morning."
Agent loads map skills → calls map API → plans the route → generates itinerary document with live preview
Fully bilingual — the same natural language experience works seamlessly in Chinese too 👇
💬 "Design a professional PPT presentation on '2025 AI Technology Trends' with a cover page, a table of contents, 3 content slides with charts, and a summary page. Use a modern, clean design with a professional color scheme."
Agent combines multiple Skills → web search → content scraping → structured analysis → auto-generates PPT
💬 "I have a yearly sales dataset. Summarize revenue by product line per quarter, calculate YoY growth rates, highlight anomalies in red, and generate a financial analysis report."
Agent reads raw Excel → data cleaning & structuring → uses native Excel formulas (SUM/VLOOKUP/growth rate formulas, not Python-hardcoded values) for summaries → conditional formatting to flag anomalies → recalc.py verifies zero formula errors → outputs a professional financial report
Why more accurate? Traditional AI tools calculate numbers in Python and paste them into cells — change the data, and everything breaks. Pixelle Studio insists on native Excel formula-driven output, producing "living" spreadsheets — edit the source data, and all summaries, growth rates, and charts update automatically.
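The formula-driven idea can be sketched in a few lines. The helper names below are illustrative, not Pixelle Studio's actual API:

```python
def quarter_sum_formula(col: str, first_row: int, last_row: int) -> str:
    """Emit a native Excel SUM formula instead of a Python-computed constant,
    so the cell recalculates whenever the source data changes."""
    return f"=SUM({col}{first_row}:{col}{last_row})"

def yoy_growth_formula(current_cell: str, prior_cell: str) -> str:
    """Year-over-year growth as a live formula rather than a pasted number."""
    return f"=({current_cell}-{prior_cell})/{prior_cell}"

# A "living" spreadsheet cell: edit B2:B4 later and B5 updates itself.
print(quarter_sum_formula("B", 2, 4))   # =SUM(B2:B4)
print(yoy_growth_formula("B5", "A5"))   # =(B5-A5)/A5
```

Writing formula strings into cells (rather than computed values) is what keeps the output spreadsheet alive when the source data is edited.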
💬 "Build me a Snake game"
Agent writes HTML/CSS/JS → generates a runnable game file → built-in preview for instant play
Don't just use built-in skills — create your own to teach the Agent your unique workflows:
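For illustration, a user-defined skill is just a folder with a SKILL.md describing the workflow in plain language. The example below is hypothetical; consult the project docs for the exact schema:

```markdown
# weekly-report

Description: Turn raw meeting notes into a formatted weekly report PDF.

## Steps
1. Read the notes file the user provides.
2. Group items into "Done", "In Progress", and "Blocked".
3. Render the result as a one-page PDF using the built-in PDF skill.
```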
```
┌──────────────────────────┐
│    Frontend (Next.js)    │
│  Chat + Skills + Preview │
└────────────┬─────────────┘
             │ WebSocket
┌────────────▼─────────────┐
│    Backend (FastAPI)     │
│     SkillAgent Core      │
└────────────┬─────────────┘
             │
┌────────────┬────────────┼────────────┬────────────┐
▼            ▼            ▼            ▼            ▼
┌─────────────┐┌──────────┐┌───────────┐┌───────────┐┌──────────┐
│ 9 Built-in  ││   PTY    ││   Skill   ││    MCP    ││ Context  │
│   Tools     ││ Terminal ││  System   ││   Tools   ││ Manager  │
│ read/write  ││Persistent││Progressive││ External  ││   Auto   │
│ exec/shell  ││ Sessions ││  Loading  ││Integration││Compaction│
└─────────────┘└──────────┘└───────────┘└───────────┘└──────────┘
```
🔋 Progressive Skill Loading — The Secret to Token Efficiency
Unlike traditional approaches that stuff all skills into the System Prompt, we use three-level progressive loading:
| Level | Content | Token Cost | When Loaded |
|---|---|---|---|
| Level 1 | Skill metadata (name + description) | ~50 tokens/skill | Every conversation |
| Level 2 | Full SKILL.md documentation | ~500-2000 tokens | On-demand by Agent |
| Level 3 | Auxiliary files (scripts/references) | Variable | On-demand by Agent |
Result: 12 built-in Skills consume only ~600 tokens of metadata, while traditional approaches might require 20,000+ tokens.
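The three-level idea can be sketched as follows. This is an illustrative file-layout walk, not the project's actual loader (which lives under `backend/app/skills/`):

```python
from pathlib import Path

def load_metadata(skills_dir: str) -> dict[str, str]:
    """Level 1: inject only each skill's name + first description line
    into the System Prompt (~50 tokens per skill)."""
    meta = {}
    for skill in sorted(Path(skills_dir).iterdir()):
        doc = skill / "SKILL.md"
        if doc.is_file():
            meta[skill.name] = doc.read_text().splitlines()[0]
    return meta

def load_full(skills_dir: str, name: str) -> str:
    """Level 2: the full SKILL.md, fetched only when the Agent selects the skill."""
    return (Path(skills_dir) / name / "SKILL.md").read_text()
```

Level 3 (auxiliary scripts and references) follows the same pattern: files are read only when the Agent asks for them.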
🖥️ Persistent Pseudo-Terminal (PTY) — Beyond Code Execution
Built on pexpect, our persistent Shell Sessions go beyond one-shot code execution:
```python
# Variables persist across multiple calls!
shell_exec("import pandas as pd", shell_type="python")
shell_exec("df = pd.DataFrame({'a': [1, 2, 3]})", shell_type="python")
shell_exec("print(df.describe())", shell_type="python")  # df still exists!
```

Advantages:
- ✅ Variable Persistence — State maintained across calls
- ✅ Multi-language — Bash / Python / IPython
- ✅ Smart Security — Blocks dangerous commands while allowing legitimate patterns (e.g. `python -c "stmt1; stmt2"`)
- ✅ Auto-recovery — Automatic restart on session crash
- ✅ Auto-cleanup — Idle sessions automatically recycled
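The persistence guarantee is easy to picture with a stdlib stand-in. The real implementation uses pexpect-backed PTY sessions, not this in-process toy:

```python
import code
import contextlib
import io

class PersistentSession:
    """Toy session: state survives across exec() calls,
    analogous to shell_exec's python mode."""

    def __init__(self):
        self.interp = code.InteractiveInterpreter()

    def exec(self, src: str) -> str:
        # Capture anything the submitted line prints.
        buf = io.StringIO()
        with contextlib.redirect_stdout(buf):
            self.interp.runsource(src)
        return buf.getvalue()

session = PersistentSession()
session.exec("x = 21")
print(session.exec("print(x * 2)"), end="")  # x still exists! prints 42
```

A real PTY session adds what this toy lacks: multiple shell types, crash recovery, and security filtering of the submitted source.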
🛡️ Three-Layer Failover + WebSocket Heartbeat — Service That Never Stops
```
Request failed?
├─ Layer 1: Auth Failover → Switch API Key / Base URL
├─ Layer 2: Model Failover → Switch to fallback model (gpt-4o → gpt-4o-mini → ...)
└─ Layer 3: Thinking Failover → Downgrade thinking depth

Long-running task?
└─ WebSocket Heartbeat → Periodic pings keep the connection alive
```
Even if the primary model faces rate limits, timeouts, or quota exhaustion, the system automatically switches to backup plans. For long-running tasks like PPT generation, WebSocket heartbeat keeps the connection alive — no more "no signal" drops.
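The three layers amount to a nested retry loop. The sketch below uses a hypothetical call signature, not the project's internal API:

```python
class TransientError(Exception):
    """Rate limit, timeout, or quota exhaustion: anything worth retrying elsewhere."""

def call_with_failover(prompt, call, auths, models,
                       thinking_levels=("deep", "light", "off")):
    # Try every (auth, model, thinking) combination in priority order.
    for auth in auths:                            # Layer 1: auth failover
        for model in models:                      # Layer 2: model failover
            for thinking in thinking_levels:      # Layer 3: thinking failover
                try:
                    return call(prompt, auth, model, thinking)
                except TransientError:
                    continue
    raise RuntimeError("all failover layers exhausted")
```

Only after every combination fails does the request surface an error to the user.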
📐 Smart Context Management — Never Overflow
- Context Window Guard — Real-time token usage monitoring with automatic threshold alerts
- Auto-Compaction — When context reaches ~70% usage, automatically generates a summary to compress history
- Rich History Reconstruction — Multi-turn conversations retain tool calls, code execution, and file outputs for coherent context
- Multi-model Aware — Auto-detects model context window sizes (GPT-4o 128K / Claude 200K / Gemini 1M)
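The ~70% guard is simple arithmetic. The function below is an illustration, not the project's actual code:

```python
# Context windows per model family, as cited above.
WINDOWS = {"gpt-4o": 128_000, "claude": 200_000, "gemini": 1_000_000}

def should_compact(used_tokens: int, model: str, threshold: float = 0.70) -> bool:
    """Trigger summary-based history compression near the window limit."""
    return used_tokens / WINDOWS[model] >= threshold

print(should_compact(90_000, "gpt-4o"))   # True: 90K / 128K ≈ 70.3%
print(should_compact(90_000, "claude"))   # False: only 45% of 200K
```

Because the threshold is relative to the detected window, the same conversation compacts on a 128K model long before it would on a 1M model.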
Seamlessly connect external tools via the Model Context Protocol open standard:
Built-in Skills already support these MCP tools:
| Tool | Function | Use Case |
|---|---|---|
| 🔍 Exa Search | AI-native search engine | Deep research, info gathering |
| 🔍 Bing Search | General web search | Real-time information queries |
| 🗺️ AMap (Gaode) | Route planning, POI search | Travel planning |
| 🌐 Web Fetch | Web content scraping | Data collection |
| 🎬 Social Media Video | Video content parsing | Content creation |
You can also integrate any MCP-compatible tool service!
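Many MCP clients register external servers with a JSON stanza along these lines. Both the server name and the package here are hypothetical; check the project docs for Pixelle Studio's exact format:

```json
{
  "mcpServers": {
    "web-fetch": {
      "command": "npx",
      "args": ["-y", "@example/mcp-web-fetch"]
    }
  }
}
```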
- Python 3.10+ and uv (Python package manager)
- Node.js 20+ and npm
- Docker & Docker Compose (optional, for containerized deployment)
```bash
# 1. Clone the repo
git clone https://github.com/AIDC-AI/Pixelle-Studio.git
cd Pixelle-Studio

# 2. One-command start (auto-installs dependencies on first run)
./start.sh
```

💡 After starting, open http://localhost:3000 and click the ⚙️ Settings button in the top-right corner to configure your API Key, Base URL, and Model.
```bash
# 1. Clone the repo
git clone https://github.com/AIDC-AI/Pixelle-Studio.git
cd Pixelle-Studio

# 2. Build and start all services
docker compose up -d

# 3. View logs (optional)
docker compose logs -f
```

Then visit 👉 http://localhost:3000 and configure your API Key in ⚙️ Settings.
📦 Docker Compose Commands Reference
```bash
docker compose up -d            # Start all services in background
docker compose up               # Start in foreground (see logs directly)
docker compose down             # Stop all services
docker compose logs -f          # Follow all logs
docker compose logs -f backend  # Follow backend logs only
docker compose up --build       # Rebuild images and start
docker compose ps               # Show running services status
```

Data Persistence: The following data is persisted through Docker volumes:

- `backend-data` — SQLite database
- `backend-scripts` — Generated files (PDF/PPT/Excel/HTML etc.)
- `backend-skills` — User-defined skills
- `backend-logs` — Application logs

To reset all data: `docker compose down -v`
```bash
# Backend
cd backend
uv sync      # Install Python dependencies (creates .venv automatically)
npm install  # Install Node.js dependencies (for PPT/document generation skills)
.venv/bin/python3 -m uvicorn app.main:app --reload --host 0.0.0.0 --port 8001

# Frontend (new terminal)
cd frontend
npm install  # Install Node.js dependencies
npm run dev  # Start dev server (port 3000)
```

Then visit 👉 http://localhost:3000 and configure your API Key in ⚙️ Settings.
LLM Settings (API Key, Base URL, Model) are configured per-user through the ⚙️ Settings panel in the web UI — no environment files needed.
Infrastructure variables (only needed for Docker or custom deployments):
| Variable | Description | Default |
|---|---|---|
| `FRONTEND_PORT` | Frontend port | `3000` |
| `BACKEND_PORT` | Backend port | `8001` |
| `NEXT_PUBLIC_API_BASE` | Frontend → Backend API URL | `http://localhost:8001/api` |
| `NEXT_PUBLIC_WS_BASE` | Frontend → Backend WebSocket URL | `ws://localhost:8001/ws` |
| `JWT_SECRET` | JWT signing secret | Auto-generated |
Pixelle Studio provides the Agent with 9 ready-to-use tools:
| Tool | Function | Description |
|---|---|---|
| `shell_exec` | Persistent terminal | Multi-step execution with variable persistence & smart security |
| `exec` | Command execution | One-shot commands / background tasks |
| `read_file` | Read files | Supports skill files and user files |
| `write_file` | Write files | Create scripts / config files |
| `edit_file` | Edit files | Precise string replacement |
| `grep` | Search content | Regex support |
| `find` | Find files | Glob pattern matching |
| `ls` | List directory | Smart limiting to prevent overflow |
| `process` | Process management | Monitor / terminate background tasks |
```
Pixelle-Studio/
├── frontend/                  # Frontend (Next.js 16 + React 19)
│   ├── app/                   # App Router pages
│   ├── components/            # UI Components
│   │   ├── layout/chat/       # Chat interface
│   │   ├── layout/leftPanel/  # Sidebar (Sessions + Skills)
│   │   └── ui/                # Shared UI components
│   ├── hooks/                 # React Hooks
│   ├── lib/                   # API clients
│   ├── types/                 # TypeScript type definitions
│   └── Dockerfile             # Frontend container image
│
├── backend/                   # Backend (Python + FastAPI)
│   ├── app/
│   │   ├── agent.py           # SkillAgent core engine
│   │   ├── tools/             # 9 built-in tools
│   │   ├── context/           # Context management (Guard + Compaction)
│   │   ├── skills/            # Skill loader
│   │   ├── config/            # Failover configuration
│   │   └── routes/            # REST API routes
│   ├── skills/                # Skills library
│   │   ├── default/           # Built-in skills (PDF/PPT/Excel/Search...)
│   │   └── <user_id>/         # User-defined skills
│   ├── scripts/               # Generated file storage
│   └── Dockerfile             # Backend container image
│
├── docker-compose.yml         # Docker Compose orchestration
├── assets/                    # README assets
└── start.sh                   # One-command start script
```
Backend: Python 3.10+ · FastAPI · OpenAI API · WebSocket · SQLAlchemy · pexpect
Frontend: Next.js 16 · React 19 · TypeScript · Tailwind CSS
Infrastructure: SQLite · MCP Protocol · Docker Compose
We welcome all contributions, whether bug reports, feature suggestions, or code submissions.

- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the Apache License 2.0.
⭐ If this project helps you, please give us a Star!