Local-first AI assistant with mode-based routing, tool calling, voice control, and a Flask web UI.
Important
NeonVoice-Core powers NeonAI with scored routing across tools, web lookup, local LLM flows, and voice assistant features.
| Area | Highlights |
|---|---|
| Core AI | scored routing, confidence gate, local-first pipeline |
| Modes | casual, coding, movie, exam, voice assistant |
| Tools | weather, notes, calculator, browser, music, web reader |
| Stack | Flask, Ollama, Whisper, ChromaDB, GPT-SoVITS |
Tip
Add your next two screenshots later by replacing the placeholder URLs below.
Two-screenshot showcase block
<table>
<tr>
<td width="50%" align="center">
<strong>Main Chat UI</strong>
<br><br>
<img src="PASTE_SCREENSHOT_1_URL_HERE" width="100%" alt="NeonVoice-Core screenshot 1" />
</td>
<td width="50%" align="center">
<strong>Voice Assistant / Mode View</strong>
<br><br>
<img src="PASTE_SCREENSHOT_2_URL_HERE" width="100%" alt="NeonVoice-Core screenshot 2" />
</td>
</tr>
</table>

NeonVoice-Core powers NeonAI and routes requests through a small local AI system instead of sending everything straight to a single model.
- direct tools for weather, notes, calculator, browser actions, system info, and music
- scored routing between system commands, tools, web lookup, and local LLM generation
- separate modes for chat, coding, movies, exam/RAG, and voice assistant behavior
- local-first operation with Ollama, Whisper, and optional GPT-SoVITS
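
A minimal sketch of how scored routing between system commands, tools, web lookup, and the local LLM could work, assuming plain keyword overlap as the score. The hint sets, the thresholds, and the `RouteDecision` name are illustrative assumptions, not NeonAI's actual router.

```python
import re
from dataclasses import dataclass

# Hypothetical keyword hints per route (assumption for illustration only).
ROUTE_HINTS = {
    "system": {"shutdown", "restart", "volume", "brightness", "lock"},
    "tool": {"weather", "calculate", "note", "notes", "battery", "convert"},
    "web": {"latest", "news", "today", "search"},
}


@dataclass
class RouteDecision:
    route: str            # "system", "tool", "web", "llm", or "clarify"
    scores: dict          # raw score per route
    candidates: tuple = ()  # the two closest routes when clarification is needed


def score_routes(text):
    """Score each route by counting how many of its hint words appear."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return {route: len(words & hints) for route, hints in ROUTE_HINTS.items()}


def decide(text, min_score=1, margin=1):
    """Pick a route, fall back to the LLM, or ask when the top two are too close."""
    scores = score_routes(text)
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (best, best_score), (second, second_score) = ranked[0], ranked[1]
    if best_score < min_score:
        return RouteDecision("llm", scores)
    if best_score - second_score < margin:
        return RouteDecision("clarify", scores, (best, second))
    return RouteDecision(best, scores)


print(decide("Weather in Delhi").route)
print(decide("tell me a joke").route)
```

Because the scores are computed from fixed rules, the same input always yields the same route, which is what makes this style of routing deterministic.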
Modes:

- `casual`: general chat with tool-first routing and optional web lookup
- `coding`: coding-focused responses using a dedicated code model
- `movie`: TMDB-powered movie cards, recommendations, and summaries
- `exam`: PDF-only retrieval mode backed by ChromaDB
- `voice_assistant`: speech input, command routing, and TTS output

Tools:

- weather
- calculator and conversions
- system information
- notes
- browser and search control
- webpage reader
- music lookup and YouTube handoff
python -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt
Copy-Item .env.example .env
python server.py

Required local services:
- Install Ollama
- Pull `llama3.2:3b`
- Pull `qwen2.5-coder`
Optional voice setup:
- install GPT-SoVITS
- set `GPT_SOVITS_DIR` or explicit GPT-SoVITS model paths in `.env`
Open http://localhost:5000
This repo intentionally keeps local-only files out of GitHub:
- `.env` and private tokens
- `user_data/` databases, uploads, notes, and runtime files
- local embedding cache files under `models/embeddings/`
- temp audio, test artifacts, and editor caches
If the local embedding cache is missing, NeonAI now falls back to the official sentence-transformers/all-MiniLM-L6-v2 model name so clones do not need your local cache committed to GitHub.
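
The fallback described above could be sketched as follows. The function name `resolve_embedding_model` and the "directory exists and is non-empty" check are assumptions for illustration; only the cache location and the fallback model name come from this README.

```python
from pathlib import Path

# Model name taken from the README; used when no local cache is present.
FALLBACK_MODEL = "sentence-transformers/all-MiniLM-L6-v2"


def resolve_embedding_model(cache_dir="models/embeddings"):
    """Return the local cache path if it holds files, else the hub model name."""
    cache = Path(cache_dir)
    if cache.is_dir() and any(cache.iterdir()):
        return str(cache)
    return FALLBACK_MODEL


print(resolve_embedding_model("this/path/does/not/exist"))
```

The returned string can then be passed to whatever loads the embedding model, so a fresh clone works without the cache being committed.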
python -m pytest -q

Original project write-up:
NeonAI V5 is a fully local AI system with mode-driven intelligence, tool calling, a voice assistant, and a premium UI, running entirely on your machine.
⚠️ This is not a chatbot wrapper.
NeonAI is an AI system with modes, rules, confidence gates, tool calling, memory, voice control, and decision pipelines. The LLM is a component, not the decision-maker.
| Principle | Description |
|---|---|
| System > Model | AI logic governs the LLM, not the other way around |
| Privacy First | Everything runs locally: your data never leaves your machine |
| Mode-Driven | Each mode has its own rules, memory, tools, and constraints |
| Scored Router | Deterministic routing (system/tool/web/LLM) with confidence + guards |
| Clarification Layer | N-best routing asks when top-2 intents are close |
| Context Memory Router | Follow-ups like "pause" resolve correctly using recent context |
| Tool Calling | Real tools (weather, calculator, browser, notes, music): instant, no LLM needed |
| Voice Control | Full voice assistant with system commands, TTS, and tool access |
| Secure | Session-isolated users, hashed passwords, safe math eval, auth-guarded endpoints |
| Experimental | Built to explore controlled AI design, not to be a product |

| Mode | Description |
|---|---|
| NEON AI | General chat with smart web search + local LLM hybrid. Calculator & weather tools built-in. |
| NEON CODE | Copy-paste ready code generation. Auto-switches from casual on coding intent. |
| NEON MOVIES | Trending carousel, movie details with genres/director/trailer/recommendations via TMDB. |
| NEON STUDY | PDF-based RAG pipeline. Internet blocked. If the answer isn't in the PDF, the AI refuses. |
| VOICE ASSISTANT | Full voice control: talk to Neon, use tools, control your PC. 20+ command types. |
💡 Each mode has isolated chat history: switching modes keeps each mode's conversation separate.
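
One way to picture isolated per-mode history is a store keyed by user and mode. This is a sketch under assumptions: the `ModeHistory` class and its in-memory dictionary are hypothetical, and the mode names are the ones listed earlier in this README.

```python
from collections import defaultdict

MODES = ("casual", "coding", "movie", "exam", "voice_assistant")


class ModeHistory:
    """Keeps one message list per (user, mode) pair so modes never share context."""

    def __init__(self):
        # Maps (user_id, mode) -> list of (role, text) tuples.
        self._store = defaultdict(list)

    def append(self, user_id, mode, role, text):
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode}")
        self._store[(user_id, mode)].append((role, text))

    def get(self, user_id, mode):
        # Return a copy so callers cannot mutate the stored history.
        return list(self._store.get((user_id, mode), []))


history = ModeHistory()
history.append("u1", "casual", "user", "hi")
print(history.get("u1", "casual"))
print(history.get("u1", "coding"))
```

Switching from `casual` to `coding` simply reads a different key, so nothing from the casual conversation leaks into the coding prompt.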
NeonAI has built-in tools that respond instantly without waiting for the LLM. Tool routing uses a Semantic Router (SentenceTransformers) for natural-language intent matching + actionability gates to reduce false triggers.
Tool calls also return structured data for the UI:
{
"type": "tool",
"tool": "weather",
"action": "get_weather",
"args": { "city": "Delhi" }
}

| Tool | Trigger Examples | Available In |
|---|---|---|
| Weather | "Weather in Delhi", "Temperature in New York" | Chat + Voice |
| Calculator | "Calculate 25 × 4 + 10", "Convert 100 km to miles" | Chat + Voice |
| System Info | "Battery level", "RAM usage", "CPU status" | Chat + Voice |
| Notes | "Save note: buy groceries", "Show my notes" | Chat + Voice |
| Web Reader | "Read this https://example.com" | Chat + Voice |
| Music | "Top 10 songs", "Play Drake", "Recommend some hip-hop" | Chat + Voice |
| Browser | "Search on YouTube", "Google machine learning" | Chat + Voice |
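
The structured payload in the JSON example could be built and checked on the Python side roughly like this. The helper names `tool_payload` and `is_tool_payload` are hypothetical; only the four keys (`type`, `tool`, `action`, `args`) come from the example.

```python
import json

REQUIRED_KEYS = ("type", "tool", "action", "args")


def tool_payload(tool, action, **args):
    """Build a payload with the same four keys as the JSON example."""
    return {"type": "tool", "tool": tool, "action": action, "args": args}


def is_tool_payload(obj):
    """Return True if obj looks like a structured tool payload."""
    return (
        isinstance(obj, dict)
        and all(key in obj for key in REQUIRED_KEYS)
        and obj["type"] == "tool"
        and isinstance(obj["args"], dict)
    )


payload = tool_payload("weather", "get_weather", city="Delhi")
print(json.dumps(payload, indent=2))
```

Keeping the payload as plain JSON-serializable data lets the chat UI and the voice layer consume the same tool result.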
User: "Weather in Delhi"
→ Semantic Router detects intent → weather tool
→ Instant response: 28°C, Partly Cloudy
→ No LLM call needed (< 1 second)
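
The intent-matching step in this flow can be illustrated with a dependency-free stand-in. NeonAI uses SentenceTransformers embeddings; the sketch below swaps them for simple token-count vectors so it runs anywhere, and the example phrases, the `match_tool` name, and the 0.5 threshold (playing the role of an actionability gate) are assumptions.

```python
import math
import re
from collections import Counter

# Hypothetical example phrases per tool (assumption for illustration).
TOOL_EXAMPLES = {
    "weather": ["weather in delhi", "temperature in new york"],
    "calculator": ["calculate 25 times 4", "convert 100 km to miles"],
    "notes": ["save note buy groceries", "show my notes"],
}


def embed(text):
    """Toy stand-in for a sentence embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def match_tool(query, threshold=0.5):
    """Return (tool, score); tool is None when no example is similar enough."""
    q = embed(query)
    best_tool, best_score = None, 0.0
    for tool, examples in TOOL_EXAMPLES.items():
        for example in examples:
            score = cosine(q, embed(example))
            if score > best_score:
                best_tool, best_score = tool, score
    if best_score < threshold:
        return None, best_score
    return best_tool, best_score


print(match_tool("Weather in Mumbai"))
print(match_tool("tell me a story"))
```

With real sentence embeddings the same structure applies: embed the query, compare against stored example vectors, and only dispatch to a tool when the best similarity clears the gate.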
Talk to Neon using Whisper (STT) + Llama 3.2 (brain) + GPT-SoVITS (TTS).
| Category | Examples |
|---|---|
| Apps | "Open Chrome", "Launch Spotify", "Open VS Code" |
| Web | "Open YouTube", "Go to GitHub" |
| Search | "Search Python tutorials", "Google the news" |
| YouTube | "Play lofi music on YouTube" |
| Media | "Pause", "Next song", "Stop music" |
| Volume | "Volume up", "Set volume to 50", "Mute" |
| Brightness | "Increase brightness", "Set brightness to 70" |
| Connectivity | "Turn on Bluetooth", "WiFi off", "Airplane mode" |
| System | "Shutdown", "Restart", "Lock screen", "Sleep" |
| Tools | "What's the weather?", "System info", "Save a note" |
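
To make the command categories above concrete, here is a small sketch that turns a transcript into a structured action. It uses regular expressions rather than the semantic routing NeonAI describes, and the pattern list, action names, and `parse_command` function are all hypothetical.

```python
import re

# Hypothetical patterns covering a few of the example phrases in the table.
COMMAND_PATTERNS = [
    (re.compile(r"^set volume to (\d{1,3})$"), "set_volume"),
    (re.compile(r"^set brightness to (\d{1,3})$"), "set_brightness"),
    (re.compile(r"^(?:open|launch) (.+)$"), "open_app"),
    (re.compile(r"^(pause|mute|next song|stop music)$"), "media"),
]


def parse_command(transcript):
    """Return {"action", "arg"} for a recognized command, else None."""
    text = transcript.lower().strip().rstrip(".!?")
    for pattern, action in COMMAND_PATTERNS:
        match = pattern.match(text)
        if match:
            arg = match.group(1)
            if action in ("set_volume", "set_brightness"):
                # Clamp percentages to the valid 0-100 range.
                arg = max(0, min(100, int(arg)))
            return {"action": action, "arg": arg}
    return None


print(parse_command("Set volume to 50"))
print(parse_command("Open Chrome"))
print(parse_command("What is love?"))
```

Returning `None` for anything unrecognized leaves room for the fallback path, where non-command speech goes on to the LLM.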
graph TD;
Text_Query-->intent_score_router;
Voice_Audio-->whisper_engine_STT;
intent_score_router-->Is_it_a_System_or_Tool_or_Web;
whisper_engine_STT-->Is_it_a_Command;
Is_it_a_System_or_Tool_or_Web-- SYSTEM -->system_command_executor;
Is_it_a_System_or_Tool_or_Web-- TOOL -->Execute_Local_Tool;
Is_it_a_System_or_Tool_or_Web-- WEB -->search_adapter;
Is_it_a_Command-- YES -->command_router_OS_Actions;
Is_it_a_System_or_Tool_or_Web-- LLM -->waterfall;
Is_it_a_Command-- NO -->waterfall;
waterfall-->Need_Web_Search;
Need_Web_Search-- YES -->search_adapter;
Need_Web_Search-- NO -->Local_LLM_Llama3_Qwen;
search_adapter-->Local_LLM_Llama3_Qwen;
Local_LLM_Llama3_Qwen-->confidence_gate;
confidence_gate-->Pass_Threshold;
Pass_Threshold-- NO -->Block_Regenerate;
Pass_Threshold-- YES -->Return_Text_TTS_GPT_SoVITS;
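
The final stage of the graph (confidence gate, then either block and regenerate or return the answer) might look roughly like this. It is a sketch under assumptions: `run_with_gate`, the 0.6 threshold, and the two-attempt limit are hypothetical, and `generate` and `score` stand in for the LLM call and the confidence evaluator.

```python
def run_with_gate(generate, score, threshold=0.6, max_attempts=2):
    """Call generate() until score(text) clears the threshold or attempts run out.

    generate: callable returning a candidate answer string.
    score: callable mapping an answer to a confidence in [0, 1].
    """
    best_text, best_conf = "", -1.0
    for attempt in range(1, max_attempts + 1):
        text = generate()
        conf = score(text)
        if conf >= threshold:
            return {"text": text, "confidence": conf, "passed": True, "attempts": attempt}
        # Remember the strongest rejected answer in case every attempt fails.
        if conf > best_conf:
            best_text, best_conf = text, conf
    return {"text": best_text, "confidence": best_conf, "passed": False, "attempts": max_attempts}


# Demo with canned answers instead of a real model.
answers = iter(["maybe", "Paris is the capital of France."])
result = run_with_gate(
    generate=lambda: next(answers),
    score=lambda text: 0.9 if "Paris" in text else 0.2,
)
print(result)
```

A caller can use the `passed` flag to decide whether to show the answer, show it with a low-confidence badge, or block it.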
- Animated Splash Screen: spinning ring, progress bar, "NEON AI" reveal on startup
- 15+ Neon Themes + Light/Dark mode with physics-based liquid toggle
- Rich Message Rendering: bold, headers, numbered lists as glass cards, rating badges
- Confidence Scoring Badges: the AI self-evaluates (0-100%) and displays a confidence badge under every answer
- Voice Customization: upload your own looping background video for the Voice UI panel
- Wallpaper Upload (Image/Video): drag & drop + progress bar + remove button
- Upload Limits: background video up to 50MB, image up to 10MB
- Music Cards: rich, clickable YouTube-linked gradient cards rendered natively in chat
- Code Blocks: syntax highlighted with copy-to-clipboard button
- Web Source Icons: favicon pills show which websites sourced the answer
- Movie Detail Cards: genre tags, director, runtime, trailer button, recommendation carousel
- Draggable Voice Button: GSAP Draggable, saves position
- Fully Responsive: desktop + mobile
NeonAI/
│
├── server.py  # Flask backend + API routing + /health endpoint
├── START_NEON.bat  # One-click launcher (Windows)
├── .env  # Environment config (NEON_SECRET, NGROK_TOKEN)
│
├── brain/  # Core AI logic
│   ├── waterfall.py  # Decision flow & smart routing
│   ├── intent_score_router.py  # Deterministic scored routing + N-best clarification
│   ├── router_state.py  # Per-user clarification + context memory state
│   ├── confidence_gate.py  # Hallucination control (0-100%)
│   ├── gk_engine.py  # General knowledge evaluation
│   └── memory.py  # Session & preference memory
│
├── models/  # LLM layer
│   ├── local_llm.py  # Llama 3.2 (chat) + Qwen 2.5 (coding) via Ollama
│   ├── hybrid_llm.py  # Web + LLM fusion
│   └── assistant_llm.py  # Llama 3.2 (voice) via Ollama
│
├── tools/  # Tool Calling System (Semantic Router)
│   ├── tool_router.py  # SentenceTransformer intent detection
│   ├── weather.py  # Weather via Open-Meteo (free, no key)
│   ├── calculator.py  # Safe AST math + unit conversions
│   ├── system_info.py  # CPU/RAM/disk/battery/GPU
│   ├── notes.py  # Thread-safe CRUD notes (JSON)
│   ├── music.py  # YouTube Music search + curated lists
│   ├── web_reader.py  # Fetch & summarize URLs
│   ├── browser_control.py  # Google/YouTube/URL opener
│   └── vision_offline.py  # Offline resume/image analysis via Ollama vision + PDF extraction
│
├── voice/  # Voice Assistant
│   ├── whisper_engine.py  # Speech-to-text (Whisper)
│   ├── tts_engine.py  # Text-to-speech (GPT-SoVITS), env-configurable
│   ├── command_router.py  # Semantic NLP → action routing (per-user state)
│   ├── llm_command_executor.py  # System command execution (volume, apps, etc.)
│   ├── model_loader.py  # Voice model management, env-configurable
│   └── reference_loader.py  # TTS reference audio, env-configurable
│
├── exam/  # NEON STUDY (PDF RAG)
│   ├── indexer.py  # PDF → ChromaDB vector indexing
│   └── retriever.py  # Strict PDF-only retrieval
│
├── web/  # Web adapters
│   ├── search_adapter.py  # Tavily / DuckDuckGo
│   └── movie_adapter.py  # TMDB (genres, trailer, recs)
│
├── utils/  # Utilities
│   ├── auth_db.py  # SQLite auth (hashed passwords, try/finally)
│   ├── movie_db.py  # Movie cache (SQLite, try/finally)
│   ├── network.py  # Internet policy & connectivity check
│   └── storage_paths.py  # Centralized path management
│
├── scripts/  # Dev tools
│   ├── command_tester.py  # Test command routing
│   ├── edge_case_tester.py  # Test edge cases
│   ├── generate_flow.py  # Generate architecture diagram
│   ├── movie_updater.py  # Batch movie cache updates
│   └── add_one_line_headers.py  # Bulk-add one-line file purpose headers
│
├── tests/
│   ├── test_routing.py  # Pytest test suite
│   └── test_false_triggers.py  # Regression tests for routing false triggers
│
├── templates/
│   ├── index.html  # Main chat UI
│   └── login.html  # Authentication page
│
└── static/
    ├── app.js  # Frontend logic (1500+ lines)
    ├── styles.css  # Premium styling (2500+ lines)
    └── wallpapers/  # Custom backgrounds
Software:
- Python 3.10+
- Ollama installed and running
- Models: `ollama pull llama3.2:3b` + `ollama pull qwen2.5-coder`
- (Optional) GPT-SoVITS for voice TTS
Hardware:
- CPU: Multi-core processor (Intel i5/Ryzen 5 or better)
- RAM: Minimum 8GB (16GB recommended)
- GPU (Optional): NVIDIA GPU with 6GB+ VRAM for Whisper & GPT-SoVITS acceleration
- Storage: Minimum 10GB free (SSD preferred)
pip install -r requirements.txt
python server.py

Or double-click START_NEON.bat
Open: http://localhost:5000
Visit http://localhost:5000/health to verify system status (Ollama, TTS, Internet).
- TMDB: movie posters, details, recommendations
- Tavily: higher-quality web search (free tier available)
| Variable | Purpose | Required |
|---|---|---|
| `NEON_SECRET` | Flask session signing key | Auto-generated default |
| `NGROK_TOKEN` | ngrok tunnel for remote access | Optional |
| `TTS_REF_AUDIO` | Custom TTS reference audio path | Optional |
| `GPT_SOVITS_GPT_MODEL` | GPT-SoVITS GPT model path | Optional |
| `GPT_SOVITS_SOVITS_MODEL` | GPT-SoVITS SoVITS model path | Optional |
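
Reading these variables with an auto-generated default for the session key could look like the sketch below. The `load_config` function and the use of `secrets.token_hex` are assumptions about one reasonable approach, not a description of what `server.py` does.

```python
import os
import secrets

OPTIONAL_KEYS = (
    "NGROK_TOKEN",
    "TTS_REF_AUDIO",
    "GPT_SOVITS_GPT_MODEL",
    "GPT_SOVITS_SOVITS_MODEL",
)


def load_config(env=None):
    """Collect settings from the environment; optional keys default to None."""
    env = os.environ if env is None else env
    config = {
        # Fall back to a random key when NEON_SECRET is unset or empty.
        "NEON_SECRET": env.get("NEON_SECRET") or secrets.token_hex(32),
    }
    for key in OPTIONAL_KEYS:
        config[key] = env.get(key)
    return config


config = load_config({"NGROK_TOKEN": "example-token"})
print(sorted(config))
```

A randomly generated key changes on every start, which invalidates existing sessions after a restart, so setting `NEON_SECRET` explicitly is the more stable choice.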
- No `eval()`: math uses safe AST-based evaluation
- Hashed passwords: PBKDF2 via Werkzeug
- Auth-guarded endpoints: all write/reset routes require login
- Session rotation: regenerated on login to prevent fixation
- HTTPOnly cookies: session cookies not accessible via JavaScript
- CORS locked: only localhost origins accepted
- Per-user isolation: separate history, notes, media, and pending commands
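
As an illustration of the first item, AST-based math evaluation generally means parsing the expression and walking only a whitelist of node types. The sketch below is a generic version of that technique; the `safe_eval` name, the supported operators, and the exponent cap are assumptions rather than the contents of `tools/calculator.py`.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval(expr):
    """Evaluate an arithmetic expression without ever calling eval()."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            left, right = _eval(node.left), _eval(node.right)
            # Cap exponents so inputs like 9**9**9 cannot hang the process.
            if isinstance(node.op, ast.Pow) and abs(right) > 100:
                raise ValueError("exponent too large")
            return _BIN_OPS[type(node.op)](left, right)
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expr, mode="eval"))


print(safe_eval("25 * 4 + 10"))
```

Names, attribute access, and function calls never match a whitelisted node type, so an input such as `__import__('os')` raises `ValueError` instead of executing. Malformed input raises `SyntaxError` from `ast.parse`.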
Run these from the project root:
python -m compileall -q .
python -m pytest -q

- Multi-mode AI system with isolated history
- Semantic Router tool calling (weather, calculator, notes, system, browser, music, web reader)
- Deterministic scored routing (system/tool/web/LLM) + N-best clarification + context memory
- Structured tool outputs (`tool_data`) for UI/voice
- Voice assistant with 20+ command types and Smart Browser Control
- Premium UI with splash screen, 15+ themes, animations, microinteractions
- Confidence Gate scoring (0-100% evaluation metric)
- Smart web search + local LLM hybrid
- Movie mode with trailer, genres, recommendations
- Code blocks syntax highlighted with copy-to-clipboard button
- Rich markdown rendering (lists, headers, ratings)
- Ollama lazy reconnect (auto-recovers if started late)
- Thread-safe notes and SQLite connection management
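
For the thread-safety item, a common pattern for a JSON-backed note store is to guard every read-modify-write cycle with a lock. The `NoteStore` class below is a hypothetical sketch of that pattern, not the code in `tools/notes.py`.

```python
import json
import threading
from pathlib import Path


class NoteStore:
    """JSON-file note list where each operation holds a lock for its full duration."""

    def __init__(self, path):
        self._path = Path(path)
        self._lock = threading.Lock()

    def _read(self):
        # Caller must hold the lock.
        if not self._path.exists():
            return []
        return json.loads(self._path.read_text(encoding="utf-8"))

    def add(self, text):
        """Append a note and return the new note count."""
        with self._lock:
            notes = self._read()
            notes.append(text)
            self._path.write_text(json.dumps(notes), encoding="utf-8")
            return len(notes)

    def all(self):
        with self._lock:
            return self._read()
```

Holding the lock across both the read and the write prevents two concurrent `add` calls from overwriting each other's note. It only protects threads within one process, not multiple server processes sharing the same file.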
⚠️ Experimental: architecture locked for iteration
- Vision (Realtime camera): Webcam/screenshot analysis
- Long-Term Vector Memory: Cross-session preference/knowledge memory
- Autonomous Agents: Chained multi-tool workflows (search → summarize → save to notes)
This is an experimental project built for learning, research, and AI system design exploration. Not a commercial product.
