# Home
Aditya Agarwal edited this page Mar 13, 2026 · 1 revision
Welcome to the NanoChat Wiki. NanoChat is a lightweight, modern Android chat application built with Jetpack Compose and Material 3, designed to provide a unified interface for multiple AI inference backends.
- Project Vision
- Architecture Overview
- Inference Backends
- Data & Persistence
- Development Guide
- Roadmap
## Project Vision

NanoChat's core philosophy is: *use the smallest, fastest model that can answer the question*. It provides a single interface to interact with three distinct backends, switchable on the fly:
- Nano: Local Gemini Nano via Google AICore (zero cost, private).
- Model: Locally downloaded open-source LLMs (via MediaPipe).
- Remote: OpenAI-compatible cloud APIs (fallback for complexity).
Targeted at devices like the Pixel 9, NanoChat leverages on-device AI capabilities (AICore) to provide a "JARVIS-style" experience that works offline for everyday queries while offering cloud scaling when needed.
## Architecture Overview

NanoChat follows a clean MVVM architecture with a repository pattern to abstract inference logic.

```
┌──────────────────────────────────────────────────────────────────┐
│ ChatScreen ── FilterChip: [Nano] [Model] [Remote]                │
│                                                                  │
│ ChatViewModel                                                    │
│  └── ChatRepository                                              │
│       └── InferenceClient (interface)                            │
│            ├── LocalInferenceClient   ── AICore / Gemini Nano    │
│            ├── DownloadedModelClient  ── MediaPipe LLM           │
│            └── RemoteInferenceClient  ── OkHttp SSE stream       │
│                                                                  │
│ Room DB (ChatSession + ChatMessage + DownloadedModel)            │
│ DataStore (InferenceMode, Non-secret settings)                   │
│ EncryptedPrefs (API Keys, Secure Tokens)                         │
└──────────────────────────────────────────────────────────────────┘
```
- `InferenceClient`: The primary contract for all backends.
  - `isAvailable()`: Returns whether the backend can currently serve requests.
  - `streamChat(history, prompt)`: Returns a `Flow` of response text chunks.
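A minimal Kotlin sketch of what this contract could look like (the exact signatures in NanoChat may differ, and the `ChatTurn` type here is a stand-in, not the app's real history type):

```kotlin
import kotlinx.coroutines.flow.Flow

// Stand-in message type; NanoChat's actual history type may differ.
data class ChatTurn(val role: String, val content: String)

// Sketch of the InferenceClient contract described above.
interface InferenceClient {
    // Whether this backend can currently serve requests
    // (e.g. AICore enabled, model downloaded, API key set).
    suspend fun isAvailable(): Boolean

    // Streams the response as incremental text chunks.
    fun streamChat(history: List<ChatTurn>, prompt: String): Flow<String>
}
```

Each of the three backends implements this interface, so the repository can swap them without the UI noticing.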
## Inference Backends

### Nano

- Engine: Gemini Nano via `com.google.ai.edge.aicore`.
- Setup: Must be enabled in Developer Options → Gemini Nano.
- Characteristics: Foreground-only, subject to OS-level quotas, no cost, fully offline.
- Context: History is flattened into a single prompt (default window: last 10 turns).
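The flattening step can be sketched as a pure function; the role labels and separator below are assumptions for illustration, not NanoChat's exact format:

```kotlin
data class Turn(val role: String, val content: String)

// Flattens chat history plus the new prompt into one string for Gemini Nano,
// keeping only the most recent `window` turns (the default cited above is 10).
fun flattenHistory(history: List<Turn>, prompt: String, window: Int = 10): String =
    buildString {
        history.takeLast(window).forEach { appendLine("${it.role}: ${it.content}") }
        append("user: $prompt")
    }
```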
### Model

- Engine: MediaPipe LLM Inference (LiteRT).
- Storage: `.task` files stored in internal or external storage.
- Catalog: Support for Qwen2.5, DeepSeek-R1, Phi-2, and Gemma.
- Format: Uses ChatML-style tags (`<|user|>` / `<|assistant|>`).
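For illustration, a prompt using these tags might be assembled like this. The exact template each downloaded model expects varies, so this is a sketch built only from the tags named above:

```kotlin
data class Message(val role: String, val content: String)

// Renders a conversation with ChatML-style role tags. The trailing
// assistant tag cues the model to continue; real models may require
// a slightly different template (e.g. explicit end-of-turn markers).
fun toChatTemplate(messages: List<Message>): String =
    messages.joinToString(separator = "\n") { "<|${it.role}|>\n${it.content}" } +
        "\n<|assistant|>\n"
```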
### Remote

- Engine: OkHttp with SSE (Server-Sent Events).
- Compatibility: Any OpenAI-compatible endpoint.
- Features: Token-by-token streaming, configurable base URL and model name.
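Each SSE `data:` payload in an OpenAI-compatible stream carries one token delta. A naive sketch of extracting that delta is below; a real client would use a proper JSON parser, and the field layout assumed here is the standard OpenAI streaming shape, not necessarily NanoChat's exact code:

```kotlin
// Matches the "content" value inside a streamed delta, handling escapes naively.
val contentRegex = Regex("\"content\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"")

// Returns the text chunk carried by one SSE `data:` payload, or null for
// keep-alives and the terminal [DONE] sentinel.
fun extractDelta(sseData: String): String? {
    val payload = sseData.trim()
    if (payload.isEmpty() || payload == "[DONE]") return null
    return contentRegex.find(payload)?.groupValues?.get(1)
}
```

Emitting each non-null result into a `Flow<String>` yields the token-by-token streaming described above.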
## Data & Persistence

Room entities:

- `ChatSession`: Metadata for conversations (ID, title, timestamp).
- `ChatMessage`: Individual messages (role, content, timestamp, session ID).
- `DownloadedModel`: Tracks local model files and download progress.
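The entity shapes implied by the fields above, sketched as plain Kotlin data classes. In the app these would carry Room's `@Entity`/`@PrimaryKey` annotations, and the field names here are assumptions:

```kotlin
// In NanoChat these would be Room @Entity classes; names are assumptions.
data class ChatSession(
    val id: Long,
    val title: String,
    val timestamp: Long,
)

data class ChatMessage(
    val id: Long,
    val sessionId: Long,   // foreign key to ChatSession.id
    val role: String,      // "user" or "assistant"
    val content: String,
    val timestamp: Long,
)

data class DownloadedModel(
    val name: String,
    val filePath: String,
    val downloadProgress: Float,  // 0.0–1.0 while downloading
)
```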
To maintain security, settings are split:

- DataStore: Non-sensitive data like `InferenceMode`, `baseUrl`, and `modelName`.
- EncryptedSharedPreferences: Sensitive secrets like `apiKey` and `huggingFaceToken`.
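The split can be expressed as a simple routing rule. The helper below is hypothetical, but the key names come from the lists above:

```kotlin
enum class SettingsStore { DATA_STORE, ENCRYPTED_PREFS }

// Hypothetical helper: routes a settings key to the store it belongs in.
// Secrets go to EncryptedSharedPreferences; everything else to DataStore.
fun storeFor(key: String): SettingsStore = when (key) {
    "apiKey", "huggingFaceToken" -> SettingsStore.ENCRYPTED_PREFS
    else -> SettingsStore.DATA_STORE
}
```

Keeping the rule in one place makes it harder to accidentally write a secret into the plaintext store.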
## Development Guide

- UI: Jetpack Compose, Material 3 (Expressive)
- Language: Kotlin 2.1.10
- Build: Gradle 9.3.1 / AGP 9.1.0
- Minimum SDK: 31 (Android 12)
- Open in Android Studio Ladybug or newer.
- Ensure JDK 17 is configured.
- Use the Gradle wrapper:
  - POSIX: `./gradlew assembleDebug`
  - Windows: `.\gradlew.bat assembleDebug`
## Roadmap

- 3-tab shell: Chat, Models (stub), Settings.
- Room persistence for sessions and messages.
- Remote backend with SSE streaming.
- AICore (Nano) integration.
- Secure storage for API keys.
- MediaPipe LLM Inference integration.
- Model catalog and download manager.
- Storage management (Move to SD card).
- Enhanced error handling and retry logic.
- Voice input/output.
- Image/Multimodal support (Gemini Nano with Multimodality).
- UI Polish & Animations.