GitHub - wadealexc/kitsu: frontend/backend chat application on top of llama.cpp

kitsu

kitsu is a self-hosted chat application built on top of llama.cpp. It provides a web frontend and a backend API that manages llama.cpp processes, handles authentication, and runs tool calls.

Features:

Model management (sleep mode and wake-on-keystroke) — define multiple models in config.json; kitsu serves them on demand and swaps between them automatically. Idle models are shut down after a configurable timeout to save system resources. When a user starts typing on the frontend, the backend begins waking up the selected model.
Web search and page loading — uses Brave search + Playwright to give models access to the web
File uploads — supports PDF, JSON, text, and image uploads in chat

Important

This is a hobby project; use at your own risk. The codebase makes many assumptions that are specific to my personal setup:

OS: Ubuntu Server 24
GPU: NVIDIA GeForce RTX 5090
Preferred models: Qwen3.5-27B, Qwen3.5-35B-A3B

Setup

Clone, Install Dependencies, Build Llama.cpp

git clone --recurse-submodules https://github.com/wadealexc/kitsu
cd kitsu

# Install deps
npm run install:all

# Build llama.cpp
cd llama.cpp
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release

cd ..

See llama.cpp build docs for other configurations.

Configure

cp config.example.json config.json

Edit config.json — at minimum, set your model paths and ports. See Configuration below.

TODO: add a script to automate creation of config.json from config.example.json

Add models

Place GGUF files in the models/ directory. Subdirectories are supported:

models/
  my-model/
    model-q6.gguf
    mmproj.gguf     # optional, for vision models

Then reference them in config.json under models.models.

TODO: add a script that auto-updates config.json as models are added to models
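Until such a script exists, here is a rough sketch of the core step: turning a list of GGUF paths (relative to models/) into entries shaped like the ones in config.json. This is not part of kitsu. The function name `entriesFromPaths`, the rule that the alias comes from the parent directory, and the rule that files starting with `mmproj` are skipped are all assumptions made for illustration.

```typescript
// Hypothetical helper, not part of kitsu. Field names follow the "models"
// example in config.json; the naming rules below are assumptions.
interface ModelEntry {
  gguf: string;   // path relative to models/, without the .gguf extension (assumed)
  alias: string;  // name used to select the model
  args: string[]; // extra llama-server arguments
}

function entriesFromPaths(relativePaths: string[]): ModelEntry[] {
  return relativePaths
    // Only GGUF files are models.
    .filter((p) => p.endsWith(".gguf"))
    // Skip vision projector files; they accompany a model rather than define one.
    .filter((p) => !(p.split("/").pop() ?? "").startsWith("mmproj"))
    .map((p) => {
      const withoutExt = p.slice(0, -".gguf".length);
      const segments = withoutExt.split("/");
      const fileName = segments.pop() ?? withoutExt;
      // Assumption: the alias is the parent directory, or the file name
      // when the file sits directly in models/.
      const alias = segments.pop() ?? fileName;
      return { gguf: withoutExt, alias, args: [] };
    });
}

// Example: print entries for the directory layout shown above.
console.log(
  JSON.stringify(
    entriesFromPaths(["my-model/model-q6.gguf", "my-model/mmproj.gguf"]),
    null,
    4,
  ),
);
```

The printed array can be pasted into models.models. A full script would also need to walk the models/ directory and merge the result with any existing entries so that hand-written args are not lost.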

Run (Quick Start)

npm run dev             # backend
npm run dev:frontend    # frontend

Run (Prod)

Build the backend and install the systemd service:

npm run build
./install-service.sh
sudo systemctl enable --now kitsu

Start the frontend:

docker compose up -d

The frontend is available at http://localhost:5050.

Useful service commands:

systemctl status kitsu
journalctl -u kitsu -f -o cat
sudo systemctl stop kitsu
sudo systemctl restart kitsu

Configuration

kitsu is configured via config.json at the repo root. Copy config.example.json as a starting point.

models — define models and the models directory:

"models": {
    "path": "./models",
    "onStart": "my-model",
    "models": [
        {
            "gguf": "my-model/model-q6",
            "alias": "my-model",
            "args": ["--ctx-size", "32768"]
        }
    ]
}
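A quick sanity check on this block can catch typos before the backend starts. The sketch below is not kitsu's own config loader; the `validateModels` name is hypothetical, and it assumes that onStart refers to the alias of one of the listed models (the example above uses "my-model" for both).

```typescript
// Hypothetical validator; kitsu's own config loading may differ.
interface ModelEntry {
  gguf: string;
  alias: string;
  args: string[];
}

interface ModelsConfig {
  path: string;
  onStart: string;
  models: ModelEntry[];
}

// Returns a list of human-readable problems; an empty list means the
// block passed these (assumed) checks.
function validateModels(cfg: ModelsConfig): string[] {
  const errors: string[] = [];
  const seen = new Set<string>();
  for (const m of cfg.models) {
    if (seen.has(m.alias)) errors.push(`duplicate alias: ${m.alias}`);
    seen.add(m.alias);
  }
  // Assumption: onStart names the alias of the model to load at startup.
  if (!seen.has(cfg.onStart)) {
    errors.push(`onStart "${cfg.onStart}" matches no alias`);
  }
  return errors;
}

// Example: onStart does not match any alias, so one problem is reported.
console.log(
  validateModels({
    path: "./models",
    onStart: "other-model",
    models: [
      { gguf: "my-model/model-q6", alias: "my-model", args: ["--ctx-size", "32768"] },
    ],
  }),
);
```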

ports — configure host/port for llama-server and the backend:

"ports": {
    "llamaCpp": { "port": 8070, "host": "0.0.0.0" },
    "backend":  { "port": 8071, "host": "0.0.0.0" }
}
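Note that 0.0.0.0 is a bind address (listen on all interfaces), not an address a client connects to. The sketch below shows one way to check from the same machine whether llama-server is currently up, using llama-server's /health endpoint. The helper names `healthUrl` and `isUp` are hypothetical, and the sketch assumes the llama-server started by kitsu listens on the llamaCpp port. It needs Node 18 or later for the built-in fetch.

```typescript
interface Endpoint {
  host: string;
  port: number;
}

// 0.0.0.0 means "listen on all interfaces", so a local client
// connects to loopback instead.
function healthUrl(ep: Endpoint): string {
  const host = ep.host === "0.0.0.0" ? "127.0.0.1" : ep.host;
  return `http://${host}:${ep.port}/health`;
}

// Resolves to true if the server answers with a 2xx status.
async function isUp(ep: Endpoint): Promise<boolean> {
  try {
    const res = await fetch(healthUrl(ep));
    return res.ok;
  } catch {
    // Connection refused or similar, e.g. no model is currently running.
    return false;
  }
}

console.log(healthUrl({ host: "0.0.0.0", port: 8070 }));
```

Calling `isUp({ host: "0.0.0.0", port: 8070 })` returns false rather than throwing when nothing is listening, which is the expected state while the model is asleep.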

llamaCpp — sleep timeout:

"llamaCpp": {
    "sleepAfterXSeconds": 600
}
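To illustrate what a sleep timeout like this implies, here is a minimal sketch of an idle timer that is reset on every request (or keystroke) and fires once the model has been idle for the configured number of seconds. This is not kitsu's actual implementation; the `IdleSleeper` class and its method names are hypothetical.

```typescript
// Hypothetical sketch of sleep-on-idle; not kitsu's actual implementation.
class IdleSleeper {
  private timer: ReturnType<typeof setTimeout> | undefined = undefined;
  private readonly sleepAfterSeconds: number;
  private readonly onSleep: () => void;

  constructor(sleepAfterSeconds: number, onSleep: () => void) {
    this.sleepAfterSeconds = sleepAfterSeconds;
    this.onSleep = onSleep;
  }

  // Call on every request (or keystroke) to postpone the shutdown.
  touch(): void {
    if (this.timer !== undefined) clearTimeout(this.timer);
    this.timer = setTimeout(() => {
      this.timer = undefined;
      this.onSleep();
    }, this.sleepAfterSeconds * 1000);
  }

  // Cancel any pending shutdown, e.g. when the backend itself exits.
  stop(): void {
    if (this.timer !== undefined) clearTimeout(this.timer);
    this.timer = undefined;
  }
}
```

With the value above, `new IdleSleeper(600, stopModel)` would call a (hypothetical) `stopModel` function ten minutes after the last `touch()`. Each `touch()` replaces the previous timer, so repeated activity never triggers more than one shutdown.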

Web

web — enable web search and page loading (requires a Brave API key):

"web": {
    "enable": true,
    "braveAPIKey": "YOUR_API_KEY",
    "runDangerouslyWithoutSandbox": false
}

If you want to try this feature quickly to see whether setting up the sandbox is worth it, you can set runDangerouslyWithoutSandbox: true in config.json, and it should just work. Keep in mind that this tells Playwright to start Chromium with --no-sandbox, which disables the browser's process sandbox: a malicious page that exploits a browser bug could then run code on your system with the permissions of the user running kitsu. If you do this, use it only for a quick trial run!

To use this feature safely, you need a working Chromium sandbox. The setup varies by OS and may require some troubleshooting if things don't work perfectly. I'm just going to detail the steps I took; sorry this isn't super user friendly!

Since I'm on Ubuntu, I followed the steps in the Puppeteer docs (troubleshooting/#using-setuid-sandbox). They work for Playwright as well:

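# Find Playwright's Chromium version
cd ~/.cache/ms-playwright && ls
# You should see a folder named `chromium-<some version number>` (for example `chromium-1208`).
# Substitute your folder name in the next command.
cd chromium-1208/chrome-linux64
# Make the sandbox helper root-owned and setuid so Chromium can use it
sudo chown root:root chrome_sandbox
sudo chmod 4755 chrome_sandbox
# Copy it to a stable location, preserving owner and mode
sudo cp -p chrome_sandbox /usr/local/sbin/chrome-devel-sandbox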
