Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
6391429
Add Nim CLI port with curl-based resumable model downloads
corv89 Apr 5, 2026
b925468
Add .gitignore for build/ and BUILD.md with build instructions
corv89 Apr 5, 2026
b892790
Fix SHA256 checksum mismatch: normalize hex to lowercase
corv89 Apr 5, 2026
e4988cd
Fix nimble build: use modern task syntax and add nimcrypto dependency
corv89 Apr 5, 2026
b0ca0a5
Fix bookmark import: use re.find instead of re.match for substring regex
corv89 Apr 5, 2026
ac7bafb
Enable SSL support in build tasks and nim.cfg
corv89 Apr 5, 2026
802b05f
Switch runtime from llama-server to ollama
corv89 Apr 5, 2026
f15f488
Remove SSL dependency: all traffic is localhost to ollama
corv89 Apr 5, 2026
853c92a
Fix string interpolation, URL concat, and remove ollama lifecycle man…
corv89 Apr 5, 2026
5c03824
Handle qwen3.5 thinking mode and improve JSON extraction
corv89 Apr 5, 2026
5428cb9
Add verbose request logging and set qwen3.5-2b as default model
corv89 Apr 5, 2026
c859b62
Fix undoLastBatch clearing all bookmarks and remove unused variables
corv89 Apr 5, 2026
1b6cef6
Improve small model (0.8b) JSON schema adherence with lightweight con…
corv89 Apr 5, 2026
c01bf72
Add parallel LLM requests (concurrency=4) and increase default batch …
corv89 Apr 5, 2026
d4b03ca
Remove redundant deps install step and unnecessary installDirs from n…
corv89 Apr 5, 2026
8adc5d9
Fix 9 bugs, remove dead code, and extract shared utilities across 9 f…
corv89 Apr 5, 2026
aa12b49
Add CI and release workflows for Linux and macOS (x86_64 + arm64)
corv89 Apr 5, 2026
d2a080f
Add dedup and check-links subcommands for bookmark cleanup
corv89 Apr 5, 2026
fe4e6e7
Fix loadConfig ignoring config.toml written by model-set
corv89 Apr 6, 2026
83ac103
Switch to Ollama native /api/chat endpoint with constrained decoding
corv89 Apr 6, 2026
c5ae8c4
Add export subcommand for Netscape HTML bookmark output
corv89 Apr 6, 2026
6b51c4f
Add AGENTS.md with Nim dev guide, pipeline docs, and hard-won pitfalls
corv89 Apr 6, 2026
c284c93
Add curl to runtime dependencies in BUILD.md
corv89 Apr 6, 2026
d4f5fbe
Remove dead small-model schema procs and consolidate chat response pa…
corv89 Apr 6, 2026
19d6dd1
Fix progress bar stall by adding poll() calls in concurrent classific…
corv89 Apr 6, 2026
9700100
Add resume support: cluster cache and per-batch auto-commit in Phase 2
corv89 Apr 6, 2026
3a32076
Fix Phase 2 deadlock, add progress indicator, and handle Ctrl+C cleanup
corv89 Apr 7, 2026
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
52 changes: 52 additions & 0 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,52 @@
name: CI

on:
push:
branches: [main]
pull_request:
branches: [main]

jobs:
build:
name: ${{ matrix.os }}-${{ matrix.arch }}
runs-on: ${{ matrix.runner }}
strategy:
fail-fast: false
matrix:
include:
- os: linux
arch: x86_64
runner: ubuntu-24.04
- os: linux
arch: arm64
runner: ubuntu-24.04-arm
- os: macos
arch: x86_64
runner: macos-13
- os: macos
arch: arm64
runner: macos-14
steps:
- uses: actions/checkout@v4

- name: Install Nim
uses: jiro4989/setup-nim-action@v2
with:
nim-version: stable

- name: Install system deps (Linux)
if: runner.os == 'Linux'
run: sudo apt-get update && sudo apt-get install -y libsqlite3-dev

- name: Build
run: nimble release

- name: Smoke test
run: |
./build/lazybookmarks --help
./build/lazybookmarks status

- uses: actions/upload-artifact@v4
with:
name: lazybookmarks-${{ matrix.os }}-${{ matrix.arch }}
path: build/lazybookmarks
73 changes: 73 additions & 0 deletions .github/workflows/release.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,73 @@
name: Release

on:
push:
tags: ["v*"]

permissions:
contents: write

jobs:
build:
name: ${{ matrix.os }}-${{ matrix.arch }}
runs-on: ${{ matrix.runner }}
strategy:
fail-fast: false
matrix:
include:
- os: linux
arch: x86_64
runner: ubuntu-24.04
- os: linux
arch: arm64
runner: ubuntu-24.04-arm
- os: macos
arch: x86_64
runner: macos-13
- os: macos
arch: arm64
runner: macos-14
steps:
- uses: actions/checkout@v4

- name: Install Nim
uses: jiro4989/setup-nim-action@v2
with:
nim-version: stable

- name: Install system deps (Linux)
if: runner.os == 'Linux'
run: sudo apt-get update && sudo apt-get install -y libsqlite3-dev

- name: Build
run: nimble release

- name: Package
run: |
mkdir -p dist
tar -czf "dist/lazybookmarks-${{ matrix.os }}-${{ matrix.arch }}.tar.gz" -C build lazybookmarks
( cd dist && shasum -a 256 "lazybookmarks-${{ matrix.os }}-${{ matrix.arch }}.tar.gz" > "lazybookmarks-${{ matrix.os }}-${{ matrix.arch }}.tar.gz.sha256" )

- uses: actions/upload-artifact@v4
with:
name: lazybookmarks-${{ matrix.os }}-${{ matrix.arch }}
path: dist/*

release:
name: Publish
needs: build
runs-on: ubuntu-24.04
steps:
- uses: actions/download-artifact@v4
with:
path: dist
merge-multiple: true

- name: Generate SHA256SUMS
run: ( cd dist && cat *.sha256 > SHA256SUMS )

- name: Create release
uses: softprops/action-gh-release@v2
with:
generate_release_notes: true
files: dist/*
4 changes: 4 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,2 +1,6 @@
*.gguf
/*.html
/build/
/dist/
/lazybookmarks/
/.opencode/plans
141 changes: 141 additions & 0 deletions AGENTS.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,141 @@
# AGENTS.md — Lazybookmarks Development Guide

## Project Overview
- Chrome extension (AI-powered bookmark organizer using Gemini Nano) ported to standalone Nim CLI
- Target: Linux arm64, dev on macOS arm64
- Dependencies: cligen, db_connector, jsony (only 3)
- Build: `nimble release` (NOT `nimble build`)

## Build & Toolchain
- Nim 2.2.8 via Homebrew
- `nimble release` outputs to `build/lazybookmarks`
- Do NOT use `nimble build` — it ignores custom tasks
- `nim.cfg`: --opt:size, --mm:orc, NO -d:ssl
- `nimble release` auto-resolves dependencies

## LLM Backend
- Default: Ollama native `/api/chat` (constrained decoding via `format` param)
- OpenAI-compatible `/v1/chat/completions` fallback when `runtimeManaged=false` (LLM_URL set)
- `format` param = grammar-based constrained decoding (model physically cannot generate invalid tokens)
- `"options": {"think": false}` suppresses qwen3.5 thinking mode
- Model lineup: qwen3.5:0.8b, qwen3.5:2b (default), qwen3.5:4b, gemma4:e2b
- Model managed via `ollama pull`, not custom download code

## Architecture
- Config priority: CLI > env vars > config.toml > defaults
- `runtimeManaged` flag: true = Ollama (native endpoints), false = custom LLM_URL (OpenAI endpoints)
- Link checking: `curl` + `xargs -P` (Nim SSL broken with OpenSSL 3.6+)
- `__skip__` handling: bookmarks classified as `__skip__` remain `organised_at = NULL`

## 3-Phase Pipeline Internals

The organize command (`organizer.nim`) runs a 3-phase pipeline to classify bookmarks into folders.

### Phase 1: Taxonomy Analysis (`runTaxonomyPhase`)
- **Input:** All folders with their bookmarks, enriched with TF-IDF keywords, domain patterns, and exemplar bookmarks
- **Process:** Single LLM call asking the model to describe each folder and provide keywords
- **Schema:** `TaxonomySchemaJson` — array of `{folderId, folderPath, description, keywords[]}`
- **Caching:** Results keyed by a fingerprint of folder UUIDs + bookmark counts (`buildFingerprint`). Cache stored in `taxonomy_cache` table. Survives across runs unless folder structure changes.
- **Key helpers:** `computeTFIDF` (term frequency-inverse document frequency per folder), `extractDomainPatterns` (top domains per folder above 20% threshold), `sampleExemplars` (2 most recent bookmark titles/urls per folder)

### Phase 1.5: Cluster/Theme Grouping (`runClusterPhase`)
- **Input:** All unorganized bookmarks, existing taxonomy categories, root-level folders
- **Process:** Single LLM call to identify 2-6 thematic groups among unorganized bookmarks that deserve a new folder
- **Schema:** `buildClusterSchemaJson(rootFolderIds)` — array of `{name, description, keywords[], parentFolderId}`. The `parentFolderId` is constrained to root folder UUIDs via JSON enum.
- **Output:** `seq[ClusterSuggestion]` — these become synthetic folders prefixed with `__new_` (e.g., `__new_Hardware`) in the taxonomy for Phase 2

### Phase 2: Per-Bookmark Classification (`runClassificationPhase`)
- **Input:** Unorganized bookmarks (chunked into batches), full taxonomy (original + new cluster folders)
- **Process:** For each batch, calls `pruneTaxonomy` to reduce the folder list to the most relevant ~15 folders (based on keyword overlap with the batch's titles via TF-IDF), then asks the LLM to classify each bookmark
- **Schema:** `buildClassificationSchemaJson(folderIds, bookmarkIds)` — array of `{bookmarkId, targetFolderId, confidence, reason}`. Both `bookmarkId` and `targetFolderId` are constrained to exact IDs via JSON enum, plus `"__skip__"` as a valid target.
- **Concurrency:** Uses `classifyBatchAsync` with sliding window — up to `concurrency` (default 4) batches in flight simultaneously via `AsyncHttpClient`. Batch size auto-set by model size (5 for small, 10 for normal).
- **`pruneTaxonomy`:** Scores each taxonomy category by how many of its TF-IDF keywords appear in the batch's token set. Keeps top N (max 15, min 5). This prevents overwhelming small models with 30+ folder options.
- **Bookmarks classified as `__skip__`** are silently dropped (not applied). Low-confidence matches can be reviewed interactively or auto-skipped.

### Data Flow
1. `organizeBookmarks` loads all bookmarks, builds `folderBookmarks` table mapping folder UUID → bookmark entries
2. Phase 1 enriches folders with TF-IDF/domain/exemplar data, runs LLM, caches result
3. Phase 1.5 takes unorganized bookmarks + taxonomy + root folders, suggests new folders
4. Phase 2 merges new folders into taxonomy, chunks unorganized bookmarks, classifies each batch with pruned taxonomy
5. Results become `seq[Suggestion]` with `bookmarkId`, `targetFolderId`, `targetFolderPath`, `confidence`, `reason`, `isNewFolder`
6. Suggestions are applied via `applyClassification` (sets `category`, `confidence`, `reason`, `organised_at` on the bookmark row)

### Key Types
- `TaxonomyCategory`: folderId, folderPath, description, keywords
- `ClusterSuggestion`: name, description, keywords, parentFolderId
- `Classification`: bookmarkId, targetFolderId, confidence, reason
- `Suggestion`: bookmarkId, bookmarkTitle, bookmarkUrl, targetFolderId, targetFolderPath, confidence, reason, isNewFolder

## Nim Language Pitfalls (The Hard-Won Lessons)

### Syntax
- `.[^1]` is Python, not Nim — use `seq[seq.len - 1]`
- `mapIt` cannot have multi-line blocks — use explicit `for` loops
- `=>` lambda syntax doesn't exist in Nim
- Anonymous tuple fields can't be accessed by name (`.score`) — use `[0]`, `[1]`
- `findIt` on seqs returns `int` (index), not the element — use `seq[index]`
- `{}` set literals only support values 0..255 — HTTP codes 302/404/410 must use `==` chains
- `re.match` requires full-string match — use `re.find` for substring matching
- `&"..."` format strings require `strformat` import
- Variable names can't conflict with keywords (e.g., `file` parameter)

### Standard Library
- `std/terminal` has `hideCursor`/`showCursor` templates that conflict with custom procs
- `postContent` doesn't take a `headers` param — set `client.headers` before calling
- `HttpClient` has no `onProgress` field
- `filterIt`/`mapIt` are in `std/sequtils`, NOT `std/sugar` in Nim 2.2.8
- `split()` requires `strutils` import
- `parseFloat` requires `strutils` import
- `sort()` requires `std/algorithm` import
- `sum()` doesn't exist as a standalone proc — use manual loop with `.inc`
- `sleep` → `os.sleep` or `execShellCmd`
- `execShellCmd` is in `std/os`, not `std/osproc`
- `/` operator is for filesystem paths, not URL concatenation — use `&`
- `reversed()` doesn't exist — use manual reverse loop with index
- `rfind` doesn't exist — use manual loop or reverse approach
- `chunk` doesn't exist — implement manually
- `formatFloat` uses `precision` not `ffDecimal` named param

### Async
- `AsyncHttpClient` is inside `std/httpclient`, no built-in timeout
- `withTimeout(fut, ms)` returns `Future[bool]` — true if completed
- `one()` doesn't exist — use polling with `sleepAsync` + `.finished`
- Async macro can't capture `var` parameters — use return values
- `pump()` closures capturing locals from enclosing procs violate borrow checker — inline
- `std/channels` doesn't exist in 2.2.8; `threadpool` is deprecated
- `waitFor()` requires `std/asyncdispatch`

### JSON & Data
- `HttpHeaders` doesn't have 3-arg `get(key, default)` — use `hasKey` + `[]`
- `HttpCode` is `range[0..255]` — can't use in `{}`
- `toHex` from `nimcrypto/utils` returns uppercase — `toLowerAscii()` for comparisons

### Build System
- `nimble build` always runs its own default build via `bin` field — custom `build` tasks are ignored
- `self.exec` is old nimble syntax — use just `exec`
- `--out:path` doesn't work in `nim.cfg` — only `--outdir:dir`
- Circular imports cause "undeclared identifier" — restructure to avoid cycles
- Forward declarations needed for procs called before their definition
- Inline `if` in string concatenation within proc call arguments doesn't work — extract to `let` binding

### SSL
- Nim 2.2.8 SSL bindings are incompatible with OpenSSL 3.6+
- Do NOT add `-d:ssl` — use `curl` via `execShellCmd` for HTTPS

## Code Conventions
- No comments unless asked
- Imports grouped: std/*, then local ./* modules
- Procs use `*` export marker for public API
- CLI subcommands use `cmdXxx` naming, flat names in dispatchMulti (e.g., `model-list`)

## Testing
- Manual e2e testing with real bookmark data (692 bookmarks)
- `./build/lazybookmarks <command> --verbose` for debug output
- Smoke test: `./build/lazybookmarks status` / `./build/lazybookmarks --help`

## Ollama API Reference (Native Endpoints)
- Chat: `POST /api/chat` with `format` for structured output, `stream: false`
- Models: `GET /api/tags`
- Health: `GET /api/tags` (200 = running)
- Pull: `ollama pull <model>:<tag>` (CLI, not API)
- Response: `{"message": {"content": "..."}}` (not OpenAI's `choices[0].message.content`)
43 changes: 43 additions & 0 deletions BUILD.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,43 @@
# Building lazybookmarks

## Prerequisites

- **Nim** >= 2.0.0 — https://nim-lang.org/install.html
- **Ollama** — https://ollama.com/download (runtime dependency, not build-time)
- **curl** — required at runtime for `check-links` (link health checking)

### macOS

```sh
brew install nim ollama curl
```

### Ubuntu/Debian

```sh
sudo apt install nim curl
curl -fsSL https://ollama.com/install.sh | sh
```

### Arch Linux

```sh
sudo pacman -S nim curl
yay -S ollama-cuda # or ollama-rocm for AMD
```

## Build

```sh
nimble release
```

The binary will be at `build/lazybookmarks`.

## Cross-compiling for Linux (from macOS)

Install a Linux cross-compiler, then:

```sh
nim c -d:release --os:linux --cpu:arm64 -o:build/lazybookmarks-linux-arm64 src/lazybookmarks/main.nim
```
24 changes: 24 additions & 0 deletions assets/models.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
{
"entries": [
{
"name": "qwen3.5-0.8b",
"ollamaModel": "qwen3.5",
"ollamaTag": "0.8b"
},
{
"name": "qwen3.5-2b",
"ollamaModel": "qwen3.5",
"ollamaTag": "2b"
},
{
"name": "qwen3.5-4b",
"ollamaModel": "qwen3.5",
"ollamaTag": "4b"
},
{
"name": "gemma4-e2b",
"ollamaModel": "gemma4",
"ollamaTag": "e2b"
}
]
}
23 changes: 23 additions & 0 deletions lazybookmarks.nimble
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
# Package

version = "0.1.0"
author = "corv89"
description = "CLI bookmark organizer powered by local LLM"
license = "MIT"
srcDir = "src"
bin = @["lazybookmarks/main"]

# Dependencies

requires "nim >= 2.0.0"
requires "cligen >= 1.6"
requires "db_connector >= 0.1"
requires "jsony >= 1.1"

# Tasks

task release, "Build release binary to build/":
exec "nim c -d:release -o:build/lazybookmarks src/lazybookmarks/main.nim"

task debug, "Build debug binary to build/":
exec "nim c -o:build/lazybookmarks src/lazybookmarks/main.nim"
2 changes: 2 additions & 0 deletions nim.cfg
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
--opt:size
--mm:orc
18 changes: 18 additions & 0 deletions src/lazybookmarks/bootstrap.nim
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
import ./config
import ./model
import ./runtime
import ./ui

proc ensureReady*(cfg: var Config, registry: ModelRegistry) =
if not cfg.runtimeManaged:
return

requireRuntime(cfg)

let entry = findModel(registry, cfg.modelVariant)
cfg.modelName = ollamaRef(entry)

if not isEntryReady(entry, cfg):
pullModel(entry)
else:
infoMsg "Model ready: " & ollamaRef(entry)
Loading