
OpenClaw Gateway as alternative backend (memory, tools, multi-model) #30

@kokoima

Description

Summary

Clicky is amazing as a screen-aware voice companion — the cursor overlay, push-to-talk, and POINT system are genuinely next-level UX. But right now it's a stateless Claude wrapper: no memory between sessions, no tools, no persistent context.

I'd like to propose adding OpenClaw (https://github.com/openclaw/openclaw) Gateway as an optional alternative backend — so Clicky can talk to a full personal AI agent instead of vanilla Claude.

What is OpenClaw?

OpenClaw is an open-source (MIT) self-hosted AI gateway that connects agents to messaging surfaces (WhatsApp, Telegram, Slack, Discord, etc.). It provides:
• Persistent memory — workspace files, conversation history across sessions
• Tool use — browser control, code execution, file system, git, cron jobs
• Multi-model — Claude, GPT, Gemini, and others via a single gateway
• Built-in TTS — ElevenLabs already integrated (reusable for Clicky)
• Skills system — extensible agent capabilities

It exposes a WebSocket Protocol (v3) that native apps already use (macOS menu bar app, iOS/Android nodes).

Proposed Architecture

Clicky (Swift) ←→ WS Protocol v3 ←→ OpenClaw Gateway ←→ Agent (memory + tools + skills)

The key idea: replace the Cloudflare Worker proxy with a direct WebSocket connection to OpenClaw Gateway. The Worker becomes optional (standalone fallback mode).
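To make this concrete, here is a minimal sketch of what the Gateway connection could look like from Swift, built on `URLSessionWebSocketTask`. The frame shapes shown (the auth frame, the `chat.send` envelope) are my assumptions for illustration, not the real Protocol v3 spec; they would be replaced by whatever handshake and envelopes the Gateway actually defines.

```swift
import Foundation

/// Sketch of a Gateway connection, assuming a JSON-over-WebSocket protocol.
/// The "auth" and "chat.send" message shapes are placeholders.
final class GatewayConnection {
    private var task: URLSessionWebSocketTask?

    func connect(to url: URL, token: String) {
        task = URLSession.shared.webSocketTask(with: url)
        task?.resume()
        receiveLoop()
        // Hypothetical auth frame; the real handshake is challenge/response.
        send(json: ["type": "auth", "token": token])
    }

    func sendChat(_ text: String) {
        send(json: ["type": "chat.send", "message": text])
    }

    private func send(json: [String: Any]) {
        guard let data = try? JSONSerialization.data(withJSONObject: json),
              let string = String(data: data, encoding: .utf8) else { return }
        task?.send(.string(string)) { error in
            if let error { print("send failed: \(error)") }
        }
    }

    private func receiveLoop() {
        task?.receive { [weak self] result in
            if case .success(let message) = result {
                if case .string(let text) = message {
                    print("event: \(text)")  // streamed agent events land here
                }
                self?.receiveLoop()  // re-arm: receive delivers one message per call
            }
        }
    }
}
```

Note the re-armed `receive` loop: `URLSessionWebSocketTask.receive` delivers a single message per call, so event streaming means calling it again after each message.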

Changes in Clicky:

  1. New OpenClawClient.swift — WebSocket client implementing Gateway Protocol v3 (connect, challenge/auth, chat.send, event streaming)
  2. Refactor CompanionManager.swift — Provider pattern so users can choose between standalone mode (current Worker proxy, remains default) and OpenClaw mode (Gateway WebSocket)
  3. TTS via Gateway — tts.convert / talk.speak RPC instead of direct ElevenLabs proxy
  4. Settings panel — Gateway URL + auth token configuration
  5. Screenshots as attachments — sent via chat.send (Gateway already handles image attachments)
  6. POINT parsing unchanged — the [POINT:x,y:label] system stays exactly as-is
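The provider split in (2) could be sketched roughly like this. All type names here (`CompanionProvider`, `StandaloneProvider`, `OpenClawProvider`) are hypothetical placeholders, not existing Clicky types:

```swift
import Foundation

/// Sketch of the proposed backend abstraction: one protocol, two providers.
protocol CompanionProvider {
    /// Sends a user turn (plus optional screenshot) and streams reply chunks.
    func send(_ text: String, screenshot: Data?, onChunk: @escaping (String) -> Void)
}

/// Current behavior: Cloudflare Worker proxy (remains the default).
struct StandaloneProvider: CompanionProvider {
    func send(_ text: String, screenshot: Data?, onChunk: @escaping (String) -> Void) {
        // POST to the Worker proxy exactly as today...
    }
}

/// New behavior: chat.send over the Gateway WebSocket.
struct OpenClawProvider: CompanionProvider {
    func send(_ text: String, screenshot: Data?, onChunk: @escaping (String) -> Void) {
        // chat.send over WS; the screenshot rides along as an attachment...
    }
}
```

`CompanionManager` would own a single `CompanionProvider` chosen in Settings, and the downstream `[POINT:x,y:label]` parser consumes the streamed chunks identically in both modes, which is what keeps item (6) true.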

What stays the same:
• All UI/UX — overlay, blue cursor, POINT animations, push-to-talk
• AssemblyAI transcription
• Standalone mode works exactly as today (zero breaking changes)
• macOS-only scope

What users get:
• Memory: in-memory → persistent workspace
• Tools: none → browser, exec, git, cron, etc.
• Context: conversation only → full agent context
• Model: Claude only → any model via Gateway
• API keys: 3 keys in Worker → centralized in Gateway
• Cross-device: no → same session from phone/desktop/web

Implementation plan:
Happy to implement this as a PR: roughly ~500 lines for the WS client, ~100 for the CompanionManager refactor, and ~150 for the Settings UI. Standalone mode remains the default.

Questions for maintainers:

  1. Is a provider/backend abstraction pattern welcome?
  2. Any preferences on Settings UI?
  3. Should this be behind a feature flag initially?

Happy to discuss before writing code. I have a detailed architecture doc ready if useful.
