Human AI Eyes (pronounced "Human Eye") — a shared browser for human-AI collaboration.
AI browser tools are invisible. The agent navigates a headless browser you can't see, gets stuck on a captcha, and sits there burning tokens until it times out. There's no way to watch what it's doing, no way to help, and definitely no way to teach it where to click next time.
HumanAIE fixes that. It runs a headless Chromium instance and streams the viewport to a retro Netscape Navigator-themed UI in your browser. You watch the AI work in real-time. When it hits a captcha or can't find a button, it pings you -- notification chime, browser notification, yellow banner. You click through the captcha, the AI reads the coordinates and keeps going. No copy-pasting screenshots, no back-and-forth describing what's on screen.
The highlight-to-teach system is what makes this different from everything else out there. You highlight an element on the page, and HumanAIE logs those coordinates tied to the URL and a label. Next time the AI visits that page, it already knows where the login button is. You're building a spatial memory for your agent just by clicking.
git clone https://github.com/datboip/HumanAIE.git
cd HumanAIE
npm install && npx playwright install chromium
npm startOpen http://localhost:3333/cam/ and you're live.
docker build -t humanaie .
docker run -p 3333:3333 humanaieOne-liner for fresh machines:
curl -fsSL https://raw.githubusercontent.com/datboip/HumanAIE/main/install.sh | bashAI Agent <--> REST API <--> HumanAIE Server <--> Headless Browser
|
Cam UI (you watch here)
|
Highlight --> AI learns coordinates
- HumanAIE starts a headless Chromium and an Express server on port 3333
- AI agent controls the browser via REST API (navigate, click, type, scroll)
- You watch at
/cam/-- real-time MJPEG stream or frame polling - When the AI gets stuck, it calls
/waitfor-highlightand you get a notification - You click/highlight what it needs, the AI reads the coordinates and continues
- Highlights are logged per-URL so the AI remembers for next time
Browser Controls
- Back, forward, navigate to any URL
- Configurable viewport resolution
AI Interaction
- Click, scroll, type, key press at coordinates
- Hover, fill form fields
- Wait for selectors
Highlight-to-Teach
- Human highlights elements on the page
- Coordinates + label + URL stored automatically
- AI queries highlight history to find known elements
- Builds spatial memory over time -- genuinely new, no other tool has this
Session Recording
- Record browser sessions as MP4 video
- Download, rename, and trim recordings
Notifications
- Audio chime when AI needs help
- Browser push notifications
- Yellow banner overlay in the Cam UI
Takeover / Release
- Human takes full control of the browser
- Release back to AI when done
- Seamless handoff, no restart needed
Tabs
- Create, switch, close tabs
- Full tab list with active indicator
History
- Browsing history with timestamps
- Frequent sites quick-launch
- Clear history
Streaming
- MJPEG stream (
/stream) - Single frame JPEG (
/frame.jpg) - Screenshot PNG (
/screenshot) - SSE event stream (
/events)
| Method | Endpoint | Body | Description |
|---|---|---|---|
| POST | /navigate |
{ "url": "https://..." } |
Navigate to URL |
| POST | /back |
-- | Go back |
| POST | /forward |
-- | Go forward |
| GET | /refresh |
-- | Get current frame as screenshot JSON |
| Method | Endpoint | Body | Description |
|---|---|---|---|
| POST | /live/click |
{ "x": 100, "y": 200 } |
Click at coordinates |
| POST | /live/type |
{ "text": "hello" } |
Type text |
| POST | /live/scroll |
{ "direction": "down", "amount": 300 } |
Scroll by direction + amount |
| POST | /live/key |
{ "key": "Enter" } |
Press a key |
| POST | /type |
{ "text": "hello" } |
Type text (agent API) |
| POST | /hover |
{ "x": 100, "y": 200 } |
Hover at coordinates |
| POST | /scroll |
{ "deltaY": -300 } |
Scroll vertically (agent API) |
| POST | /key |
{ "key": "Enter" } |
Press key (agent API) |
| POST | /fill |
{ "selector": "#email", "value": "a@b.com" } |
Fill a form field |
| POST | /wait |
{ "selector": ".loaded" } |
Wait for selector |
| Method | Endpoint | Description |
|---|---|---|
| GET | /frame.jpg |
Current frame as JPEG |
| GET | /stream |
MJPEG stream |
| GET | /screenshot |
Full-page PNG screenshot |
| GET | /events |
SSE event stream (actions, status changes) |
| GET | /live/status |
Current live status (banner text, cursor pos) |
| Method | Endpoint | Body | Description |
|---|---|---|---|
| GET | /tabs |
-- | List all tabs |
| POST | /tabs/new |
{ "url": "https://..." } |
Open new tab |
| POST | /tabs/switch |
{ "id": 2 } |
Switch to tab |
| DELETE | /tabs/:id |
-- | Close tab |
| Method | Endpoint | Body | Description |
|---|---|---|---|
| POST | /highlight |
{ "x", "y", "label", "url" } |
Save a highlight |
| GET | /highlights |
-- | Get active highlights |
| DELETE | /highlights |
-- | Clear highlights |
| GET | /highlight-history |
?url=...&label=... |
Search saved highlights by URL or label |
| Method | Endpoint | Body | Description |
|---|---|---|---|
| POST | /waitfor-highlight |
{ "message": "Click the login button" } |
Ask human to highlight something |
| GET | /waitfor-highlight/status |
-- | Poll for human response |
| POST | /waitfor-highlight/done |
-- | Human submits highlight |
| Method | Endpoint | Body | Description |
|---|---|---|---|
| POST | /takeover |
-- | Human takes browser control |
| POST | /release |
-- | Release control back to AI |
| POST | /resize |
{ "width": 1280, "height": 720 } |
Resize viewport |
| GET | /viewport-size |
-- | Get current viewport dimensions |
| Method | Endpoint | Description |
|---|---|---|
| GET | /history |
Browsing history |
| GET | /history/frequent |
Most visited sites |
| DELETE | /history |
Clear history |
| Method | Endpoint | Description |
|---|---|---|
| POST | /record/start |
Start recording frames |
| POST | /record/stop |
Stop recording |
| GET | /record/status |
Recording status |
| GET | /sessions |
List recorded sessions |
| GET | /sessions/:id/mp4 |
Download session as MP4 |
| PATCH | /sessions/:id/rename |
Rename a session |
| POST | /sessions/:id/edit |
Edit session (trim, speed, crop) |
| DELETE | /sessions/:id |
Delete a session recording |
| Method | Endpoint | Description |
|---|---|---|
| GET | /version |
Server version info |
Add this to your AI agent's system prompt or tool instructions:
You have access to a shared browser via HumanAIE at http://localhost:3333.
To browse the web:
1. POST /navigate with {"url": "https://example.com"}
2. GET /screenshot to see the page
3. POST /live/click with {"x": 100, "y": 200} to click
4. POST /live/click with {"x": 300, "y": 150} to focus a field, then POST /type with {"text": "search query"} to type
When you can't find an element or hit a captcha:
1. POST /waitfor-highlight with {"message": "Please click the submit button"}
2. Poll GET /waitfor-highlight/status until the human responds
3. Use the returned coordinates to continue
Check highlight history before asking the human:
- GET /highlight-history?url=https://example.com
- If coordinates exist for this URL, use them directly
The human can see everything you do in real-time at /cam/.
Install as a Claude Code plugin:
cd ~/.claude/plugins
git clone https://github.com/datboip/HumanAIE.git humanaie| Command | Description |
|---|---|
/browse <url> |
Open a URL in the shared browser |
/screenshot |
Take a screenshot of the current page |
/click <x> <y> |
Click at coordinates |
/ask-human <message> |
Ask the user to highlight something on the page |
The Browser Control skill auto-activates when you mention browsing, websites, signing up, captchas, or web research.
| Variable | Default | Description |
|---|---|---|
HUMANAIE_PORT |
3333 |
Server port |
HUMANAIE_DATA_DIR |
. (cwd) |
Data directory for history, highlights, sessions |
HUMANAIE_USER |
(empty) | Basic auth username |
HUMANAIE_PASS |
(empty) | Basic auth password |
| Browser | Streaming | Notes |
|---|---|---|
| Chrome / Firefox | MJPEG | Native MJPEG support, smooth streaming |
| Brave | Frame polling | Brave blocks MJPEG, falls back to polling /frame.jpg |
| Safari | MJPEG | Works, occasionally needs a refresh |
The Cam UI auto-detects and picks the best method.
- Basic auth is available via
HUMANAIE_USER/HUMANAIE_PASSenv vars - Localhost requests bypass auth (the AI agent is local)
- The
/streamand/eventsSSE endpoints bypass auth to avoid browser credential prompts - If running on a shared network, set auth credentials and consider a reverse proxy with TLS
- Fork the repo
- Create a branch (
git checkout -b my-feature) - Make your changes
- Test locally (
npm start, open/cam/, try the API) - Open a PR
Keep it simple. No framework churn, no build steps. It's one server.js and a public/ folder.
MIT
