Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
17 changes: 17 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -105,3 +105,20 @@ You can also pass any native ClickHouse format name directly.
|------|-------------|
| `--help` | Show help text |
| `--version` | Print version |

## Agent Skill

chcli includes an [Agent Skill](https://agentskills.io) so AI coding agents (Claude Code, Cursor, etc.) can query ClickHouse databases on your behalf.

Install the skill:

```bash
npx skills add obsessiondb/chcli
```

This gives your agent the ability to:

- Run SQL queries against ClickHouse using `chcli`
- Explore database schemas (`SHOW TABLES`, `DESCRIBE TABLE`)
- Extract data in machine-readable formats (JSON, CSV)
- Follow best practices like using `LIMIT` and structured output formats
147 changes: 147 additions & 0 deletions skills/clickhouse-query/SKILL.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,147 @@
---
name: clickhouse-query
description: Query ClickHouse databases using the chcli CLI tool. Use when the user wants to run SQL queries against ClickHouse, explore database schemas, inspect tables, or extract data from ClickHouse.
metadata:
author: obsessiondb
version: "1.0"
compatibility: Requires bun or node (for bunx/npx). Needs network access to a ClickHouse instance.
allowed-tools: Bash(bunx chcli:*) Bash(npx chcli:*) Bash(chcli:*) Read Write
---

# chcli — ClickHouse CLI

chcli is a lightweight ClickHouse command-line client. Use it to run SQL queries, explore schemas, and extract data from ClickHouse databases.

## Running chcli

Prefer `bunx` if Bun is available, otherwise use `npx`:

```bash
bunx chcli -q "SELECT 1"
npx chcli -q "SELECT 1"
```

Or install globally:

```bash
bun install -g chcli
chcli -q "SELECT 1"
```

## Connection

Set connection details via environment variables (preferred for agent use) or CLI flags. CLI flags override env vars.

| Flag | Env Var | Default |
|------|---------|---------|
| `--host` | `CLICKHOUSE_HOST` | `localhost` |
| `--port` | `CLICKHOUSE_PORT` | `8123` |
| `-u, --user` | `CLICKHOUSE_USER` | `default` |
| `--password` | `CLICKHOUSE_PASSWORD` | *(empty)* |
| `-d, --database` | `CLICKHOUSE_DATABASE` | `default` |
| `-s, --secure` | `CLICKHOUSE_SECURE` | `false` |

For agent workflows, prefer setting env vars in a `.env` file (Bun loads `.env` automatically) so every invocation uses the same connection without repeating flags.

See `references/connection.md` for detailed connection examples.

## Query Patterns

**Inline query** (most common for agents):

```bash
bunx chcli -q "SELECT count() FROM events"
```

**From a SQL file:**

```bash
bunx chcli -f query.sql
```

**Via stdin pipe:**

```bash
echo "SELECT 1" | bunx chcli
```

## Output Formats

**Always use `-F json` or `-F csv` when the output will be parsed by an agent.** The default format (`pretty`) is for human display and is difficult to parse programmatically.

```bash
# JSON — best for structured parsing
bunx chcli -q "SELECT * FROM events LIMIT 5" -F json

# CSV — good for tabular data
bunx chcli -q "SELECT * FROM events LIMIT 5" -F csv

# JSONL (one JSON object per line) — good for streaming/large results
bunx chcli -q "SELECT * FROM events LIMIT 100" -F jsonl
```

Available format aliases: `json`, `jsonl`/`ndjson`, `jsoncompact`, `csv`, `tsv`, `pretty`, `vertical`, `markdown`, `sql`. Any native ClickHouse format name also works.

See `references/formats.md` for the full format reference.

## Common Workflows

### Schema Discovery

```bash
# List all databases
bunx chcli -q "SHOW DATABASES" -F json

# List tables in current database
bunx chcli -q "SHOW TABLES" -F json

# List tables in a specific database
bunx chcli -q "SHOW TABLES FROM analytics" -F json

# Describe table schema
bunx chcli -q "DESCRIBE TABLE events" -F json

# Show CREATE TABLE statement
bunx chcli -q "SHOW CREATE TABLE events"
```

### Data Exploration

```bash
# Row count
bunx chcli -q "SELECT count() FROM events" -F json

# Sample rows
bunx chcli -q "SELECT * FROM events LIMIT 10" -F json

# Column statistics
bunx chcli -q "SELECT uniq(user_id), min(created_at), max(created_at) FROM events" -F json
```

### Data Extraction

```bash
# Extract to CSV file
bunx chcli -q "SELECT * FROM events WHERE date = '2024-01-01'" -F csv > export.csv

# Extract as JSON
bunx chcli -q "SELECT * FROM events LIMIT 1000" -F json > export.json
```

## Additional Flags

| Flag | Description |
|------|-------------|
| `-t, --time` | Print execution time to stderr |
| `-v, --verbose` | Print query metadata (format, elapsed time) to stderr |
| `--help` | Show help text |
| `--version` | Print version |

## Best Practices for Agents

1. **Always specify `-F json` or `-F csv`** — never rely on the default format, which varies by TTY context.
2. **Always use `LIMIT`** on SELECT queries unless you know the table is small. ClickHouse tables can contain billions of rows.
3. **Start with schema discovery** — run `SHOW TABLES` and `DESCRIBE TABLE` before querying unfamiliar databases.
4. **Use `-t` for timing** — helps gauge whether queries are efficient.
5. **Prefer env vars for connection** — set them once in `.env` rather than repeating flags on every command.
6. **Use `count()` first** — before extracting data, check how many rows match to avoid overwhelming output.
94 changes: 94 additions & 0 deletions skills/clickhouse-query/references/connection.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,94 @@
# Connection Configuration Reference

chcli connects to ClickHouse over HTTP(S). Connection details can be set via environment variables or CLI flags.

## Precedence

CLI flags take precedence over environment variables. If neither is set, the default value is used.

```
CLI flag > Environment variable > Default value
```

## Configuration Options

| Flag | Env Var | Default | Description |
|------|---------|---------|-------------|
| `--host <host>` | `CLICKHOUSE_HOST` | `localhost` | ClickHouse server hostname or IP |
| `--port <port>` | `CLICKHOUSE_PORT` | `8123` | HTTP interface port |
| `-u, --user <user>` | `CLICKHOUSE_USER` | `default` | Authentication username |
| `--password <pass>` | `CLICKHOUSE_PASSWORD` | *(empty)* | Authentication password |
| `-d, --database <db>` | `CLICKHOUSE_DATABASE` | `default` | Default database for queries |
| `-s, --secure` | `CLICKHOUSE_SECURE` | `false` | Use HTTPS instead of HTTP |

## Connection URL

chcli constructs the connection URL as:

```
{protocol}://{host}:{port}
```

Where `protocol` is `https` if `--secure` is set or `CLICKHOUSE_SECURE=true`, otherwise `http`.

## Examples

### Local Development (defaults)

No configuration needed — connects to `http://localhost:8123` with user `default`:

```bash
bunx chcli -q "SELECT 1"
```

### Remote Instance via CLI Flags

```bash
bunx chcli \
--host ch.example.com \
--port 8443 \
--secure \
--user admin \
--password secret \
-d analytics \
-q "SELECT count() FROM events"
```

### Remote Instance via Environment Variables

Create a `.env` file (Bun loads it automatically):

```env
CLICKHOUSE_HOST=ch.example.com
CLICKHOUSE_PORT=8443
CLICKHOUSE_SECURE=true
CLICKHOUSE_USER=admin
CLICKHOUSE_PASSWORD=secret
CLICKHOUSE_DATABASE=analytics
```

Then run queries without connection flags:

```bash
bunx chcli -q "SELECT count() FROM events"
```

### ClickHouse Cloud

ClickHouse Cloud uses HTTPS on port 8443:

```env
CLICKHOUSE_HOST=abc123.us-east-1.aws.clickhouse.cloud
CLICKHOUSE_PORT=8443
CLICKHOUSE_SECURE=true
CLICKHOUSE_USER=default
CLICKHOUSE_PASSWORD=your-password
```

### Mixed (Env Vars + Flag Override)

Set base connection in `.env`, override database per-query:

```bash
bunx chcli -d other_db -q "SHOW TABLES"
```
51 changes: 51 additions & 0 deletions skills/clickhouse-query/references/formats.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
# Output Formats Reference

chcli supports format aliases that map to ClickHouse native format names. Use the `-F` / `--format` flag to specify the output format.

## Default Behavior

- **TTY (interactive terminal):** `pretty` (PrettyCompactMonoBlock) — human-readable tables
- **Piped/redirected output:** `tsv` (TabSeparatedWithNames) — machine-friendly tab-separated values

## Format Alias Table

| Alias | ClickHouse Format | Description |
|-------|-------------------|-------------|
| `json` | JSON | Full JSON object with `data` array, column metadata, row count, and statistics |
| `jsonl` | JSONEachRow | One JSON object per line (newline-delimited JSON) |
| `ndjson` | JSONEachRow | Alias for `jsonl` |
| `jsoncompact` | JSONCompact | JSON with column names separate from row data (arrays instead of objects) |
| `csv` | CSVWithNames | Comma-separated values with a header row |
| `tsv` | TabSeparatedWithNames | Tab-separated values with a header row |
| `pretty` | PrettyCompactMonoBlock | Human-readable bordered table |
| `vertical` | Vertical | Each column on its own line (useful for wide rows) |
| `markdown` | Markdown | Markdown-formatted table |
| `sql` | SQLInsert | Output as SQL INSERT statements |

Any format name not in this table is passed directly to ClickHouse, so you can use any native ClickHouse format (e.g., `Parquet`, `Arrow`, `Avro`).

## Choosing a Format

### For Agent/Programmatic Use

- **`json`** — Best for structured parsing. Returns a complete JSON object:
```json
{
"meta": [{"name": "count()", "type": "UInt64"}],
"data": [{"count()": "42"}],
"rows": 1,
"statistics": {"elapsed": 0.001, "rows_read": 100}
}
```
- **`jsonl`** — Best for large result sets or streaming. One JSON object per line:
```
{"id": 1, "name": "Alice"}
{"id": 2, "name": "Bob"}
```
- **`csv`** — Good for tabular data and import/export workflows.

### For Human Display

- **`pretty`** — Default in terminals. Bordered table layout.
- **`vertical`** — Useful when rows have many columns. Each row displayed vertically.
- **`markdown`** — Useful for embedding query results in documentation.
Loading