2 changes: 2 additions & 0 deletions docs.json
Original file line number Diff line number Diff line change
@@ -285,6 +285,8 @@
"self-host/self-host-lightdash-docker-compose",
"self-host/self-host-lightdash-restack",
"self-host/update-lightdash",
"self-host/pre-aggregates",
"self-host/nats-workers",
{
"group": "Customize deployment",
"pages": [
41 changes: 41 additions & 0 deletions self-host/nats-workers.mdx
@@ -0,0 +1,41 @@
---
title: "NATS workers"
description: "Scale Lightdash query processing with dedicated NATS worker pods using the Helm chart."
sidebarTitle: "NATS workers"
---

<Badge color="blue" size="md" shape="pill">Helm chart</Badge>

<Callout icon="wrench" color="#6B7280">
This page is for engineering teams self-hosting their own Lightdash instance. If you want to get started with pre-aggregates, see the [pre-aggregates reference](/references/pre-aggregates).
</Callout>

<Warning>
NATS workers are only recommended for large deployments and should be set up with guidance from the Lightdash team.
</Warning>

NATS moves warehouse query execution off the main Lightdash server and onto dedicated worker pods. This improves responsiveness under load and lets you scale query capacity independently.

## Enabling NATS workers

Deploy Lightdash with the [Helm chart](/self-host/self-host-lightdash) to enable NATS workers. The minimum configuration enables NATS and the warehouse worker:

```yaml
nats:
enabled: true
warehouseNatsWorker:
enabled: true
```
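With those values saved, apply them with a standard Helm upgrade. The release name, namespace, and chart reference below are placeholders; substitute whatever your deployment already uses:

```shell
# Placeholder release/chart/namespace names: adjust to your environment.
# Assumes your overrides (including the NATS flags above) are in values.yaml.
helm upgrade --install lightdash lightdash/lightdash \
  --namespace lightdash \
  -f values.yaml
```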

## Scaling

Scale horizontally with `replicas` or raise per-pod `concurrency`; the resource requests shown are the chart defaults:

```yaml
warehouseNatsWorker:
replicas: 2 # more pods = more parallel query capacity
concurrency: 100 # concurrent jobs per pod
resources:
requests:
memory: 1.5Gi
cpu: 250m
ephemeral-storage: 9Gi
```
84 changes: 84 additions & 0 deletions self-host/pre-aggregates.mdx
@@ -0,0 +1,84 @@
---
title: "Pre-aggregates"
description: "Deploy Lightdash with pre-aggregates using the Helm chart to serve queries from DuckDB instead of your data warehouse."
sidebarTitle: "Pre-aggregates"
---

<Badge color="purple" size="md" shape="pill">Enterprise plan</Badge> <Badge color="blue" size="md" shape="pill">Helm chart</Badge>

<Callout icon="wrench" color="#6B7280">
This page is for engineering teams self-hosting their own Lightdash instance. If you want to get started with pre-aggregates, see the [pre-aggregates reference](/references/pre-aggregates).
</Callout>

We recommend deploying Lightdash with pre-aggregates using the [Helm chart](/self-host/self-host-lightdash). The Helm chart handles the required service dependencies and environment variable wiring automatically.

## Enabling pre-aggregates

Pre-aggregates materialize query results so that repeated queries are served from DuckDB instead of hitting your data warehouse. This requires NATS for async job processing and S3-compatible storage for materialized results.

### Prerequisites

- A valid Lightdash license key
- An S3-compatible bucket (AWS S3, GCS, MinIO, etc.)

### Helm values

The minimum required configuration covers three areas: the NATS workers, your license key, and S3 storage. Note that `secrets` is a single top-level key, so the license key and S3 credentials go in one block:

```yaml
# Enable NATS and workers
nats:
  enabled: true
warehouseNatsWorker:
  enabled: true
preAggregateNatsWorker:
  enabled: true

# S3 storage for materialized results
configMap:
  S3_ENDPOINT: "https://s3.us-east-1.amazonaws.com"
  S3_REGION: "us-east-1"
  PRE_AGGREGATE_RESULTS_S3_BUCKET: "my-lightdash-pre-aggs"

# License key and S3 credentials
secrets:
  LIGHTDASH_LICENSE_KEY: "your-license-key"
  S3_ACCESS_KEY: "your-access-key"
  S3_SECRET_KEY: "your-secret-key"
```

The chart auto-configures `NATS_ENABLED`, `PRE_AGGREGATES_ENABLED`, `NATS_URL`, and `PRE_AGGREGATES_PARQUET_ENABLED` from the flags above.
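If you want to verify what the chart wires up (or are deploying without it), the equivalent environment variables look roughly like this. This is illustrative only: the NATS service hostname and port are placeholders for your in-cluster NATS address, not values from the chart:

```yaml
# Illustrative only: the Helm chart normally sets these for you.
configMap:
  NATS_ENABLED: "true"
  PRE_AGGREGATES_ENABLED: "true"
  PRE_AGGREGATES_PARQUET_ENABLED: "true"
  NATS_URL: "nats://lightdash-nats:4222" # placeholder service name and port
```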

## What gets deployed

| Component | Purpose |
| --- | --- |
| NATS JetStream | Message broker for async query jobs |
| Warehouse worker | Processes interactive queries from users |
| Pre-aggregate worker | Materializes pre-aggregates and processes DuckDB queries |

Warehouse and pre-aggregate workers are separate deployments so they don't compete for resources.

## Scaling

The defaults are tuned for typical workloads. If you need more throughput, the main levers are `replicas` and `concurrency`:

```yaml
warehouseNatsWorker:
replicas: 1 # scale horizontally for more concurrent queries
concurrency: 100 # concurrent jobs per pod

preAggregateNatsWorker:
replicas: 1
concurrency: 100
```

Pre-aggregate workers are more resource-intensive than warehouse workers because they run DuckDB. The default resource requests reflect this:

| | Warehouse worker | Pre-aggregate worker |
| --- | --- | --- |
| CPU | 250m | 650m |
| Memory | 1.5Gi | 4Gi |
| Ephemeral storage | 9Gi | 9Gi |
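Expressed as Helm values, the pre-aggregate worker defaults above correspond to something like the following, which you can raise for heavier DuckDB workloads. The nested `resources.requests` structure is assumed to match the worker blocks shown earlier:

```yaml
# Sketch of overriding the pre-aggregate worker's default requests.
preAggregateNatsWorker:
  resources:
    requests:
      cpu: 650m
      memory: 4Gi
      ephemeral-storage: 9Gi
```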