Build Canada legislative review tracker and batch review pipeline for the Canadian federal legislative corpus.
Production is split into two runtimes:
- Cloudflare Workers serves the Next.js dashboard.
- A separate Python worker machine runs the review pipeline and publishes dashboard artifacts to Cloudflare R2.
The frontend reads two JSON artifacts from R2:
- review-summary.json
- review-details.json
Local development still works with mirrored files in src/data/.
Prerequisites:
- Node 20.x
- Wrangler access to your Cloudflare account
- Environment-specific R2 buckets, or update wrangler.jsonc
Install and deploy:
```
npm ci
npx wrangler login
npx wrangler r2 bucket create legislative-review-data
npm run deploy
```

Environment layout in wrangler.jsonc:
- default/local: core + legislative-review-data
- staging: core-staging + legislative-review-data-staging
- production: core-production + legislative-review-data-production
Create the environment buckets:
```
npx wrangler r2 bucket create legislative-review-data-staging
npx wrangler r2 bucket create legislative-review-data-production
```

Preview or deploy by environment:

```
npm run preview:staging
npm run deploy:staging
npm run deploy:production
```

Each environment is configured with:
- LEGISLATIVE_REVIEW_DATA_BUCKET
- LEGISLATIVE_REVIEW_SUMMARY_KEY=review-summary.json
- LEGISLATIVE_REVIEW_DETAILS_KEY=review-details.json
Prerequisites:
- Python 3.10+
- Access to the raw and processed parquet paths under a configurable dataset root
- Anthropic API key
- Cloudflare R2 API credentials for the Python publisher
Create a virtual environment and install dependencies:
```
python -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt
```

On Windows PowerShell:

```
python -m venv .venv
.venv\Scripts\Activate.ps1
pip install -r requirements.txt
```

Copy .env.example to .env and fill in:
- LEGISLATIVE_REVIEW_DATA_ROOT
- LEGISLATIVE_REVIEW_PROCESSED_DIR (optional)
- CLAUDE_API_KEY
- CLOUDFLARE_R2_ACCOUNT_ID
- CLOUDFLARE_R2_BUCKET
- CLOUDFLARE_R2_ENDPOINT
- CLOUDFLARE_R2_ACCESS_KEY_ID
- CLOUDFLARE_R2_SECRET_ACCESS_KEY
The Python scripts now derive dataset paths from:
- LEGISLATIVE_REVIEW_DATA_ROOT
- LEGISLATIVE_REVIEW_PROCESSED_DIR, if you need processed artifacts in a separate mount

If neither is set, the scripts fall back to the original Windows development path.
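A minimal sketch of that path-derivation logic, assuming the scripts resolve a raw root and a processed directory (the fallback shown here is a placeholder, not the project's real Windows path):

```python
import os
from pathlib import Path

# Placeholder fallback; the real scripts use their original Windows dev path.
_DEFAULT_ROOT = Path(r"C:\placeholder\legislative-data")

def dataset_paths() -> tuple[Path, Path]:
    """Resolve (raw_root, processed_dir) from environment variables.

    LEGISLATIVE_REVIEW_PROCESSED_DIR overrides the processed location when
    processed artifacts live on a separate mount; otherwise it defaults to
    a processed/ directory under the data root.
    """
    root = Path(os.environ.get("LEGISLATIVE_REVIEW_DATA_ROOT", _DEFAULT_ROOT))
    processed = Path(os.environ.get("LEGISLATIVE_REVIEW_PROCESSED_DIR",
                                    root / "processed"))
    return root, processed
```

The `dataset_paths` helper name and the `processed/` default are illustrative assumptions, not the scripts' actual internals.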
Run the end-to-end review pipeline for one domain:
```
python scripts/run_review_frontend_pipeline.py --domain transport_infrastructure
```

Resume behavior is enabled by default. If the worker stops mid-run, restarting the same command resumes from the last durable success using the existing review parquet plus a journal file written beside it. Use --no-resume only when you intentionally want to restart the batch from scratch.
Run a smaller batch:
```
python scripts/run_review_frontend_pipeline.py --domain transport_infrastructure --limit 50
```

During review, review_documents.py checkpoints:
- the parquet review output
- a per-review resume journal beside the parquet output
- local mirrored frontend JSON
- R2 dashboard JSON, if CLOUDFLARE_R2_* variables are configured
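The resume journal's exact format is internal to review_documents.py; one plausible shape, shown here purely as a sketch, is an append-only JSONL file beside the parquet output, where each line records one completed review so a restart can skip it (the `doc_id` field name is an assumption):

```python
import json
from pathlib import Path

def load_completed(journal: Path) -> set[str]:
    """Read the resume journal (one JSON record per line) and return done ids."""
    if not journal.exists():
        return set()
    return {json.loads(line)["doc_id"]
            for line in journal.read_text().splitlines() if line.strip()}

def mark_done(journal: Path, doc_id: str) -> None:
    """Append a durable success record after each review is checkpointed."""
    with journal.open("a") as fh:
        fh.write(json.dumps({"doc_id": doc_id}) + "\n")
```

Append-only JSONL is a common choice for this kind of checkpoint because a crash mid-append corrupts at most the final line, which a reader can discard.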
Run the app locally:
```
npm run dev
```

Open http://localhost:3000/legislative-reviews.
The dashboard polls /api/legislative-reviews every few seconds. In local development that route falls back to src/data/review-summary.json and src/data/review-details.json if Cloudflare bindings are unavailable.
An example systemd service is included at legislative-reviews.service.
Typical Linux install:
```
sudo cp deploy/systemd/legislative-reviews.service /etc/systemd/system/legislative-reviews.service
sudo systemctl daemon-reload
sudo systemctl enable legislative-reviews
sudo systemctl start legislative-reviews
sudo systemctl status legislative-reviews
journalctl -u legislative-reviews -f
```

Adjust User, WorkingDirectory, EnvironmentFile, and ExecStart before enabling the service.
- Cloudflare Workers is the correct place for the dashboard, not for the long-running Python batch process.
- With shared storage in R2, the dashboard is production-ready.
- The Python publisher uses atomic local writes and uploads review-details.json before review-summary.json to reduce transient mismatch windows.
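A sketch of that publish pattern, assuming the publisher talks to R2 through an S3-compatible client such as boto3 (the function names here are illustrative, not the project's actual code):

```python
import json
import os

def atomic_write(path: str, payload: dict) -> None:
    """Write to a temp file, then rename; readers never see a partial file."""
    tmp = path + ".tmp"
    with open(tmp, "w") as fh:
        json.dump(payload, fh)
    os.replace(tmp, path)  # atomic rename on POSIX and Windows

def publish(details: dict, summary: dict) -> None:
    """Upload details first, so any summary a reader fetches already has
    its matching detail records in the bucket."""
    import boto3  # assumed client; R2 exposes an S3-compatible API

    s3 = boto3.client(
        "s3",
        endpoint_url=os.environ["CLOUDFLARE_R2_ENDPOINT"],
        aws_access_key_id=os.environ["CLOUDFLARE_R2_ACCESS_KEY_ID"],
        aws_secret_access_key=os.environ["CLOUDFLARE_R2_SECRET_ACCESS_KEY"],
    )
    bucket = os.environ["CLOUDFLARE_R2_BUCKET"]
    s3.put_object(Bucket=bucket, Key="review-details.json",
                  Body=json.dumps(details))
    s3.put_object(Bucket=bucket, Key="review-summary.json",
                  Body=json.dumps(summary))
```

Uploading details before the summary narrows, but does not eliminate, the window in which a poller sees a summary entry with no detail record; the dashboard should still tolerate a missing detail briefly.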