Fire Risk Prevention and Response Optimization Engine City of San Francisco Fire Department Operations
Built on the Databricks Lakehouse Platform | April 2026
Pyxis is a production-grade data intelligence platform that transforms 7.2 million raw municipal records from the San Francisco Open Data Portal into actionable fire risk analytics. The platform ingests five distinct city datasets (fire incidents, 911 calls, fire violations, building inspections, and building permits), resolves them into a unified property registry of 210,359 buildings, and scores every property in the city on a 0-to-100 risk scale using a hybrid Explainable AI ensemble.
The system directly addresses four critical operational needs of the San Francisco Fire Department:
- Risk Prediction: Identifying which specific properties and neighborhoods face the highest fire risk before incidents occur
- Response Optimization: Quantifying station-level NFPA 1710 compliance and recommending resource reallocation based on geospatial risk density
- Compliance Tracking: Measuring violation follow-up effectiveness and exposing "dark properties" that have slipped through every inspection queue
- Operational Intelligence: Delivering pre-computed, decision-ready metrics through a real-time geospatial dashboard
For complete technical details, refer to the Technical Documentation.
Pyxis implements the Databricks Medallion Architecture pattern with three data layers:
San Francisco Open Data Portal
(5 Datasets | 7.2M Records)
|
v
+-------------------------------------------------------+
| BRONZE LAYER |
| Raw CSV ingestion with metadata tagging |
| 5 Delta tables preserving original data fidelity |
+-------------------------------------------------------+
|
v
+-------------------------------------------------------+
| SILVER LAYER |
| Entity resolution across 5 datasets via H3 indexing |
| Feature engineering: per property + per hexagon |
| Response time computation per station |
| 210,359 unified property entities |
+-------------------------------------------------------+
|
v
+-------------------------------------------------------+
| GOLD LAYER |
| 8 business ready analytical tables |
| Heuristic risk scoring (0 to 100, 6 components) |
| ML ensemble scoring (sklearn GBT + SHAP) |
| Crisis zone mapping, dark property flagging |
| NFPA compliance, fairness analysis, drift monitoring |
+-------------------------------------------------------+
|
v
+-------------------------------------------------------+
| PRESENTATION LAYER |
| JSON export via Databricks Volumes |
| React/Vite geospatial dashboard |
+-------------------------------------------------------+
The pipeline consists of 13 Databricks notebooks executed in strict dependency order:
| Stage | Notebook | Purpose | Output |
|---|---|---|---|
| Bronze | 01_bronze_ingest | Raw CSV to Delta conversion | 5 Bronze tables |
| Silver | 02_silver_entity_resolution | Cross-dataset entity resolution with H3 spatial indexing | address_entity_master |
| Silver | 03_silver_property_and_h3_features | Per-property and per-hexagon feature engineering | property_features, h3_features |
| Silver | 04_silver_response_performance | Station-level response time metrics | response_performance |
| Gold | 05_gold_property_risk_twin | Heuristic risk scoring with explainable factor decomposition | property_risk_twin |
| Gold | 05b_gold_ml_risk_model | ML model training, SHAP analysis, and ensemble scoring | property_risk_twin (updated) |
| Gold | 06_gold_h3_risk_surface | H3 hexagonal risk aggregation and crisis zone flagging | h3_risk_surface |
| Gold | 07_gold_dark_property_discovery | Dark property extraction and ranking | dark_property_discovery |
| Gold | 08_gold_nfpa_response_compliance | Per-station NFPA 1710 compliance with risk exposure | nfpa_response_compliance |
| Gold | 09_gold_compliance_funnel | Violation resolution tracking by district | compliance_funnel |
| Gold | 10_gold_fairness_coverage | Inspection equity analysis across districts | fairness_coverage |
| Gold | 11_gold_model_health_drift | Feature distribution drift monitoring via PSI | model_health_drift |
| Export | 99_export_gold_to_json | Hybrid sampled JSON export for frontend | 8 JSON files |
Pyxis uses a deliberate two-layer scoring architecture designed for legal auditability in government operations.
Layer 1: Transparent Heuristic (90% of final score)
Every property receives a deterministic score from 0 to 100 based on six verifiable components:
| Component | Max Points | Signal |
|---|---|---|
| Violations | 30 | Open and severe fire code violations on the property |
| Incidents | 25 | Structure fires and emergency incidents in the past 3 years |
| Inspections | 25 | Time elapsed since last inspection (exponential decay function) |
| Call Frequency | 10 | 911 call volume in the surrounding area over 12 months |
| Permit Risk | 10 | Building permit type indicating high risk occupancy |
| Dark Penalty | 8 | Property has never been inspected despite active records |
Each score is accompanied by three human-readable explanation lines (e.g., "21 open violations unresolved", "1,235 days since last inspection") that provide legal justification for inspection prioritization.
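As an illustration of how a capped heuristic with inspection-recency decay can be computed, here is a minimal Python sketch. Only the six component caps come from the table above; the per-unit point values, the 365-day decay constant, and the function name are assumptions for illustration, not the production weights.

```python
import math

def heuristic_risk_score(open_violations, incidents_3yr, days_since_inspection,
                         calls_12mo, high_risk_permits, never_inspected):
    """Illustrative 0-100 heuristic combining the six capped components."""
    score = 0.0
    score += min(open_violations * 3, 30)    # Violations (max 30 points)
    score += min(incidents_3yr * 8, 25)      # Incidents, past 3 years (max 25)
    # Inspection recency: exponential decay toward the full 25-point penalty
    score += 25 * (1 - math.exp(-days_since_inspection / 365))
    score += min(calls_12mo, 10)             # 911 call frequency (max 10)
    score += min(high_risk_permits * 5, 10)  # Permit risk (max 10)
    score += 8 if never_inspected else 0     # Dark-property penalty (max 8)
    return round(min(score, 100), 1)
```

For instance, a property with 21 open violations and 1,235 days since its last inspection hits the full violation cap plus most of the recency penalty.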
Layer 2: Machine Learning Refinement (10% of final score)
An sklearn Gradient Boosted Tree classifier trained on 210,000 properties predicts structure fire probability using 8 non-leaking features. SHAP TreeExplainer provides per-property feature attribution, showing exactly which factors the model weighted most heavily.
Ensemble Formula: ensemble_risk_score = 0.9 * heuristic_score + 0.1 * ml_probability * 100
During development, we tested a 60/40 ensemble weighting. The ML model produced near-universal high fire probabilities, reclassifying 83.3% of all properties as CRITICAL (compared to 1.6% under the pure heuristic). This is the same false-positive saturation phenomenon documented in the failures of LAPD's PredPol (discontinued in 2020) and the COMPAS recidivism scoring tool. The ML model was detecting "data density" (properties with more city paperwork) rather than genuine fire physics.
We deliberately reweighted to 90/10, positioning the ML as a precision tie-breaker rather than a primary classifier. This reduced the CRITICAL share to 8.6%, while the ML still meaningfully up-tiered 293 borderline properties from HIGH to CRITICAL by detecting non-linear interaction patterns invisible to the static heuristic.
| Tier | Heuristic Only | 60/40 Ensemble (Rejected) | 90/10 Ensemble (Final) |
|---|---|---|---|
| CRITICAL | 3,408 (1.6%) | 175,209 (83.3%) | 18,065 (8.6%) |
| HIGH | 185,307 (88.1%) | 32,169 (15.3%) | 171,912 (81.7%) |
| MEDIUM | 21,240 (10.1%) | 2,279 (1.1%) | 20,114 (9.6%) |
| LOW | 404 (0.2%) | 702 (0.3%) | 268 (0.1%) |
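The 90/10 blend is the formula stated above; the tier cutoffs in the sketch below are illustrative placeholders, since the actual tier thresholds are not given here.

```python
def ensemble_risk_score(heuristic_score, ml_probability):
    """90/10 blend from the document: the heuristic anchors, the ML refines."""
    return 0.9 * heuristic_score + 0.1 * (ml_probability * 100)

def risk_tier(score, cuts=(80, 50, 25)):
    """Map a 0-100 score to a tier. The cutoffs are assumed, not documented."""
    critical, high, medium = cuts
    if score >= critical:
        return "CRITICAL"
    if score >= high:
        return "HIGH"
    if score >= medium:
        return "MEDIUM"
    return "LOW"
```

Because the ML term can add at most 10 points, it can only move properties that already sit near a tier boundary, which is exactly the "precision tie-breaker" role described above.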
The Gold layer contains 8 optimized Delta tables, each designed to power a specific operational decision:
property_risk_twin
Decision it drives: Which building should an inspector visit today, and why?
One row per property in San Francisco (210,359 total). Contains the heuristic score, ML fire probability, ensemble score, three heuristic explanation lines, three SHAP driver attributions, dark property flags, and a recommended action string. This is the primary table powering the geospatial dashboard.
h3_risk_surface
Decision it drives: Where should idle fire trucks pre-position for fastest crisis response?
One row per H3 resolution 9 hexagon (~175-meter cells, ~1,200 total). Aggregates property risk scores, critical property counts, dark property density, and NFPA compliance rates per hexagon. Cells meeting multi-factor crisis thresholds are flagged as Crisis Zones.
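Assuming each property row already carries its precomputed H3 cell ID, the per-hexagon rollup can be sketched in plain Python without the H3 library itself. Field names and the crisis-zone thresholds here are illustrative, not the production values.

```python
from collections import defaultdict

def h3_risk_surface(properties, critical_cutoff=80,
                    crisis_min_score=70, crisis_min_dark=1):
    """Roll per-property rows up to their (precomputed) H3 cell."""
    cells = defaultdict(list)
    for p in properties:
        cells[p["h3_cell"]].append(p)
    surface = {}
    for cell, props in cells.items():
        avg = sum(p["risk_score"] for p in props) / len(props)
        critical = sum(p["risk_score"] >= critical_cutoff for p in props)
        dark = sum(p["is_dark"] for p in props)
        surface[cell] = {
            "avg_risk": round(avg, 1),
            "critical_count": critical,
            "dark_count": dark,
            # Crisis Zones require multiple factors to converge, not one alone
            "is_crisis_zone": avg >= crisis_min_score and dark >= crisis_min_dark,
        }
    return surface
```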
dark_property_discovery
Decision it drives: Which buildings are invisible to the inspection system?
Extracts and ranks properties that appear in city records (permits, violations, 911 calls) but have zero recorded fire inspections. Four classification types identify the specific bureaucratic failure mode (e.g., "Active permit, zero inspections" or "12 open violations, no follow up").
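A rule-based classifier of this kind might look like the sketch below. The first two labels echo the examples above; the other two labels and the rule ordering are assumptions, not the production logic.

```python
def dark_property_type(has_permit, open_violations, calls_12mo):
    """Classify why a never-inspected property still surfaced in city records.

    Rules are checked most-severe first; labels 3 and 4 are illustrative."""
    if open_violations > 0:
        return f"{open_violations} open violations, no follow-up"
    if has_permit:
        return "Active permit, zero inspections"
    if calls_12mo > 0:
        return "Repeated 911 activity, never inspected"
    return "Present in records, no inspection history"
```

Keeping the logic as explicit, ordered rules (rather than a model) is what makes each flag defensible when a property owner asks why their building was prioritized.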
nfpa_response_compliance
Decision it drives: Which stations need more resources to meet their response time mandate?
Per-station analysis of NFPA 1710 compliance (a 5-minute-20-second response target for 90% of calls). Cross-references the count of high-risk properties in each station's coverage area to quantify the gap between risk exposure and response capability.
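Compliance against the 5:20 (320-second) target reduces to a simple proportion per station; a minimal sketch, with the function name assumed:

```python
def nfpa_1710_compliance(response_seconds, target_seconds=320):
    """Return (compliance rate, meets the 90% NFPA 1710 bar) for one station."""
    within = sum(t <= target_seconds for t in response_seconds)
    rate = within / len(response_seconds)
    return rate, rate >= 0.90
```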
compliance_funnel
Decision it drives: Are we actually resolving the violations we cite?
Per-district tracking of the violation filing-to-closure pipeline. Computes closure rates and identifies districts with systemic backlogs of unresolved violations.
fairness_coverage
Decision it drives: Are inspection resources allocated proportionally to risk?
Compares the actual number of inspections each district receives against the inspection volume it should receive based on its share of citywide risk. Districts are flagged as UNDER_SERVED or OVER_SERVED with a coverage gap percentage.
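The coverage-gap arithmetic can be sketched as follows. The function name and the example numbers are illustrative, not a district's actual figures.

```python
def coverage_gap(actual_inspections, risk_share, total_inspections):
    """Gap between a district's actual and risk-proportional inspection volume."""
    expected = risk_share * total_inspections  # inspections its risk share warrants
    gap_pct = (actual_inspections - expected) / expected * 100
    status = "UNDER_SERVED" if gap_pct < 0 else "OVER_SERVED"
    return round(gap_pct, 1), status
```

A district holding 10% of citywide risk but receiving 241 of 10,000 inspections would show a gap of about -75.9%, i.e. UNDER_SERVED.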
model_health_drift
Decision it drives: Is our scoring model still accurate, or has the underlying data shifted?
Monitors the Population Stability Index (PSI) for all key features in the scoring model. A PSI above 0.25 indicates significant distribution drift, signaling that model retraining is needed. All features currently show PSI of 0.01 (STABLE).
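PSI compares the binned distribution of a feature in a reference period against the current period; a minimal sketch (the epsilon guard against empty bins is an implementation detail, not from this document):

```python
import math

def psi(expected_fracs, actual_fracs, eps=1e-6):
    """Population Stability Index over matching bins of two distributions.

    Inputs are per-bin fractions that each sum to 1. PSI > 0.25 is the
    conventional threshold for significant drift."""
    total = 0.0
    for e, a in zip(expected_fracs, actual_fracs):
        e, a = max(e, eps), max(a, eps)  # guard log/division against zero bins
        total += (a - e) * math.log(a / e)
    return total
```

Identical distributions yield a PSI of 0; a large shift (e.g. two bins swapping from 70/30 to 30/70) lands well above the 0.25 retraining threshold.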
Decision it drives: What are the headline numbers for a Fire Chief's daily briefing?
Pre-computed summary statistics: total properties, critical count, dark property count, crisis zone count, average risk score, overall NFPA compliance rate, and station failure counts.
| Layer | Technology | Purpose |
|---|---|---|
| Data Governance | Databricks Unity Catalog | Schema management, access control, data lineage |
| Storage Format | Delta Lake | ACID transactions, time travel, schema enforcement |
| Compute | Databricks Serverless | Elastic, zero-management notebook execution |
| Processing | PySpark SQL | Distributed transformation of 7.2M records |
| Spatial Indexing | Uber H3 (Databricks native) | Hexagonal geospatial indexing at ~175m resolution |
| ML Training | sklearn GradientBoostingClassifier | Gradient Boosted Trees on driver node (210K rows) |
| ML Explainability | SHAP TreeExplainer | Per-property Shapley value attribution |
| Data Export | Databricks Volumes | Governed file storage for JSON frontend payload |
| Frontend | React + Vite + TypeScript | Interactive geospatial dashboard |
| Version Control | GitHub via Databricks Repos | CI/CD integration and reproducibility |
Based on analysis of 210,359 properties across 5 municipal datasets:
- 18,065 properties (8.6%) are classified as CRITICAL risk under the ensemble model
- 15 Crisis Zones identified where high-risk density, dark properties, and slow response times converge
- District 10 is the most severely underserved district, receiving 75.9% fewer inspections than its risk volume warrants
- 293 properties were up-tiered from HIGH to CRITICAL by the ML model, catching non-linear risk patterns the heuristic alone would miss
- All monitored features show PSI of 0.01 (STABLE), confirming the scoring model is not experiencing data drift
- Databricks workspace with Unity Catalog enabled
- Serverless compute access
- Raw CSV data files from the SF Open Data Portal placed in /Volumes/sf_fire_prod/bronze/raw_data/
- Node.js 18+ for the frontend
- Clone the repository into Databricks Repos
- Upload raw CSV files to the Bronze volume
- Execute notebooks 01 through 99 in order (see pipeline flow table above)
- Download the exported JSON files from /Volumes/sf_fire_prod/gold/exports/
    cd frontend/MIIT
    npm install
    npm install -g tsx
    npm run dev

Place the 8 exported JSON files in frontend/MIIT/public/data/ before launching the development server.
The prototype architecture is designed for direct graduation to production on the Databricks platform:
| Aspect | Prototype (Current) | Production |
|---|---|---|
| Data Ingestion | Manual CSV upload | Databricks Auto Loader with Socrata API integration |
| Processing Mode | Full overwrite | Incremental Delta Lake MERGE (upsert) |
| Orchestration | Manual notebook execution | Databricks Workflows DAG with dependency chaining |
| ML Lifecycle | Train and score in one notebook | MLflow Model Registry with human approval gates |
| Drift Detection | PSI printed to console | Automated Slack/PagerDuty alerts on PSI threshold breach |
| Serving Layer | Static JSON files | Databricks SQL Endpoint or Delta Sharing |
| Pipeline Runtime | 35 to 55 minutes (full rebuild) | ~90 seconds (incremental) |
| Monthly Cost | $0 (hackathon credits) | ~$195/month (Serverless auto suspend) |
For the complete production architecture including DAG design, MLflow integration, logging and observability plans, and cost analysis, see Part III of the Technical Documentation.
HackBricks_P3/
README.md This file
TECHNICAL_DOCUMENTATION.md Comprehensive technical documentation (3 viewpoints)
context.md Project context and design decisions
notebooks/
01_bronze_ingest.py Raw CSV to Delta ingestion
02_silver_entity_resolution.py Cross dataset entity resolution + H3 indexing
03_silver_property_and_h3_features.py Feature engineering
04_silver_response_performance.py Response time metrics
05_gold_property_risk_twin.py Heuristic risk scoring
05b_gold_ml_risk_model.py ML model + SHAP + ensemble scoring
06_gold_h3_risk_surface.py H3 hexagonal risk aggregation
07_gold_dark_property_discovery.py Dark property extraction
08_gold_nfpa_response_compliance.py NFPA compliance analysis
09_gold_compliance_funnel.py Violation resolution tracking
10_gold_fairness_coverage.py Inspection equity analysis
11_gold_model_health_drift.py Feature drift monitoring
99_export_gold_to_json.py JSON export for frontend
frontend/
MIIT/
public/data/ JSON data files for the dashboard
src/ React/Vite application source code
server.ts Development server
vite.config.ts Build configuration
| Document | Audience | Contents |
|---|---|---|
| Technical Documentation: Part I | IT/Tech Team | System architecture, pipeline execution, troubleshooting, schema reference |
| Technical Documentation: Part II | Firefighters and Inspectors | Risk tier explanations, dark property guide, dashboard usage |
| Technical Documentation: Part III | Engineers | Notebook deep dives, ML model details, tradeoffs, production architecture, logging plans |
| Decision | What We Chose | Why |
|---|---|---|
| Entity resolution | H3 + address grouping | Deterministic, fast, sufficient for macro analysis |
| Spatial indexing | H3 Resolution 9 (~175m) | Balances property granularity with statistical stability |
| ML framework | sklearn on driver | Enables SHAP TreeExplainer (incompatible with Spark MLlib) |
| Ensemble weight | 90% heuristic / 10% ML | Prevents false positive saturation observed at 60/40 |
| JSON sampling | Top 500 + 2% random | Preserves tier proportions while keeping payload manageable |
| Timestamp parsing | try_to_timestamp | Required for ANSI SQL enforcement on Serverless |
| Dark property logic | Rule based (4 types) | Explainability is paramount in government operations |
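The "Top 500 + 2% random" export strategy from the table above could be sketched as follows; the score field name, function name, and fixed seed are assumptions for illustration.

```python
import random

def sample_for_export(rows, top_n=500, frac=0.02, seed=42):
    """Keep the top-N riskiest rows plus a random fraction of the remainder,
    so the payload stays small while tier proportions are roughly preserved."""
    ranked = sorted(rows, key=lambda r: r["ensemble_risk_score"], reverse=True)
    top, rest = ranked[:top_n], ranked[top_n:]
    rng = random.Random(seed)                # fixed seed for reproducible exports
    return top + rng.sample(rest, int(len(rest) * frac))
```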
Team QTπs | HackBricks 2026
This project was developed as part of the HackBricks 2026 hackathon. All data sourced from the San Francisco Open Data Portal under open data license terms.