razancodes/Pyxis-hackbricks

Pyxis

Fire Risk Prevention and Response Optimization Engine for City of San Francisco Fire Department Operations

Built on the Databricks Lakehouse Platform | April 2026


Project Overview

Pyxis is a production-grade data intelligence platform that transforms 7.2 million raw municipal records from the San Francisco Open Data Portal into actionable fire risk analytics. The platform ingests five distinct city datasets (fire incidents, 911 calls, fire violations, building inspections, and building permits), resolves them into a unified property registry of 210,359 buildings, and scores every property in the city on a 0 to 100 risk scale using a hybrid Explainable AI ensemble.

The system directly addresses four critical operational needs of the San Francisco Fire Department:

  1. Risk Prediction: Identifying which specific properties and neighborhoods face the highest fire risk before incidents occur
  2. Response Optimization: Quantifying station-level NFPA 1710 compliance and recommending resource reallocation based on geospatial risk density
  3. Compliance Tracking: Measuring violation follow-up effectiveness and exposing "dark properties" that have slipped through every inspection queue
  4. Operational Intelligence: Delivering pre-computed, decision-ready metrics through a real-time geospatial dashboard

For complete technical details, refer to the Technical Documentation.


Platform Screenshots

Dashboard Overview

(screenshot)

Geospatial Risk Prediction Layer

(screenshot)

Response Optimization Layer

(screenshot)

NFPA Response Compliance Tracking System

(screenshot)

Fairness Coverage Analysis

(screenshot)

XAI and SHAP Analysis

(screenshot)

Architecture

Medallion Architecture

Pyxis implements the Databricks Medallion Architecture pattern with three data layers:

                    San Francisco Open Data Portal
                    (5 Datasets | 7.2M Records)
                              |
                              v
    +-------------------------------------------------------+
    |                     BRONZE LAYER                      |
    |  Raw CSV ingestion with metadata tagging              |
    |  5 Delta tables preserving original data fidelity     |
    +-------------------------------------------------------+
                              |
                              v
    +-------------------------------------------------------+
    |                     SILVER LAYER                      |
    |  Entity resolution across 5 datasets via H3 indexing  |
    |  Feature engineering: per-property + per-hexagon      |
    |  Response time computation per station                |
    |  210,359 unified property entities                    |
    +-------------------------------------------------------+
                              |
                              v
    +-------------------------------------------------------+
    |                      GOLD LAYER                       |
    |  8 business-ready analytical tables                   |
    |  Heuristic risk scoring (0 to 100, 6 components)      |
    |  ML ensemble scoring (sklearn GBT + SHAP)             |
    |  Crisis zone mapping, dark property flagging          |
    |  NFPA compliance, fairness analysis, drift monitoring |
    +-------------------------------------------------------+
                              |
                              v
    +-------------------------------------------------------+
    |                  PRESENTATION LAYER                   |
    |  JSON export via Databricks Volumes                   |
    |  React/Vite geospatial dashboard                      |
    +-------------------------------------------------------+

Data Pipeline Flow

The pipeline consists of 13 Databricks notebooks executed in strict dependency order:

| Stage | Notebook | Purpose | Output |
|---|---|---|---|
| Bronze | 01_bronze_ingest | Raw CSV to Delta conversion | 5 Bronze tables |
| Silver | 02_silver_entity_resolution | Cross-dataset entity resolution with H3 spatial indexing | address_entity_master |
| Silver | 03_silver_property_and_h3_features | Per-property and per-hexagon feature engineering | property_features, h3_features |
| Silver | 04_silver_response_performance | Station-level response time metrics | response_performance |
| Gold | 05_gold_property_risk_twin | Heuristic risk scoring with explainable factor decomposition | property_risk_twin |
| Gold | 05b_gold_ml_risk_model | ML model training, SHAP analysis, and ensemble scoring | property_risk_twin (updated) |
| Gold | 06_gold_h3_risk_surface | H3 hexagonal risk aggregation and crisis zone flagging | h3_risk_surface |
| Gold | 07_gold_dark_property_discovery | Dark property extraction and ranking | dark_property_discovery |
| Gold | 08_gold_nfpa_response_compliance | Per-station NFPA 1710 compliance with risk exposure | nfpa_response_compliance |
| Gold | 09_gold_compliance_funnel | Violation resolution tracking by district | compliance_funnel |
| Gold | 10_gold_fairness_coverage | Inspection equity analysis across districts | fairness_coverage |
| Gold | 11_gold_model_health_drift | Feature distribution drift monitoring via PSI | model_health_drift |
| Export | 99_export_gold_to_json | Hybrid sampled JSON export for frontend | 8 JSON files |

The Scoring Engine

Hybrid Explainable AI Ensemble

Pyxis uses a deliberate two-layer scoring architecture designed for legal auditability in government operations.

Layer 1: Transparent Heuristic (90% of final score)

Every property receives a deterministic score from 0 to 100 based on six verifiable components:

| Component | Max Points | Signal |
|---|---|---|
| Violations | 30 | Open and severe fire code violations on the property |
| Incidents | 25 | Structure fires and emergency incidents in the past 3 years |
| Inspections | 25 | Time elapsed since last inspection (exponential decay function) |
| Call Frequency | 10 | 911 call volume in the surrounding area over 12 months |
| Permit Risk | 10 | Building permit type indicating high-risk occupancy |
| Dark Penalty | 8 | Property has never been inspected despite active records |

Each score is accompanied by three human-readable explanation lines (e.g., "21 open violations unresolved", "1,235 days since last inspection") that provide legal justification for inspection prioritization.
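As an illustration, the six components can be combined as below. The point caps come from the table above, but the per-signal scaling (e.g., points per violation and the 730-day decay constant) is an assumed sketch, not the project's actual implementation; note the caps sum to 108, so we assume the total is clamped to 100.

```python
import math

def heuristic_risk_score(open_violations, incidents_3yr, days_since_inspection,
                         calls_12mo, permit_risk_flag, never_inspected):
    """Illustrative six-component heuristic; scaling factors are assumptions."""
    violations  = min(open_violations * 3, 30)          # max 30 pts
    incidents   = min(incidents_3yr * 5, 25)            # max 25 pts
    # exponential decay: approaches 25 pts as the last inspection goes stale
    inspections = 25 * (1 - math.exp(-days_since_inspection / 730))
    calls       = min(calls_12mo, 10)                   # max 10 pts
    permit      = 10 if permit_risk_flag else 0         # max 10 pts
    dark        = 8 if never_inspected else 0           # max  8 pts
    # caps sum to 108, so clamp to the documented 0-100 scale (assumption)
    return min(violations + incidents + inspections + calls + permit + dark, 100.0)
```

A property with 21 open violations and 1,235 days since its last inspection (the example above) would score near the top of the scale.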

Layer 2: Machine Learning Refinement (10% of final score)

A scikit-learn Gradient Boosted Tree classifier trained on 210,000 properties predicts structure fire probability using 8 non-leaking features. SHAP TreeExplainer provides per-property feature attribution, showing exactly which factors the model weighted most heavily.

Ensemble Formula: `ensemble_risk_score = 0.9 * heuristic_score + 0.1 * ml_probability * 100`
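The ensemble formula translates directly to code:

```python
def ensemble_risk_score(heuristic_score: float, ml_probability: float) -> float:
    """90/10 ensemble: the transparent heuristic dominates, and the ML
    fire probability (0.0-1.0) acts only as a precision tie-breaker."""
    return 0.9 * heuristic_score + 0.1 * ml_probability * 100
```

For example, a heuristic score of 80 with an ML fire probability of 0.5 yields an ensemble score of 77.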

Why 90/10 and Not 60/40?

During development, we tested a 60/40 ensemble weighting. The ML model produced near-universal high fire probabilities, reclassifying 83.3% of all properties as CRITICAL (compared to 1.6% under the pure heuristic). This is the false-positive saturation phenomenon documented in failed predictive systems such as LAPD's PredPol (scrapped in 2020) and the COMPAS risk-assessment tool. The ML model was detecting "data density" (properties with more city paperwork) rather than genuine fire physics.

We deliberately reweighted to 90/10, positioning the ML as a precision tie-breaker rather than a primary classifier. This reduced CRITICAL to 8.6%, while the ML still meaningfully up-tiered 293 borderline properties from HIGH to CRITICAL by detecting non-linear interaction patterns invisible to the static heuristic.

| Tier | Heuristic Only | 60/40 Ensemble (Rejected) | 90/10 Ensemble (Final) |
|---|---|---|---|
| CRITICAL | 3,408 (1.6%) | 175,209 (83.3%) | 18,065 (8.6%) |
| HIGH | 185,307 (88.1%) | 32,169 (15.3%) | 171,912 (81.7%) |
| MEDIUM | 21,240 (10.1%) | 2,279 (1.1%) | 20,114 (9.6%) |
| LOW | 404 (0.2%) | 702 (0.3%) | 268 (0.1%) |

Gold Layer Tables

The Gold layer contains 8 optimized Delta tables, each designed to power a specific operational decision:

1. property_risk_twin

Decision it drives: Which building should an inspector visit today, and why?

One row per property in San Francisco (210,359 total). Contains the heuristic score, ML fire probability, ensemble score, three heuristic explanation lines, three SHAP driver attributions, dark property flags, and a recommended action string. This is the primary table powering the geospatial dashboard.

2. h3_risk_surface

Decision it drives: Where should idle fire trucks pre-position for fastest crisis response?

One row per H3 resolution 9 hexagon (~175 meter cells, ~1,200 total). Aggregates property risk scores, critical property counts, dark property density, and NFPA compliance rates per hexagon. Cells meeting multi-factor crisis thresholds are flagged as Crisis Zones.
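A minimal sketch of the per-hexagon aggregation, assuming each property record already carries its H3 cell id (e.g., from `h3.latlng_to_cell(lat, lng, 9)` in the H3 Python bindings). The crisis thresholds (`critical_min`, `dark_min`) are hypothetical placeholders, not the project's actual values:

```python
from collections import defaultdict

def build_h3_risk_surface(properties, critical_min=5, dark_min=3):
    """Aggregate per-property risk into per-hexagon stats and flag crisis zones.
    `properties` is an iterable of dicts with keys: h3_cell, risk_score, tier, is_dark."""
    cells = defaultdict(lambda: {"scores": [], "critical": 0, "dark": 0})
    for p in properties:
        c = cells[p["h3_cell"]]
        c["scores"].append(p["risk_score"])
        c["critical"] += int(p["tier"] == "CRITICAL")
        c["dark"] += int(p["is_dark"])
    surface = {}
    for cell, c in cells.items():
        surface[cell] = {
            "avg_risk": sum(c["scores"]) / len(c["scores"]),
            "critical_count": c["critical"],
            "dark_count": c["dark"],
            # multi-factor crisis flag: thresholds here are illustrative
            "is_crisis_zone": c["critical"] >= critical_min and c["dark"] >= dark_min,
        }
    return surface
```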

3. dark_property_discovery

Decision it drives: Which buildings are invisible to the inspection system?

Extracts and ranks properties that appear in city records (permits, violations, 911 calls) but have zero recorded fire inspections. Four classification types identify the specific bureaucratic failure mode (e.g., "Active permit, zero inspections" or "12 open violations, no follow-up").
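The rule-based classification could look like the sketch below. Only the first two labels appear in the text above; the remaining two rules and labels, and the rule ordering, are assumptions for illustration:

```python
def classify_dark_property(has_active_permit, open_violations, calls_12mo, inspections):
    """Label a never-inspected property with its bureaucratic failure mode.
    Returns None for properties that have at least one recorded inspection."""
    if inspections > 0:
        return None  # not a dark property
    if has_active_permit:
        return "Active permit, zero inspections"
    if open_violations > 0:
        return f"{open_violations} open violations, no follow-up"
    # the two labels below are hypothetical, not from the project
    if calls_12mo > 0:
        return "Repeat 911 calls, never inspected"
    return "In city records, no inspection history"
```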

4. nfpa_response_compliance

Decision it drives: Which stations need more resources to meet their response time mandate?

Per-station analysis of NFPA 1710 compliance (a 5-minute-20-second, i.e. 320-second, response target for 90% of calls). Cross-references the count of high-risk properties in each station's coverage area to quantify the gap between risk exposure and response capability.
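The compliance check itself is simple; this sketch uses the 320-second target and 90% threshold stated above:

```python
def nfpa_1710_compliance(response_seconds, target_seconds=320, required_rate=0.90):
    """Return (compliance_rate, is_compliant) for one station's responses.
    NFPA 1710: at least 90% of responses within 5 min 20 s (320 s)."""
    within = sum(t <= target_seconds for t in response_seconds)
    rate = within / len(response_seconds)
    return rate, rate >= required_rate
```

A station with 8 of 10 responses under 320 seconds scores 80% and fails the mandate.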

5. compliance_funnel

Decision it drives: Are we actually resolving the violations we cite?

Per-district tracking of the violation filing-to-closure pipeline. Computes closure rates and identifies districts with systemic backlogs of unresolved violations.

6. fairness_coverage

Decision it drives: Are inspection resources allocated proportionally to risk?

Compares the actual number of inspections each district receives against the inspection volume it should receive based on its share of citywide risk. Districts are flagged as UNDER_SERVED or OVER_SERVED with a coverage gap percentage.
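One way the coverage gap could be computed, under the assumption that a district's expected inspections are its risk share times citywide inspection volume (the exact formula behind figures like District 10's -75.9% is not specified in this README):

```python
def coverage_gap(actual_inspections, district_risk_share, citywide_inspections):
    """Return (gap_pct, status) comparing actual inspections to a
    risk-proportional expectation. Formula is an illustrative assumption."""
    expected = district_risk_share * citywide_inspections
    gap_pct = (actual_inspections - expected) / expected * 100
    status = "UNDER_SERVED" if gap_pct < 0 else "OVER_SERVED"
    return round(gap_pct, 1), status
```

For example, a district holding 10% of citywide risk but receiving 241 of 10,000 inspections would show a -75.9% gap.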

7. model_health_drift

Decision it drives: Is our scoring model still accurate, or has the underlying data shifted?

Monitors the Population Stability Index (PSI) for all key features in the scoring model. A PSI above 0.25 indicates significant distribution drift, signaling that model retraining is needed. All features currently show PSI of 0.01 (STABLE).
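PSI over pre-binned feature distributions follows the standard formula; this sketch takes two lists of bin proportions (baseline vs. current) and applies the drift thresholds mentioned above:

```python
import math

def population_stability_index(expected_props, actual_props, eps=1e-6):
    """PSI = sum over bins of (actual - expected) * ln(actual / expected).
    PSI > 0.25 indicates significant drift; near 0 means STABLE."""
    psi = 0.0
    for e, a in zip(expected_props, actual_props):
        e, a = max(e, eps), max(a, eps)  # guard against empty bins
        psi += (a - e) * math.log(a / e)
    return psi
```

Identical distributions yield a PSI of 0; a feature whose mass has shifted across bins yields a positive PSI.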

8. dashboard_stats

Decision it drives: What are the headline numbers for a Fire Chief's daily briefing?

Pre-computed summary statistics: total properties, critical count, dark property count, crisis zone count, average risk score, overall NFPA compliance rate, and station failure counts.


Technology Stack

| Layer | Technology | Purpose |
|---|---|---|
| Data Governance | Databricks Unity Catalog | Schema management, access control, data lineage |
| Storage Format | Delta Lake | ACID transactions, time travel, schema enforcement |
| Compute | Databricks Serverless | Elastic, zero-management notebook execution |
| Processing | PySpark SQL | Distributed transformation of 7.2M records |
| Spatial Indexing | Uber H3 (Databricks native) | Hexagonal geospatial indexing at ~175m resolution |
| ML Training | sklearn GradientBoostingClassifier | Gradient Boosted Trees on driver node (210K rows) |
| ML Explainability | SHAP TreeExplainer | Per-property Shapley value attribution |
| Data Export | Databricks Volumes | Governed file storage for JSON frontend payload |
| Frontend | React + Vite + TypeScript | Interactive geospatial dashboard |
| Version Control | GitHub via Databricks Repos | CI/CD integration and reproducibility |

Key Findings

Based on analysis of 210,359 properties across 5 municipal datasets:

  • 18,065 properties (8.6%) are classified as CRITICAL risk under the ensemble model
  • 15 Crisis Zones identified where high risk density, dark properties, and slow response times converge
  • District 10 is the most severely underserved district, receiving 75.9% fewer inspections than its risk volume warrants
  • 293 properties were up-tiered from HIGH to CRITICAL by the ML model, catching non-linear risk patterns the heuristic alone would miss
  • All monitored features show PSI of 0.01 (STABLE), confirming the scoring model is not experiencing data drift

Getting Started

Prerequisites

  • Databricks workspace with Unity Catalog enabled
  • Serverless compute access
  • Raw CSV data files from the SF Open Data Portal placed in /Volumes/sf_fire_prod/bronze/raw_data/
  • Node.js 18+ for the frontend

Running the Pipeline

  1. Clone the repository into Databricks Repos
  2. Upload raw CSV files to the Bronze volume
  3. Execute notebooks 01 through 99 in order (see pipeline flow table above)
  4. Download exported JSON files from /Volumes/sf_fire_prod/gold/exports/

Running the Frontend

```
cd frontend/MIIT
npm install
npm install -g tsx
npm run dev
```

Place the 8 exported JSON files in frontend/MIIT/public/data/ before launching the development server.


Production Deployment Vision

The prototype architecture is designed for direct graduation to production on the Databricks platform:

| Aspect | Prototype (Current) | Production |
|---|---|---|
| Data Ingestion | Manual CSV upload | Databricks Auto Loader with Socrata API integration |
| Processing Mode | Full overwrite | Incremental Delta Lake MERGE (upsert) |
| Orchestration | Manual notebook execution | Databricks Workflows DAG with dependency chaining |
| ML Lifecycle | Train and score in one notebook | MLflow Model Registry with human approval gates |
| Drift Detection | PSI printed to console | Automated Slack/PagerDuty alerts on PSI threshold breach |
| Serving Layer | Static JSON files | Databricks SQL Endpoint or Delta Sharing |
| Pipeline Runtime | 35 to 55 minutes (full rebuild) | ~90 seconds (incremental) |
| Monthly Cost | $0 (hackathon credits) | ~$195/month (Serverless auto-suspend) |

For the complete production architecture including DAG design, MLflow integration, logging and observability plans, and cost analysis, see Part III of the Technical Documentation.


Repository Structure

```
HackBricks_P3/
    README.md                           This file
    TECHNICAL_DOCUMENTATION.md          Comprehensive technical documentation (3 viewpoints)
    context.md                          Project context and design decisions
    notebooks/
        01_bronze_ingest.py             Raw CSV to Delta ingestion
        02_silver_entity_resolution.py  Cross-dataset entity resolution + H3 indexing
        03_silver_property_and_h3_features.py  Feature engineering
        04_silver_response_performance.py      Response time metrics
        05_gold_property_risk_twin.py   Heuristic risk scoring
        05b_gold_ml_risk_model.py       ML model + SHAP + ensemble scoring
        06_gold_h3_risk_surface.py      H3 hexagonal risk aggregation
        07_gold_dark_property_discovery.py  Dark property extraction
        08_gold_nfpa_response_compliance.py  NFPA compliance analysis
        09_gold_compliance_funnel.py    Violation resolution tracking
        10_gold_fairness_coverage.py    Inspection equity analysis
        11_gold_model_health_drift.py   Feature drift monitoring
        99_export_gold_to_json.py       JSON export for frontend
    frontend/
        MIIT/
            public/data/                JSON data files for the dashboard
            src/                        React/Vite application source code
            server.ts                   Development server
            vite.config.ts              Build configuration
```

Documentation

| Document | Audience | Contents |
|---|---|---|
| Technical Documentation: Part I | IT/Tech Team | System architecture, pipeline execution, troubleshooting, schema reference |
| Technical Documentation: Part II | Firefighters and Inspectors | Risk tier explanations, dark property guide, dashboard usage |
| Technical Documentation: Part III | Engineers | Notebook deep dives, ML model details, tradeoffs, production architecture, logging plans |

Design Decisions Summary

| Decision | What We Chose | Why |
|---|---|---|
| Entity resolution | H3 + address grouping | Deterministic, fast, sufficient for macro analysis |
| Spatial indexing | H3 Resolution 9 (~175m) | Balances property granularity with statistical stability |
| ML framework | sklearn on driver | Enables SHAP TreeExplainer (incompatible with Spark MLlib) |
| Ensemble weight | 90% heuristic / 10% ML | Prevents false-positive saturation observed at 60/40 |
| JSON sampling | Top 500 + 2% random | Preserves tier proportions while keeping payload manageable |
| Timestamp parsing | try_to_timestamp | Required for ANSI SQL enforcement on Serverless |
| Dark property logic | Rule-based (4 types) | Explainability is paramount in government operations |
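The hybrid JSON sampling decision (top 500 plus 2% random) can be sketched as below; the field name `ensemble_risk_score` matches the formula earlier in this README, but the seed and exact sampling mechanics are assumptions:

```python
import random

def sample_for_export(properties, top_n=500, random_frac=0.02, seed=42):
    """Hybrid export sampling: always keep the top-N riskiest properties,
    plus a small random slice of the rest so tier proportions stay
    roughly representative in the frontend payload."""
    ranked = sorted(properties, key=lambda p: p["ensemble_risk_score"], reverse=True)
    top, rest = ranked[:top_n], ranked[top_n:]
    rng = random.Random(seed)  # fixed seed: assumption for reproducible exports
    tail = rng.sample(rest, k=int(len(rest) * random_frac)) if rest else []
    return top + tail
```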

Team

Team QTπs | HackBricks 2026


License

This project was developed as part of the HackBricks 2026 hackathon. All data sourced from the San Francisco Open Data Portal under open data license terms.

About

This repository contains detailed information about Pyxis - Explainable Fire Risk Intelligence.
