liamguest/AURA-ILE

AURA-ILE: Hurricane Vulnerability Assessment with Machine Learning

Integrated Learning Experience (ILE) Project
Liam Guest | Tulane University School of Public Health and Tropical Medicine
December 2025


Overview

This repository contains the complete analysis and paper for an explainable machine learning study on hurricane vulnerability assessment across Gulf Coast communities. The research predicts disaster assistance eligibility using social vulnerability indicators and evaluates systematic bias across demographic groups.

Key Findings:

  • Binary classification model: 72.8% accuracy, AUC-ROC=0.644
  • Systematic bias identified: 55-90% higher prediction errors in high-vulnerability communities
  • Three distinct vulnerability typologies requiring tailored interventions
  • Data leakage was detected and corrected during model development (see paper/MODEL_COMPARISON_SUMMARY.md)

Repository Structure

AURA-ILE/
├── paper/                          # Paper drafts and documentation
│   ├── ILE_DRAFT_1_UPDATED.md     # Main paper (Markdown)
│   ├── ILE_DRAFT_1_UPDATED.docx   # Main paper (Word)
│   ├── MODEL_COMPARISON_SUMMARY.md # Model evolution documentation
│   └── ILE_SUMMARY.md             # Technical project overview
├── data/
│   └── tract_storm_features.csv   # Dataset used for training (5,668 tract-disaster pairs)
├── scripts/
│   ├── train_binary_classification.py       # Final model training
│   ├── fairness_diagnostics.py              # Equity analysis
│   └── clustering_vulnerability_typologies.py # K-means clustering
├── results/
│   ├── binary_model_performance.csv         # Model metrics
│   ├── fairness_metrics.csv                 # Demographic disparities
│   ├── fairness_subgroup_performance.csv    # Intersectional analysis
│   ├── shap_feature_importance.csv          # SHAP results (leaky model)
│   └── shap_values.csv
└── figures/
    ├── fairness_error_by_group.png          # KEY: Demographic bias visualization
    ├── fairness_income_race_interaction.png # KEY: Intersectional analysis
    ├── clustering_typology_profiles.png     # KEY: Vulnerability typologies
    ├── clustering_pca_visualization.png
    ├── clustering_geographic_distribution.png
    └── clustering_elbow_silhouette.png

Research Summary

Problem

Current emergency management systems lack transparent, equitable, and predictive tools for hurricane vulnerability assessment, limiting proactive resource allocation to at-risk communities.

Approach

  • Data: 5,668 census tract-disaster observations (Harvey, Irma, Michael, Laura, Ida)
  • Models: Binary classification (KNN, Decision Tree, Random Forest, Bagging)
  • Features: 18 social vulnerability indicators (demographics, economics, housing)
  • Target: Whether a tract receives ANY disaster assistance (yes/no)

Key Results

  • Model Performance: 72.8% accuracy, AUC-ROC=0.644 (Bagging Classifier)
  • Fairness Concerns:
    • High-Black tracts: 90% higher prediction error
    • Low-income tracts: 55% higher prediction error
    • Intersectional vulnerability: 4-fold error disparity
  • Vulnerability Typologies: 3 clusters (62.5% moderate, 20.4% high, 17.1% low vulnerability)

Policy Implications

  • Mandate algorithmic fairness audits for disaster assistance systems
  • Proactive multilingual outreach to under-engaged communities
  • Tailored interventions based on vulnerability typologies
  • Community-based validation of algorithmic predictions

Key Documents

For Paper Review

  • Main Paper: paper/ILE_DRAFT_1_UPDATED.docx
  • Model Evolution: paper/MODEL_COMPARISON_SUMMARY.md (documents how data leakage was detected and corrected)

For Technical Details

  • Project Overview: paper/ILE_SUMMARY.md
  • Model Performance: results/binary_model_performance.csv
  • Fairness Analysis: results/fairness_metrics.csv

Essential Figures for Paper

  1. figures/fairness_error_by_group.png - Demographic bias
  2. figures/fairness_income_race_interaction.png - Intersectionality
  3. figures/clustering_typology_profiles.png - Vulnerability typologies

Dataset Description

File: data/tract_storm_features.csv

  • Observations: 5,668 census tract-disaster pairs
  • Hurricanes: Harvey (2017), Irma (2017), Michael (2018), Laura (2020), Ida (2021)
  • States: TX, LA, MS, AL, FL
  • Features: 18 social vulnerability indicators
  • Target: has_claims (binary: 1 if fema_claims_total > 0, else 0)
  • Class Distribution: 72.5% no claims, 27.5% has claims
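The target definition above can be sketched in a few lines of pandas. This is a minimal illustration, not the repository's actual preprocessing code: the toy values are made up, and the helper name `add_has_claims` is hypothetical. Only the column names `fema_claims_total` and `has_claims` come from the description above.

```python
import pandas as pd

# Toy rows standing in for data/tract_storm_features.csv (values are made up).
toy = pd.DataFrame({"fema_claims_total": [0, 3, 0, 12, 0, 0, 1, 0]})


def add_has_claims(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with the binary target: 1 if fema_claims_total > 0, else 0."""
    out = df.copy()
    out["has_claims"] = (out["fema_claims_total"] > 0).astype(int)
    return out


labeled = add_has_claims(toy)

# Share of each class, comparable to the "Class Distribution" line above.
class_share = labeled["has_claims"].value_counts(normalize=True).sort_index()
print(class_share)
```

Running the same two steps on the real CSV should reproduce the class distribution reported above, assuming the file exposes `fema_claims_total` under that name.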

Features

  • Economic: median_household_income, pct_poverty
  • Demographics: pct_elderly, pct_children, pct_black, pct_hispanic, pct_limited_english
  • Housing: pct_mobile_homes, pct_multi_unit, pct_crowded_housing, pct_no_vehicle
  • Context: population, housing_units, area_sq_mi, disaster number, state

Methods Overview

Binary Classification

  • Algorithms: KNN, Decision Tree, Random Forest, Bagging
  • Hyperparameter Tuning: GridSearchCV with 5-fold cross-validation
  • Train/Test Split: Stratified 80/20 (preserves the class proportions in both splits)
  • Best Model: Bagging Classifier (n_estimators=30, max_samples=1.0)
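The tuning setup described above can be sketched with scikit-learn as follows. This is an illustration under stated assumptions rather than the contents of scripts/train_binary_classification.py: the data is synthetic (generated with `make_classification` to roughly match the 72.5/27.5 class split), and the parameter grid, the `roc_auc` scoring choice, and the random seeds are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in: 400 rows, 18 features, roughly 72.5% / 27.5% classes.
X, y = make_classification(
    n_samples=400, n_features=18, n_informative=6,
    weights=[0.725, 0.275], random_state=0,
)

# Stratified 80/20 split keeps the class proportions similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Assumed grid; the README only reports the winning values (30 and 1.0).
param_grid = {"n_estimators": [10, 30], "max_samples": [0.5, 1.0]}
search = GridSearchCV(
    BaggingClassifier(random_state=0), param_grid, cv=5, scoring="roc_auc"
)
search.fit(X_train, y_train)

# Evaluate the refit best model once on the held-out 20%.
best = search.best_estimator_
test_acc = accuracy_score(y_test, best.predict(X_test))
test_auc = roc_auc_score(y_test, best.predict_proba(X_test)[:, 1])
print(search.best_params_, round(test_acc, 3), round(test_auc, 3))
```

With a class split this uneven, it can help to read test accuracy next to the majority-class share, since a model that always predicts "no claims" already scores close to that share.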

Fairness Diagnostics

  • Protected Attributes: Race (% Black), income, poverty, housing vulnerability
  • Metrics: MAE by group, mean prediction error, intersectional analysis
  • Threshold: MAE variation >20% indicates significant disparity
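One way to read the MAE-by-group check above is sketched below. The details are assumptions, not taken from scripts/fairness_diagnostics.py: the error is taken as the absolute gap between a predicted probability and the observed 0/1 label, groups are formed by a median split on the attribute, and the 20% threshold is applied to the relative MAE gap between the two groups. The `pred_prob` column and the `mae_disparity` helper are hypothetical names, and the toy values are made up.

```python
import pandas as pd

# Toy predictions (made up); pred_prob is a hypothetical column name.
toy = pd.DataFrame({
    "pct_black": [5, 10, 60, 70, 8, 65, 12, 80],
    "has_claims": [0, 1, 1, 0, 0, 1, 0, 1],
    "pred_prob": [0.1, 0.8, 0.4, 0.5, 0.2, 0.3, 0.1, 0.5],
})


def mae_disparity(df: pd.DataFrame, group_col: str, threshold: float = 0.20) -> dict:
    """Compare MAE above vs. at-or-below the median of group_col."""
    abs_err = (df["has_claims"] - df["pred_prob"]).abs()
    high = df[group_col] > df[group_col].median()  # assumed grouping rule
    mae_high = float(abs_err[high].mean())
    mae_low = float(abs_err[~high].mean())
    relative_gap = (mae_high - mae_low) / mae_low
    return {
        "mae_high": mae_high,
        "mae_low": mae_low,
        "relative_gap": relative_gap,
        "flagged": bool(abs(relative_gap) > threshold),
    }


report = mae_disparity(toy, "pct_black")
print(report)
```

The same helper could be called with `median_household_income` or `pct_poverty` as `group_col`. An intersectional check would need groups built from two attributes at once, which this sketch does not cover.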

Clustering Analysis

  • Method: K-Means with silhouette score optimization
  • Features: 17 vulnerability indicators
  • Optimal k: 3 clusters
  • Output: Vulnerability typologies with distinct socioeconomic profiles
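Choosing k by silhouette score, as described above, might look like the following. This is a sketch on synthetic data, not the code in scripts/clustering_vulnerability_typologies.py: the three well-separated blobs, the candidate range k = 2 to 6, the standardization step, and the seeds are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: three well-separated groups in four dimensions.
centers = np.array(
    [[0, 0, 0, 0], [10, 10, 10, 10], [-10, 10, -10, 10]], dtype=float
)
X, _ = make_blobs(n_samples=300, centers=centers, cluster_std=1.0, random_state=0)

# K-Means is distance based, so features are put on a common scale first.
X_scaled = StandardScaler().fit_transform(X)

# Fit K-Means for each candidate k and record the mean silhouette score.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_scaled)
    scores[k] = silhouette_score(X_scaled, labels)

best_k = max(scores, key=scores.get)

# Refit with the selected k and report the share of points in each cluster.
final = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit(X_scaled)
shares = np.bincount(final.labels_) / len(final.labels_)
print(best_k, shares.round(3))
```

The cluster shares printed at the end play the same role as the typology percentages reported under Key Results. On real tract data the silhouette curve is usually much flatter than on these blobs, so an elbow plot is a useful companion check.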

Reproducibility

Requirements

python 3.9+
pandas
numpy
scikit-learn
matplotlib
seaborn

Run Binary Classification

python scripts/train_binary_classification.py

Run Fairness Diagnostics

python scripts/fairness_diagnostics.py

Run Clustering Analysis

python scripts/clustering_vulnerability_typologies.py

MPH Competencies Addressed

Foundational

  1. Data Analysis: Binary classification, hyperparameter tuning, data leakage detection
  2. Policy Implications: Equity audits, multilingual outreach, community-based validation

Program-Specific

  1. Descriptive/Inferential Methods: Stratified sampling, cross-validation, fairness diagnostics
  2. Probability/Statistics: Zero-inflated distributions, ensemble methods, proper evaluation metrics

Citation

Guest, L. (2025). Explainable Machine Learning for Hurricane Vulnerability Assessment: Equity Implications in Disaster Assistance Prediction. Tulane University School of Public Health and Tropical Medicine, Integrated Learning Experience.


Contact

Liam Guest
Tulane University School of Public Health and Tropical Medicine
Email: lguest@tulane.edu


Acknowledgments

Data sources:

  • FEMA OpenFEMA: Individual Assistance Housing Registrants
  • CDC/ATSDR: Social Vulnerability Index 2022
  • U.S. Census Bureau: American Community Survey

This ILE project is part of the larger AURA (AI for Urban Resilience & Alerts) research initiative.

About

AURA as a diagnostic classification model only, built on SVI characteristics with no NOAA data.
