This project contains the end-to-end data pipeline for the Analytics Engineer task, which measures driver engagement. It covers everything from the initial Snowflake setup to the final reporting models.
- Appendix AE.pdf: Start here for a detailed technical overview of the logic, data quality decisions, and architectural choices.
- setup.sql: The entry point for the project. Run this first to set up the Snowflake environment (databases, schemas, and stages).
- models/: Contains the core transformation logic divided into three layers:
  - /staging: Initial cleaning and data quality flagging (casting types, handling "ghost drivers").
  - /marts: Final dimensional models (Facts and Dimensions) ready for BI tools.
  - /analyses: Ad-hoc SQL queries and the Exploratory Data Analysis (EDA) notebook used to uncover trends.
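
For orientation, the kind of environment that setup.sql prepares (databases, schemas, and stages) can be sketched roughly as follows. This is only an illustration: the database, schema, and stage names are hypothetical, and the CSV file format is an assumption. The actual definitions are in setup.sql.

```sql
-- Illustrative sketch only. All object names below are hypothetical;
-- the real ones are defined in setup.sql.
CREATE DATABASE IF NOT EXISTS driver_analytics;

-- One schema per layer (assumed layout, mirroring the models/ folders).
CREATE SCHEMA IF NOT EXISTS driver_analytics.raw;
CREATE SCHEMA IF NOT EXISTS driver_analytics.staging;
CREATE SCHEMA IF NOT EXISTS driver_analytics.marts;

-- Internal stage for the source files; CSV with a header row is an assumption.
CREATE STAGE IF NOT EXISTS driver_analytics.raw.landing_stage
  FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);
```

Using IF NOT EXISTS keeps a script like this safe to re-run, which is convenient for an entry-point script.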
- Run the setup.sql script in your Snowflake console.
- Follow the loading order described in the Appendix: Drivers -> Bookings -> Offers.
- The final aggregated metrics can be found in the agg_driver_activity.sql model within the marts folder.
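
As a rough illustration of the loading order above, the three loads might look like the sketch below. The table names, stage name, file names, and CSV format are all hypothetical, and the target tables are assumed to exist already. The Appendix remains the authoritative description of the load.

```sql
-- Illustrative sketch only. Stage, table, and file names are hypothetical,
-- and the target tables are assumed to have been created beforehand.
USE SCHEMA driver_analytics.raw;

-- 1. Drivers
COPY INTO drivers
  FROM @landing_stage/drivers.csv
  FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);

-- 2. Bookings
COPY INTO bookings
  FROM @landing_stage/bookings.csv
  FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);

-- 3. Offers
COPY INTO offers
  FROM @landing_stage/offers.csv
  FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);
```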