I build end-to-end data pipelines and analytical systems using Python and SQL, with a focus on data modeling and orchestration.
I particularly enjoy building analytics pipelines and tools around things I find worth tracking, mainly cricket and music.
End-to-end ELT data pipeline for ball-by-ball cricket match data using Python, PostgreSQL, dbt, and Airflow.
Includes ingestion, incremental loading, layered data modeling, and fully orchestrated transformations via Airflow (Astronomer Cosmos).
🔗 https://github.com/shsiddhant/cricket-warehouse
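
As a rough illustration of the incremental-loading idea mentioned above, here is a minimal high-watermark sketch in plain Python. It is not taken from the repository: the `match_date` field, the function names, and the use of ISO date strings are all assumptions made for the example.

```python
def select_new_rows(rows, watermark):
    """Keep rows whose 'match_date' is strictly later than the watermark.

    'match_date' is a hypothetical ISO-8601 date string (YYYY-MM-DD),
    so plain string comparison orders the dates correctly.
    """
    if watermark is None:  # first run: nothing loaded yet, take everything
        return list(rows)
    return [row for row in rows if row["match_date"] > watermark]


def next_watermark(rows, watermark):
    """Return the latest 'match_date' seen so far, or the old watermark."""
    dates = [row["match_date"] for row in rows]
    if watermark is not None:
        dates.append(watermark)
    return max(dates) if dates else None


# Hypothetical source rows; only matches after the watermark are loaded.
source = [
    {"match_id": 1, "match_date": "2024-03-01"},
    {"match_id": 2, "match_date": "2024-03-05"},
    {"match_id": 3, "match_date": "2024-03-09"},
]
new_rows = select_new_rows(source, watermark="2024-03-01")
new_mark = next_watermark(new_rows, watermark="2024-03-01")
print([row["match_id"] for row in new_rows], new_mark)
```

Storing only the latest loaded date keeps each run cheap, at the cost of missing late-arriving rows that carry an older date.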
Python library, CLI tool, and dashboard for exploring music listening history from Last.fm and Spotify.
Focuses on temporal patterns such as attachment, repetition, and listening streaks.
🔗 https://github.com/shsiddhant/memory.fm
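
To give a sense of what a listening streak means here, the sketch below counts the longest run of consecutive calendar days with at least one play. It is an assumption-laden illustration, not the library's API: `longest_streak` and the sample dates are made up for the example.

```python
from datetime import date, timedelta


def longest_streak(days):
    """Return the length of the longest run of consecutive calendar days."""
    best = 0
    current = 0
    previous = None
    # Deduplicate so several plays on one day count once, then walk in order.
    for day in sorted(set(days)):
        if previous is not None and day - previous == timedelta(days=1):
            current += 1  # the run continues
        else:
            current = 1   # gap (or first day): start a new run
        best = max(best, current)
        previous = day
    return best


# Hypothetical play dates: Jan 1-3 form a run, Jan 5 stands alone.
plays = [
    date(2024, 1, 1),
    date(2024, 1, 2),
    date(2024, 1, 2),
    date(2024, 1, 3),
    date(2024, 1, 5),
]
print(longest_streak(plays))  # 3
```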
Machine learning project predicting match outcomes using features engineered from historical match data.
🔗 https://github.com/shsiddhant/womens-wc
Lightweight offline journaling application with password protection and Markdown support.
🔗 https://github.com/shsiddhant/memory.journal
Python • SQL • PostgreSQL • dbt • Airflow • Docker • Pandas • NumPy • Git
- Data engineering
- Data warehouses and analytics pipelines
- Sports analytics
- Personal data exploration tools
- Python-based CLI tools
- Improving pipeline design and orchestration patterns
- Performance optimization for data processing workflows
- T20 cricket analytics