- 🔭 Currently building end-to-end analytics projects in GST fraud detection and railway operations data
- 🤝 Looking to collaborate on data analysis, business intelligence, SQL, or dashboard projects
- 🌱 Currently learning statistical modeling, data science, and business analytics
- 💬 Ask me about SQL window functions, anomaly detection, Power BI, Tableau, and Python for data analysis
- ⚡ Fun fact: I built a 3-layer GST fraud detection system that flags 15% of 50,000 invoices — using pure SQL and statistics, no ML
Languages
Data & Analytics Libraries
Databases
BI & Visualization
Tools
End-to-end analytics system detecting fraudulent GST invoice patterns across 50,000+ records
- 3-layer detection pipeline — Rule-based validation → Statistical anomaly detection → Weighted risk scoring
- SQL window functions for Z-score analysis, rolling average spike detection, and IQR outlier detection
- Flags 7,512 invoices (15%) as suspicious, identifies 35 HIGH-risk vendors out of 210
- Interactive Tableau dashboard — 🔗 View Live
- Stack: Python · PostgreSQL · SQL · Pandas · Tableau
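
A minimal Python sketch of the statistical and scoring layers described above, for illustration only. The project itself runs these checks with SQL window functions. The function names, the Z-score threshold of 3, the 1.5 × IQR fence, and the 50/30/20 weights are assumptions, not values taken from the project.

```python
from statistics import mean, pstdev, quantiles


def zscore_flags(amounts, threshold=3.0):
    """Flag amounts whose absolute Z-score exceeds the (assumed) threshold."""
    mu = mean(amounts)
    sigma = pstdev(amounts)
    if sigma == 0:
        # All amounts identical: nothing can be an outlier.
        return [False] * len(amounts)
    return [abs((x - mu) / sigma) > threshold for x in amounts]


def iqr_flags(amounts, k=1.5):
    """Flag amounts outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(amounts, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x < low or x > high for x in amounts]


def risk_scores(amounts, rule_flags, w_rule=50, w_z=30, w_iqr=20):
    """Combine rule-based and statistical flags into a 0-100 score.

    rule_flags holds one boolean per invoice from a rule-based check;
    the weights are hypothetical.
    """
    z_flags = zscore_flags(amounts)
    i_flags = iqr_flags(amounts)
    return [
        w_rule * r + w_z * z + w_iqr * i
        for r, z, i in zip(rule_flags, z_flags, i_flags)
    ]
```

On a small sample, a single extreme value can never reach a Z-score of 3 because it inflates the standard deviation itself, so the IQR check acts as a useful second opinion.
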
ML-powered personal finance analytics tool with category prediction, budget monitoring, and spending forecasting
- Naive Bayes + TF-IDF classifier predicts expense categories from transaction descriptions
- Month-over-month spending forecasting based on historical patterns to support budget planning
- Visual analytics dashboard — category breakdown (pie chart) and spending trend (line chart)
- Budget threshold monitoring with automated alerts when limits are exceeded
- Stack: Python · Scikit-learn · SQLite · Pandas
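
A rough sketch of how budget threshold alerts and a simple forecast could work, using only the standard library. The tuple layout, the function names, and the 3-month moving average are assumptions for illustration. The project's actual forecasting method is not specified beyond "historical patterns".

```python
from collections import defaultdict


def monthly_totals(transactions):
    """Sum amounts per (month, category).

    transactions: iterable of (month "YYYY-MM", category, amount) tuples.
    """
    totals = defaultdict(float)
    for month, category, amount in transactions:
        totals[(month, category)] += amount
    return dict(totals)


def budget_alerts(transactions, budgets):
    """Return (month, category, spent, limit) wherever spending exceeds the limit."""
    alerts = []
    for (month, category), spent in sorted(monthly_totals(transactions).items()):
        limit = budgets.get(category)
        if limit is not None and spent > limit:
            alerts.append((month, category, spent, limit))
    return alerts


def forecast_next_month(monthly_spend, window=3):
    """Forecast next month's spend as the mean of the last `window` months."""
    if not monthly_spend:
        raise ValueError("need at least one month of history")
    recent = monthly_spend[-window:]
    return sum(recent) / len(recent)
```

Categories without a configured budget are skipped rather than treated as a zero limit.
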
Automated data pipeline simulating real-time railway delays with risk classification
- APScheduler runs the pipeline every 5 minutes automatically
- Classifies delays into HIGH / MEDIUM / LOW risk tiers per train
- Dual storage — rolling CSV (last 10 records/train) + optional PostgreSQL
- Interactive Power BI dashboard with delay trends and risk distribution
- Stack: Python · Pandas · APScheduler · PostgreSQL · SQLAlchemy · Power BI
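
A small sketch of the risk tiers and the rolling per-train buffer, assuming hypothetical cut-offs of 30 and 10 minutes and a hypothetical `RollingDelayLog` class. Scheduling and the PostgreSQL write are left out. In the project, APScheduler would call something like `add` every 5 minutes.

```python
import csv
import io
from collections import defaultdict, deque

HIGH_DELAY_MIN = 30    # assumed cut-off, in minutes
MEDIUM_DELAY_MIN = 10  # assumed cut-off, in minutes


def classify_delay(delay_minutes):
    """Map a delay in minutes to a HIGH / MEDIUM / LOW risk tier."""
    if delay_minutes >= HIGH_DELAY_MIN:
        return "HIGH"
    if delay_minutes >= MEDIUM_DELAY_MIN:
        return "MEDIUM"
    return "LOW"


class RollingDelayLog:
    """Keep only the most recent `max_records` readings for each train."""

    def __init__(self, max_records=10):
        # deque(maxlen=...) silently drops the oldest entry when full.
        self._records = defaultdict(lambda: deque(maxlen=max_records))

    def add(self, train_id, delay_minutes):
        tier = classify_delay(delay_minutes)
        self._records[train_id].append((train_id, delay_minutes, tier))
        return tier

    def to_csv(self):
        """Serialize all retained records as CSV text, grouped by train."""
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(["train_id", "delay_minutes", "risk"])
        for train_id in sorted(self._records):
            writer.writerows(self._records[train_id])
        return buffer.getvalue()
```
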
End-to-end business analytics on 4 years of retail sales data — from raw CSV to decision-ready Power BI dashboard
- Analyzed $2.30M in sales (2014–2017) with ~52% YoY growth across regions and categories
- Used SQL queries to identify products (Tables & Bookcases) that lose money despite high sales volume
- Built KPI dashboard tracking Sales, Profit, Margin, and Quantity with region & category filters
- West region drives 31.58% of revenue; Technology leads in both sales and profitability
- Stack: Python · Pandas · SQLite · Power BI
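
For reference, the metrics above (YoY growth, revenue share, loss-making items) reduce to a few lines of arithmetic. This is a plain-Python sketch with hypothetical function names and input shapes. The project computes them with SQL and Pandas, and no project figures are reproduced here.

```python
def yoy_growth(sales_by_year):
    """Percent change in sales versus the previous year, keyed by year."""
    years = sorted(sales_by_year)
    return {
        year: (sales_by_year[year] - sales_by_year[prev]) / sales_by_year[prev] * 100
        for prev, year in zip(years, years[1:])
    }


def revenue_share(sales_by_group):
    """Each group's percentage of total sales (e.g. by region)."""
    total = sum(sales_by_group.values())
    return {group: sales / total * 100 for group, sales in sales_by_group.items()}


def loss_makers(rows):
    """Names with negative profit, highest sales first.

    rows: iterable of (name, sales, profit) tuples.
    """
    ranked = sorted(rows, key=lambda r: r[1], reverse=True)
    return [name for name, _sales, profit in ranked if profit < 0]
```

Sorting loss-makers by sales surfaces the items where high volume hides a negative margin.
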
⭐️ From Saksham3124