This project is the capstone for MIT's AI Studio course. It applies linear and random forest regression to a real-world dataset to analyze key relationships in the data and generate actionable insights.
- Build predictive models to estimate target variables using linear and random forest regression.
- Compare model performance to identify the best-fitting approach.
- Analyze feature importance to understand drivers of predictions.
- Develop clear visualizations and documentation for interpretability.
Data Preprocessing
- Cleaned and prepared the dataset for modeling.
- Handled missing values and applied feature scaling where appropriate.
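The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the toy array, the choice of median imputation, and the use of scikit-learn's `SimpleImputer` and `StandardScaler` are all assumptions made for the example.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Toy feature matrix with missing entries (illustrative only, not the project's data).
X = np.array([
    [1.0, 200.0],
    [2.0, np.nan],
    [np.nan, 240.0],
    [4.0, 260.0],
])

# Assumed strategy: fill each missing value with its column median,
# then standardize every column to zero mean and unit variance.
preprocess = make_pipeline(SimpleImputer(strategy="median"), StandardScaler())
X_clean = preprocess.fit_transform(X)

print(X_clean)
```

In a real workflow the pipeline would be fitted on the training split only and then applied to the test split, so that statistics from held-out data do not leak into training.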
Modeling
- Implemented Linear Regression as a baseline model.
- Built and tuned a Random Forest Regression model for improved accuracy.
- Evaluated models using appropriate regression metrics.
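As a rough sketch of the modeling steps above, the snippet below fits a linear baseline and tunes a random forest with a small grid search. The synthetic data, the parameter grid, and the 3-fold cross-validation setting are hypothetical choices for illustration; the notebook's actual tuning procedure may differ.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data with a nonlinear term (a stand-in for the real dataset).
rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(300, 3))
y = 3.0 * X[:, 0] + np.sin(3.0 * X[:, 1]) + rng.normal(scale=0.1, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Baseline: ordinary least squares linear regression.
baseline = LinearRegression().fit(X_train, y_train)

# Random forest tuned over a small, hypothetical grid using 3-fold cross-validation.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=3,
    scoring="neg_root_mean_squared_error",
)
search.fit(X_train, y_train)
forest = search.best_estimator_

print("best params:", search.best_params_)
print("baseline R^2:", baseline.score(X_test, y_test))
print("forest R^2:", forest.score(X_test, y_test))
```

`GridSearchCV` refits the best configuration on the full training split by default, so `best_estimator_` can be used directly for prediction.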
Evaluation
- Compared model performance using metrics such as RMSE, MAE, and R².
- Analyzed feature importance from the Random Forest model.
- Both models captured meaningful patterns in the data.
- Random Forest Regression achieved a better fit and stronger predictive power than Linear Regression.
- Feature importance analysis highlighted the most influential variables driving predictions.
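The comparison described above can be sketched as follows: compute RMSE, MAE, and R² for both models on a held-out split, then rank the forest's impurity-based feature importances. The synthetic data, the feature names `f0` to `f2`, and the helper `regression_report` are hypothetical and exist only for this example; the numbers it prints say nothing about the project's actual results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic data: f0 enters nonlinearly, f1 linearly, f2 is pure noise.
rng = np.random.default_rng(42)
feature_names = ["f0", "f1", "f2"]  # hypothetical names
X = rng.uniform(-2, 2, size=(400, 3))
y = X[:, 0] ** 2 + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=400)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)


def regression_report(model, X_eval, y_eval):
    """Return RMSE, MAE, and R^2 for a fitted model on evaluation data."""
    pred = model.predict(X_eval)
    return {
        "RMSE": float(np.sqrt(mean_squared_error(y_eval, pred))),
        "MAE": float(mean_absolute_error(y_eval, pred)),
        "R2": float(r2_score(y_eval, pred)),
    }


linear = LinearRegression().fit(X_train, y_train)
forest = RandomForestRegressor(n_estimators=100, random_state=42).fit(X_train, y_train)

reports = {
    "linear": regression_report(linear, X_test, y_test),
    "forest": regression_report(forest, X_test, y_test),
}
for name, report in reports.items():
    print(name, report)

# Impurity-based importances from the forest, sorted from most to least important.
ranked = sorted(
    zip(feature_names, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```

Lower RMSE and MAE and higher R² indicate a better fit. Impurity-based importances are normalized to sum to 1 and reflect how much each feature reduced error inside the trees, not a causal effect.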
Clone the repository:
git clone https://github.com/taliakusmirek/MIT-finalproject.git
cd MIT-finalproject
Install dependencies:
pip install -r requirements.txt
Launch the Jupyter Notebook:
jupyter notebook
Open and run the notebook file to reproduce the analysis and results.
A sample dataset is included in the data/ directory to enable testing and replication of results.
- Explore additional modeling techniques such as gradient boosting or neural networks.
- Implement cross-validation and hyperparameter tuning for more robust results.
- Develop an interactive dashboard to visualize predictions dynamically.
- Data preprocessing and feature engineering.
- Implementation and evaluation of regression models.
- Documentation and visualization of results.
- Repository organization and README creation.
For questions or collaboration, reach out via:
- Email: kusmirek@mit.edu
This project is licensed under the MIT License.