AMRYB/DSC-309_Machine-Learning

DSC-309 Machine Learning — Coursework Notebooks

This repository contains three Jupyter notebooks covering core machine learning tasks: classification, clustering, and regression. Each notebook walks through data loading, preprocessing, modeling, and evaluation.


Project contents

Notebook | Task | What it does (high level)
Classifier.ipynb | Classification | Predicts whether a client will subscribe to a deposit product using a bank marketing dataset. Uses preprocessing + SMOTE for class imbalance and compares Decision Tree vs Random Forest (metrics + ROC/AUC).
Finacial_Cluster.ipynb | Clustering | Customer segmentation using financial features. Handles outliers, performs feature engineering, applies PCA, and compares KMeans vs DBSCAN using Silhouette and Davies–Bouldin. Saves trained models/scalers with joblib.
Use Cars Regression.ipynb | Regression | Predicts used car prices using a full ML pipeline (EDA → cleaning → feature engineering → encoding/scaling → train/test split → model training). Trains Linear Regression, ElasticNet, and Decision Tree Regressor with regression metrics (MAE/MSE/RMSE/R²).

Datasets

The notebooks expect these files:

  • bank.csv (used by Classifier.ipynb)
  • Customer_Financial_Info.csv (used by Finacial_Cluster.ipynb)
  • cars.csv (used by Use Cars Regression.ipynb)

Important: Datasets may not be included in this repository (especially large ones).
Place the CSV files in the project root (same folder as the notebooks), or update the paths inside the notebooks.

Use Cars Regression.ipynb was originally run on Google Colab and reads: /content/drive/MyDrive/cars.csv
If you're running locally, change it to something like: ./cars.csv (or ./data/cars.csv).
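
One way to avoid editing the path by hand is to try the known locations in order. The following is a minimal sketch, not code from the notebook: `find_dataset` and `CANDIDATES` are hypothetical names, and the candidate list simply combines the Colab path with the two local paths suggested above.

```python
from pathlib import Path

# Candidate locations, checked in order. The Colab path is the one the notebook
# reads; the two local paths are the alternatives suggested in this README.
CANDIDATES = [
    Path("/content/drive/MyDrive/cars.csv"),
    Path("./cars.csv"),
    Path("./data/cars.csv"),
]


def find_dataset(candidates):
    """Return the first path in `candidates` that is an existing file.

    Raises FileNotFoundError listing every location that was tried.
    (Hypothetical helper, not part of the notebooks.)
    """
    for path in candidates:
        path = Path(path)
        if path.is_file():
            return path
    tried = ", ".join(str(p) for p in candidates)
    raise FileNotFoundError(f"Dataset not found; tried: {tried}")
```

In the notebook, the hard-coded path could then be replaced with something like `pd.read_csv(find_dataset(CANDIDATES))`, assuming pandas is imported as `pd`.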


How to run

1) Create a virtual environment (recommended)

python -m venv .venv
# Windows:
.venv\Scripts\activate
# macOS/Linux:
source .venv/bin/activate

2) Install dependencies

pip install -U pip
pip install numpy pandas matplotlib seaborn scikit-learn imbalanced-learn joblib jupyter

3) Launch Jupyter

jupyter lab
# or
jupyter notebook

Open any notebook and run the cells top-to-bottom.
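
Before running the cells, it can help to confirm that the packages from step 2 are importable in the active environment. The sketch below uses only the standard library; `REQUIRED_MODULES` and `missing_modules` are hypothetical names introduced for illustration. Two import names differ from their pip names: scikit-learn is imported as `sklearn` and imbalanced-learn as `imblearn`.

```python
from importlib.util import find_spec

# Import names for the packages installed in step 2.
# pip name -> import name where they differ:
#   scikit-learn -> sklearn, imbalanced-learn -> imblearn
REQUIRED_MODULES = [
    "numpy", "pandas", "matplotlib", "seaborn", "sklearn", "imblearn", "joblib",
]


def missing_modules(modules):
    """Return the names in `modules` that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All dependencies found.")
```

If anything is reported missing, re-run the `pip install` command from step 2 with the virtual environment activated.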


Notes on evaluation

  • Classification: prints Accuracy, Precision, Recall, F1, and the confusion matrix, and plots ROC curves with AUC.
  • Clustering: evaluates cluster quality using Silhouette Score and Davies–Bouldin Index; exports models and scalers as .pkl.
  • Regression: evaluates models using MAE, MSE, RMSE, and R².
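
For reference, the regression metrics and the threshold-based classification metrics listed above can be written out from their standard definitions. This is a dependency-free sketch for illustration only; the notebooks most likely compute these with scikit-learn rather than with code like this, and the function names here are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE, and R² for two equal-length numeric sequences.

    R² is undefined (division by zero) when all values in y_true are identical.
    """
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mse = ss_res / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse), "R2": r2}


def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts.

    Assumes tp + fp > 0 and tp + fn > 0, so precision and recall are defined.
    """
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

For example, `regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0])` gives an MAE of 0.5, an MSE of 0.5, and an R² of 0.9. Because RMSE is the square root of MSE, it is expressed in the same units as the target (here, car price), which makes it easier to interpret than MSE.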

Repository structure

DSC-309_Machine-Learning-master/
├── Classifier.ipynb
├── Finacial_Cluster.ipynb
└── Use Cars Regression.ipynb

(You can optionally add a data/ folder and move datasets there.)


Academic note

This repository is intended for educational/coursework purposes.
If you are submitting this for a class, follow your course’s academic integrity rules.
