This repository contains three Jupyter notebooks covering core machine learning tasks: classification, clustering, and regression. Each notebook walks through data loading, preprocessing, modeling, and evaluation.

| Notebook | Task | What it does (high level) |
|---|---|---|
| `Classifier.ipynb` | Classification | Predicts whether a client will subscribe to a deposit product using a bank marketing dataset. Uses preprocessing + SMOTE for class imbalance and compares Decision Tree vs Random Forest (metrics + ROC/AUC). |
| `Finacial_Cluster.ipynb` | Clustering | Customer segmentation using financial features. Handles outliers, performs feature engineering, applies PCA, and compares KMeans vs DBSCAN using Silhouette and Davies–Bouldin. Saves trained models/scalers with joblib. |
| `Use Cars Regression.ipynb` | Regression | Predicts used car prices using a full ML pipeline (EDA → cleaning → feature engineering → encoding/scaling → train/test split → model training). Trains Linear Regression, ElasticNet, and Decision Tree Regressor with regression metrics (MAE/MSE/RMSE/R²). |

The notebooks expect these files:

- `bank.csv` (used by `Classifier.ipynb`)
- `Customer_Financial_Info.csv` (used by `Finacial_Cluster.ipynb`)
- `cars.csv` (used by `Use Cars Regression.ipynb`)

Important: Datasets may not be included in this repository (especially large ones).
Place the CSV files in the project root (same folder as the notebooks), or update the paths inside the notebooks.
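
Before opening a notebook, it can help to confirm that the expected CSV files are actually in place. The following is a minimal sketch using only the standard library; the `EXPECTED` mapping and the `missing_datasets` helper are hypothetical names introduced here for illustration and are not part of the notebooks.

```python
from pathlib import Path

# File names and notebooks as listed above; the mapping itself is illustrative.
EXPECTED = {
    "bank.csv": "Classifier.ipynb",
    "Customer_Financial_Info.csv": "Finacial_Cluster.ipynb",
    "cars.csv": "Use Cars Regression.ipynb",
}


def missing_datasets(root="."):
    """Return {csv_name: notebook} for every expected CSV not found in root."""
    root = Path(root)
    return {
        name: notebook
        for name, notebook in EXPECTED.items()
        if not (root / name).is_file()
    }


if __name__ == "__main__":
    # Checks the current directory, assumed to be the project root.
    for name, notebook in missing_datasets().items():
        print(f"Missing {name} (needed by {notebook})")
```

If you keep datasets in a subfolder, pass that folder instead, for example `missing_datasets("data")`.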
`Use Cars Regression.ipynb` was originally run on Google Colab and reads the dataset from `/content/drive/MyDrive/cars.csv`.
If you're running locally, change the path to something like `./cars.csv` (or `./data/cars.csv`).

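
One way to avoid editing the path each time you switch between Colab and a local machine is to try several locations in order. This is a sketch under stated assumptions: `resolve_dataset` and `CANDIDATES` are hypothetical names, and the notebook itself does not contain this logic.

```python
from pathlib import Path

# Checked in order: the Colab path the notebook reads, then the two
# local locations suggested above.
CANDIDATES = [
    Path("/content/drive/MyDrive/cars.csv"),
    Path("./cars.csv"),
    Path("./data/cars.csv"),
]


def resolve_dataset(candidates=CANDIDATES):
    """Return the first candidate path that exists as a file."""
    for path in candidates:
        if path.is_file():
            return path
    tried = ", ".join(str(p) for p in candidates)
    raise FileNotFoundError(f"cars.csv not found; tried: {tried}")
```

In the notebook, the load cell could then read `pd.read_csv(resolve_dataset())` in place of the hard-coded Colab path.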
python -m venv .venv
# Windows:
.venv\Scripts\activate
# macOS/Linux:
source .venv/bin/activate
pip install -U pip
pip install numpy pandas matplotlib seaborn scikit-learn imbalanced-learn joblib jupyter
jupyter lab
# or
jupyter notebook

Open any notebook and run the cells top-to-bottom.

- Classification: prints Accuracy / Precision / Recall / F1, Confusion Matrix, and plots ROC curves with AUC.
- Clustering: evaluates cluster quality using Silhouette Score and Davies–Bouldin Index; exports models and scalers as `.pkl`.
- Regression: evaluates models using MAE, MSE, RMSE, and R².

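
For readers who want to see what the classification and regression metrics listed above measure, here is a small pure-Python sketch of their standard definitions. The notebooks presumably rely on scikit-learn for these numbers; the function names below are hypothetical and exist only for illustration.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = 2 * tp + fp + fn
    return {
        "Accuracy": (tp + tn) / total if total else 0.0,
        "Precision": precision,
        "Recall": recall,
        "F1": 2 * tp / denom if denom else 0.0,
    }


def regression_metrics(y_true, y_pred):
    """MAE, MSE, RMSE, and R² for two equal-length, non-empty sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(e * e for e in errors)
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        # R² is undefined when y_true is constant; NaN is this sketch's choice.
        "R2": 1 - ss_res / ss_tot if ss_tot else float("nan"),
    }
```

RMSE is expressed in the same units as the target (here, price), which usually makes it easier to interpret than MSE.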
DSC-309_Machine-Learning-master/
├── Classifier.ipynb
├── Finacial_Cluster.ipynb
└── Use Cars Regression.ipynb
(You can optionally add a data/ folder and move datasets there.)
This repository is intended for educational/coursework purposes.
If you are submitting this for a class, follow your course’s academic integrity rules.