Predict whether a woman may have PCOS based on the levels of certain vitamins, hormone levels, and physical activity.
- Analyse the PCOS dataset [https://www.kaggle.com/datasets/prasoonkottarathil/polycystic-ovary-syndrome-pcos] ✔️
- Perform a binary classification task using the following ML algorithms: Logistic Regression, SVM, Decision Trees, Random Forest, Naive Bayes and K-Nearest Neighbours ✔️
- Build a neural network model to perform the classification task. ✔️
- Deploy the trained neural network model through a web interface. ⏳
- Most PCOS women have a Cycle (R/I) value of around 4 days, while most non-PCOS women have a value of around 2 days.
- Cycle lengths among PCOS women were recorded at around 2-8 days.
- PCOS women range in age from 20 to 45 years, with the majority of cases between 25 and 30 years.
- Follicle lengths are centered around 5 units for non-PCOS women and around 10 units for PCOS women.
- PCOS women have an FSH/LH ratio between -5 and 35, while non-PCOS women have a ratio between 0 and 59.
- PCOS women have Vit D3 (ng/mL) levels between -100 and 90, whereas levels for non-PCOS women are centered around 10 to 70.
- PCOS women have PRG (ng/mL) levels between 0 and 1, while non-PCOS women have PRG levels between -0.5 and 1.5.
- PCOS women also experience weight gain, hair growth, skin darkening, hair loss, pimples, and cravings for fast food after developing PCOS.
Of the six algorithms used, Naive Bayes performed best, with the highest accuracy (83.7%) and the highest precision (94.6%) of all the models.

| Model | Accuracy | Precision | Recall |
|---|---|---|---|
| Logistic Regression | 0.7703 | 0.5178 | 0.8787 |
| Support Vector Machines | 0.6666 | 0.3928 | 0.6666 |
| Decision Trees | 0.7777 | 0.6250 | 0.7954 |
| Random Forest | 0.8222 | 0.6071 | 0.9444 |
| Naive Bayes | 0.8370 | 0.9464 | 0.7361 |
| K-Nearest Neighbours | 0.6000 | 0.1964 | 0.5500 |

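
For reference, the accuracy, precision and recall columns can each be computed from confusion-matrix counts. The sketch below is a minimal illustration only: the function name is made up for this example, and the counts are hypothetical, chosen because they happen to reproduce the Naive Bayes row. They are not taken from the project's actual confusion matrix.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Return accuracy, precision and recall for a binary classifier."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # share of all predictions that are correct
        "precision": tp / (tp + fp),     # share of predicted positives that are real
        "recall": tp / (tp + fn),        # share of real positives that are found
    }


# Hypothetical counts (assumption): picked only to match the Naive Bayes row.
metrics = classification_metrics(tp=53, fp=3, fn=19, tn=60)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

High precision with lower recall, as in the Naive Bayes row, means few false alarms but more missed positive cases.
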
After feature selection with a wrapper method (backward elimination), the 18 best features were selected for training the Naive Bayes model, which led to the following results:

| Metric | Score |
|---|---|
| Accuracy | 0.851852 |
| Precision | 0.892857 |
| Recall | 0.78125 |

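
Backward elimination starts from the full feature set and repeatedly drops the feature whose removal hurts the model score least, until the target number of features remains. The following is a minimal, library-free sketch of that loop, not the project's actual code: `backward_elimination`, `toy_score`, the feature names and the usefulness weights are all hypothetical. In practice the scoring function would typically be something like the cross-validated accuracy of the Naive Bayes model on the candidate subset.

```python
from typing import Callable, List, Sequence


def backward_elimination(
    features: Sequence[str],
    score_fn: Callable[[List[str]], float],
    n_keep: int,
) -> List[str]:
    """Greedily drop one feature at a time until n_keep features remain."""
    selected = list(features)
    while len(selected) > n_keep:
        # Build every subset obtained by removing exactly one feature.
        candidates = [[f for f in selected if f != drop] for drop in selected]
        # Keep the subset that scores best, i.e. drop the least useful feature.
        selected = max(candidates, key=score_fn)
    return selected


# Toy scoring function (assumption): each feature has a fixed usefulness weight.
weights = {"feature_a": 0.5, "feature_b": 0.1, "feature_c": 0.3, "feature_d": 0.0}


def toy_score(subset: List[str]) -> float:
    return sum(weights[f] for f in subset)


kept = backward_elimination(list(weights), toy_score, n_keep=2)
print(kept)
```

Because the search is greedy, it evaluates far fewer subsets than an exhaustive search, at the cost of possibly missing the best combination.
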
A neural network model was built with PyTorch to classify samples into PCOS and non-PCOS categories.
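
As a rough illustration of what a PyTorch binary classifier of this kind can look like, here is a minimal sketch. Everything in it is an assumption made for the example: the class name `PCOSClassifier`, the layer sizes, the 18-feature input, the optimizer settings and the synthetic training data. It does not reproduce the project's actual architecture.

```python
import torch
from torch import nn


class PCOSClassifier(nn.Module):
    """Small feed-forward network that outputs one logit per sample."""

    def __init__(self, n_features: int, hidden: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden // 2),
            nn.ReLU(),
            nn.Linear(hidden // 2, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Drop the trailing dimension so the output shape is (batch,).
        return self.net(x).squeeze(-1)


torch.manual_seed(0)
# Synthetic stand-in data (assumption): 64 samples with 18 features each.
X = torch.randn(64, 18)
y = (X[:, 0] + X[:, 1] > 0).float()

model = PCOSClassifier(n_features=18)
loss_fn = nn.BCEWithLogitsLoss()  # applies the sigmoid internally
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

for _ in range(50):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()

with torch.no_grad():
    # Threshold the predicted probability at 0.5 to get class labels 0 or 1.
    preds = (torch.sigmoid(model(X)) > 0.5).long()
```

Returning raw logits and pairing them with `BCEWithLogitsLoss` is numerically more stable than applying a sigmoid inside the model and using `BCELoss`.
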
The following is the architecture of the Neural Model:

The test results obtained after training are as follows:
| Class (Yes: 1 / No: 0) | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| 0 | 0.94 | 0.96 | 0.95 | 106 |
| 1 | 0.92 | 0.88 | 0.90 | 56 |
| Accuracy | - | - | 0.93 | 162 |
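
The columns of this report are related: the F1-score is the harmonic mean of precision and recall, and overall accuracy follows from the per-class recall weighted by support. The short check below recomputes both from the rounded values in the table. It is a sanity-check sketch with made-up variable names, so small rounding differences from the original report are possible.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Rounded values copied from the report: (precision, recall, support).
report = {0: (0.94, 0.96, 106), 1: (0.92, 0.88, 56)}

f1_by_class = {label: f1_score(p, r) for label, (p, r, _) in report.items()}

# Recall times support estimates the correctly classified samples per class.
correct = sum(r * support for _, r, support in report.values())
total = sum(support for _, _, support in report.values())
accuracy = correct / total

for label, f1 in f1_by_class.items():
    print(f"class {label}: F1 = {f1:.2f}")
print(f"accuracy = {accuracy:.2f}")
```
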
(work in progress)