Exploratory data analysis projects using Python. Covers data cleaning, feature engineering, statistical analysis, and insight extraction from real-world datasets.
- Python
- Pandas
- NumPy
- Matplotlib
- Seaborn
- Scikit-learn (preprocessing and feature engineering)
- Jupyter Notebook
Data Cleaning
- Handling missing values
- Removing duplicates
- Fixing inconsistent data types
- Outlier detection and treatment
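The cleaning steps listed above can be sketched with pandas roughly as follows. This is a minimal illustration, not the code from any project notebook: the toy DataFrame, its column names, the median fill, and the 1.5 × IQR clipping rule are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names and values are illustrative only.
df = pd.DataFrame({
    "age": ["25", "31", None, "31", "29", "27"],
    "income": [42000.0, 58000.0, 51000.0, 58000.0, np.nan, 900000.0],
    "city": ["Pune", "Kochi", "Delhi", "Kochi", "Pune", "Delhi"],
})

# Removing duplicates: drop rows that repeat an earlier row exactly.
df = df.drop_duplicates().reset_index(drop=True)

# Fixing inconsistent data types: "age" arrived as strings, so convert it
# to numbers; anything unparseable becomes NaN.
df["age"] = pd.to_numeric(df["age"], errors="coerce")

# Handling missing values: fill numeric gaps with the column median
# (one common choice; dropping rows or model-based imputation are others).
df["age"] = df["age"].fillna(df["age"].median())
df["income"] = df["income"].fillna(df["income"].median())

# Outlier detection and treatment: flag values outside 1.5 * IQR and clip
# them to the fence instead of deleting the rows.
q1, q3 = df["income"].quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
df["income_is_outlier"] = ~df["income"].between(lower, upper)
df["income_clipped"] = df["income"].clip(lower, upper)

print(df)
```

Clipping keeps the row count stable, which is convenient for later joins; whether clipping, removal, or a transform is appropriate depends on the dataset.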
Exploratory Data Analysis (EDA)
- Univariate and bivariate analysis
- Distribution analysis
- Correlation analysis
- Statistical summaries
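As a rough sketch of the analyses listed above, the following uses synthetic data so that it runs without any dataset. The column names (`hours_studied`, `score`, `group`) and the relationship between them are made up for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic data with a known linear relationship plus noise (illustrative only).
rng = np.random.default_rng(0)
n = 200
hours = rng.uniform(0, 10, n)
df = pd.DataFrame({
    "hours_studied": hours,
    "score": 40 + 5 * hours + rng.normal(0, 5, n),
    "group": rng.choice(["A", "B"], n),
})

# Statistical summaries: count, mean, std, min, quartiles, max per numeric column.
summary = df.describe()

# Univariate / distribution analysis: skewness of a single variable.
score_skew = df["score"].skew()

# Bivariate analysis: compare a numeric variable across a categorical one.
group_means = df.groupby("group")["score"].mean()

# Correlation analysis: Pearson correlation between the numeric columns.
corr = df[["hours_studied", "score"]].corr()

print(summary)
print("skewness of score:", score_skew)
print(group_means)
print(corr)
```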
Feature Engineering
- Creating new features from existing data
- Encoding categorical variables
- Feature scaling and normalization
- Handling skewed distributions
- Binning and discretization
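The feature engineering steps above might look like this in practice. It is a sketch under assumptions: the columns, the bin edges, and the choice of `log1p`, `StandardScaler`, and one-hot encoding are illustrative, and other techniques (ordinal encoding, min-max scaling, Box-Cox) would fit the same list.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data; column names and values are illustrative only.
df = pd.DataFrame({
    "price": [100.0, 250.0, 400.0, 1200.0, 9000.0],
    "quantity": [1, 2, 4, 3, 1],
    "category": ["a", "b", "a", "c", "b"],
    "age": [18, 25, 40, 63, 80],
})

# Creating new features from existing data.
df["revenue"] = df["price"] * df["quantity"]

# Handling skewed distributions: log1p compresses the long right tail.
df["log_price"] = np.log1p(df["price"])

# Binning and discretization: intervals are (0, 30], (30, 60], (60, 120].
df["age_band"] = pd.cut(
    df["age"], bins=[0, 30, 60, 120], labels=["young", "middle", "senior"]
)

# Feature scaling: standardize to zero mean and unit variance.
scaled = StandardScaler().fit_transform(df[["price", "quantity"]])
df["price_scaled"] = scaled[:, 0]
df["quantity_scaled"] = scaled[:, 1]

# Encoding categorical variables: one-hot encode "category".
encoded = pd.get_dummies(df, columns=["category"], prefix="cat")

print(encoded)
```

In a modelling workflow the scaler would normally be fitted on the training split only and then applied to the test split, to avoid leaking information.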
Data Visualization
- Bar charts, histograms, box plots
- Heatmaps and correlation matrices
- Pair plots and scatter plots
- Time series plots
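A few of the plot types above can be sketched with plain Matplotlib as shown here. The data is synthetic and the column names are hypothetical; Seaborn's `sns.heatmap` and `sns.pairplot` are higher-level alternatives for the heatmap and pair plot cases.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line inside Jupyter
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic data for illustration only.
rng = np.random.default_rng(1)
n = 120
df = pd.DataFrame({
    "date": pd.date_range("2024-01-01", periods=n, freq="D"),
    "sales": np.cumsum(rng.normal(1.0, 3.0, n)) + 100,
    "visitors": rng.normal(500, 50, n),
    "group": rng.choice(["A", "B"], n),
})

fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# Histogram: distribution of a single variable.
axes[0, 0].hist(df["visitors"], bins=20)
axes[0, 0].set_title("Histogram of visitors")

# Box plot: one box per group.
groups = ["A", "B"]
axes[0, 1].boxplot([df.loc[df["group"] == g, "visitors"] for g in groups])
axes[0, 1].set_xticklabels(groups)
axes[0, 1].set_title("Visitors by group")

# Heatmap of the correlation matrix.
corr = df[["sales", "visitors"]].corr()
im = axes[1, 0].imshow(corr, vmin=-1, vmax=1, cmap="coolwarm")
axes[1, 0].set_xticks(range(len(corr.columns)))
axes[1, 0].set_xticklabels(corr.columns)
axes[1, 0].set_yticks(range(len(corr.columns)))
axes[1, 0].set_yticklabels(corr.columns)
axes[1, 0].set_title("Correlation matrix")
fig.colorbar(im, ax=axes[1, 0])

# Time series plot.
axes[1, 1].plot(df["date"], df["sales"])
axes[1, 1].tick_params(axis="x", rotation=45)
axes[1, 1].set_title("Sales over time")

fig.tight_layout()
```

In a notebook the figure renders inline; in a script, `fig.savefig(...)` with a path of your choice writes it to disk.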
data-analysis/
    project-1/
        dataset/
        notebook.ipynb
        README.md
    project-2/
        dataset/
        notebook.ipynb
        README.md
    ...
- Clone the repository
    git clone https://github.com/your-username/data-analysis.git
- Install dependencies
    pip install pandas numpy matplotlib seaborn scikit-learn jupyter
- Open any project notebook
    jupyter notebook

Arifudheen M
GitHub: Arif-1411