This project focuses on the analysis of time-series sensor data for temperature and humidity, applying statistical methods, visualization techniques, and predictive modeling to extract meaningful insights. The dataset consists of sensor readings over time, allowing for trend exploration, anomaly detection, and machine learning applications.
- Data Cleaning & Preprocessing: Handling missing values, type conversions, and preparing the dataset for analysis.
- Exploratory Data Analysis (EDA): Generating insights through visualizations, including trend plots, histograms, and boxplots.
- Anomaly Detection: Identifying outliers using the Z-score and IQR methods.
- Rolling Statistics: Applying rolling mean and standard deviation for time-series smoothing.
- Predictive Modeling: Implementing Linear Regression to predict humidity based on temperature.
- Future Predictions: Estimating humidity values for unseen temperature data points.
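As an illustration of the anomaly-detection step, here is a minimal sketch of the Z-score and IQR rules on a pandas Series. The function names, the thresholds (3.0 and 1.5 are common defaults), and the sample readings are assumptions for the example, not values taken from the notebook.

```python
import numpy as np
import pandas as pd


def zscore_outliers(values: pd.Series, threshold: float = 3.0) -> pd.Series:
    """Flag points whose absolute Z-score exceeds `threshold`."""
    std = values.std(ddof=0)
    if std == 0 or np.isnan(std):
        # A constant (or empty) series has no outliers by this definition.
        return pd.Series(False, index=values.index)
    z = (values - values.mean()) / std
    return z.abs() > threshold


def iqr_outliers(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Flag points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    iqr = q3 - q1
    return (values < q1 - k * iqr) | (values > q3 + k * iqr)


# Made-up temperature readings with one obvious spike (hypothetical data).
temperature = pd.Series([21.0, 21.4, 20.9, 21.2, 21.1, 35.0, 21.3, 20.8, 21.0, 21.2])

# With only 10 points, a population Z-score cannot exceed 3,
# so the demo uses a lower threshold.
z_flags = zscore_outliers(temperature, threshold=2.5)
iqr_flags = iqr_outliers(temperature)

print(temperature[z_flags | iqr_flags])
```

The two rules often disagree on real data: the Z-score is sensitive to the outliers it is trying to find (they inflate the standard deviation), while the IQR rule is based on quartiles and is more robust on small or skewed samples.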
The dataset contains time-series sensor readings recorded in an experimental setup. The primary columns include:
- Time: Timestamp of the recording.
- Temperature: Sensor-recorded temperature values.
- Humidity: Corresponding humidity measurements.
- Experiment ID: Identifier for experimental runs.
The data is stored in Datasets/Data_Experiment_1/pi2.xlsx.
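A cleaning pass over these columns might look like the sketch below. It assumes the spreadsheet headers match the column names listed above exactly, and that rows with a missing timestamp or reading are dropped rather than imputed; the notebook may handle either differently. The `clean_readings` helper and the sample rows are hypothetical.

```python
import pandas as pd


def clean_readings(raw: pd.DataFrame) -> pd.DataFrame:
    """Coerce column types, drop unusable rows, and sort by time."""
    df = raw.copy()
    # Unparseable entries become NaT/NaN instead of raising an error.
    df["Time"] = pd.to_datetime(df["Time"], errors="coerce")
    for col in ("Temperature", "Humidity"):
        df[col] = pd.to_numeric(df[col], errors="coerce")
    # Assumption: rows missing a timestamp or a reading are dropped, not imputed.
    df = df.dropna(subset=["Time", "Temperature", "Humidity"])
    return df.sort_values("Time").reset_index(drop=True)


# Small made-up frame standing in for the real spreadsheet.
raw = pd.DataFrame({
    "Time": ["2024-01-01 00:10", "2024-01-01 00:00", "not a time", "2024-01-01 00:20"],
    "Temperature": ["21.5", "21.0", "22.0", None],
    "Humidity": [40.2, 41.0, 39.5, 38.9],
    "Experiment ID": [1, 1, 1, 1],
})

clean = clean_readings(raw)
print(clean)
```

For the real file, the same function could be applied to the result of `pd.read_excel("Datasets/Data_Experiment_1/pi2.xlsx")`, which needs an Excel engine such as `openpyxl` installed.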
- Data Cleaning: Convert types, handle missing values, and prepare for analysis.
- EDA & Visualization: Understand trends and distributions through visualizations.
- Anomaly Detection: Identify and visualize outliers in temperature and humidity.
- Rolling Statistics: Compute rolling means and standard deviations for time-series insights.
- Machine Learning Model: Train a Linear Regression model to predict humidity from temperature.
- Future Predictions: Extend the model to forecast humidity for new temperature values.
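The rolling-statistics, modeling, and prediction steps can be sketched roughly as follows. The data here is synthetic, the 10-sample window is an arbitrary choice, and the model is fit on all rows without a train/test split to keep the example short; none of these choices is taken from the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-in data: humidity falls linearly with temperature, plus noise.
rng = np.random.default_rng(0)
times = pd.date_range("2024-01-01", periods=60, freq="min")
temperature = 20 + 0.05 * np.arange(60) + rng.normal(0, 0.1, 60)
humidity = 80 - 1.5 * temperature + rng.normal(0, 0.2, 60)
df = pd.DataFrame({"Temperature": temperature, "Humidity": humidity}, index=times)

# Rolling statistics over a 10-sample window (the window size is an assumption).
window = 10
df["Temp_RollMean"] = df["Temperature"].rolling(window).mean()
df["Temp_RollStd"] = df["Temperature"].rolling(window).std()

# Fit humidity ~ temperature; scikit-learn expects a 2-D feature table.
model = LinearRegression()
model.fit(df[["Temperature"]], df["Humidity"])

# Predict humidity for temperatures not seen during fitting.
new_temps = pd.DataFrame({"Temperature": [18.0, 24.5, 26.0]})
predicted = model.predict(new_temps)

print("slope:", model.coef_[0], "intercept:", model.intercept_)
print("predicted humidity:", predicted)
```

The first `window - 1` rolling values are NaN because a full window is not yet available. Predictions for temperatures outside the fitted range, such as 18.0 or 26.0 here, are extrapolations of the straight line and are less reliable than those inside it.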
To run this project locally:
# Clone the repository
git clone https://github.com/YOUR_GITHUB_USERNAME/temperature-humidity-analysis.git
cd temperature-humidity-analysis
# Install dependencies
pip install -r requirements.txt
# Run the Jupyter Notebook
jupyter notebook
- Dataset Source: UCI Machine Learning Repository
- Documentation on pandas: pandas.pydata.org
- Seaborn for Data Visualization: seaborn.pydata.org
- Scikit-learn for Machine Learning: scikit-learn.org
- Expanding ML models: Experimenting with advanced time-series forecasting techniques.
- Interactive Dashboards: Developing a Streamlit app for real-time data visualization.
- Sensor Integration: Testing the model with live IoT sensor data.
This project is licensed under the MIT License. The dataset is sourced from the UCI Machine Learning Repository and is available for research purposes.
This repository demonstrates strong analytical and technical skills in data science, visualization, and machine learning.



