GitHub - SatChittAnand/Twitter-Sentiment-Analysis: This project performs sentiment analysis on Twitter data to classify tweets as positive or negative.

Twitter Sentiment Analysis Project

A modular machine learning pipeline for classifying tweet sentiments as positive or negative, using robust text preprocessing, feature engineering, and model comparison. This project demonstrates practical NLP techniques and evaluates multiple classifiers to identify the most effective model for real-world sentiment prediction.

🧠 Objectives

Clean and preprocess raw Twitter data for sentiment classification.
Extract meaningful features using Bag-of-Words (BoW) and TF-IDF.
Train and evaluate multiple ML models to identify the best performer.
Visualize insights from the data (hashtags, word clouds).
Generate predictions on unseen test data using the top-performing model.
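
The cleaning objective can be illustrated with a small sketch. This is not the project's own code: `clean_tweet`, its regular expressions, and the `min_len` cutoff are hypothetical choices that show common tweet-cleaning steps (dropping URLs, user handles, and punctuation while keeping hashtags).

```python
import re

# Hypothetical patterns: typical tweet-cleaning steps, not necessarily
# the ones this project applies.
URL_RE = re.compile(r"https?://\S+")
HANDLE_RE = re.compile(r"@\w+")
NON_ALPHA_RE = re.compile(r"[^a-zA-Z#\s]")


def clean_tweet(text: str, min_len: int = 3) -> str:
    """Lowercase a tweet and strip URLs, handles, and non-letter characters."""
    text = URL_RE.sub(" ", text)
    text = HANDLE_RE.sub(" ", text)
    text = NON_ALPHA_RE.sub(" ", text)  # '#' is kept so hashtags survive
    # Drop very short tokens (assumed cutoff: fewer than min_len characters).
    tokens = [tok for tok in text.lower().split() if len(tok) >= min_len]
    return " ".join(tokens)


print(clean_tweet("@user Loving the new phone!!! https://t.co/abc #happy :)"))
# prints: loving the new phone #happy
```

Keeping the `#` character is a deliberate choice here so that hashtags remain available for the later hashtag analysis.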

⚙️ Workflow Overview

graph TD
A[Load Data] --> B[Preprocess Tweets]
B --> C[Feature Extraction: BoW & TF-IDF]
C --> D[Train Models: LogReg, XGBoost, Decision Tree]
D --> E[Evaluate via F1-Score]
E --> F[Visualize Results]
F --> G[Predict on Test Data]
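
To make the feature-extraction step concrete, here is a dependency-free sketch of what BoW and TF-IDF compute. It is an illustration only: the function names are hypothetical, and a real pipeline would more likely rely on a library vectorizer such as those in scikit-learn. The IDF formula used is the smoothed variant `ln((1 + N) / (1 + df)) + 1`, without the length normalization that library implementations often apply afterwards.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, count vectors) for whitespace-tokenized documents."""
    vocab = sorted({tok for doc in docs for tok in doc.split()})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for doc in docs:
        row = [0] * len(vocab)
        for tok, n in Counter(doc.split()).items():
            row[index[tok]] = n
        vectors.append(row)
    return vocab, vectors


def tf_idf(vectors):
    """Weight raw counts by a smoothed inverse document frequency."""
    n_docs = len(vectors)
    n_terms = len(vectors[0]) if vectors else 0
    # Document frequency: in how many documents each term appears.
    df = [sum(1 for row in vectors if row[j] > 0) for j in range(n_terms)]
    idf = [math.log((1 + n_docs) / (1 + d)) + 1 for d in df]
    return [[count * w for count, w in zip(row, idf)] for row in vectors]


vocab, counts = bag_of_words(["good day", "bad day"])  # toy documents
weighted = tf_idf(counts)
print(vocab)   # ['bad', 'day', 'good']
print(counts)  # [[0, 1, 1], [1, 1, 0]]
```

In the toy example, "day" occurs in every document and so receives the lowest weight, while "good" and "bad" are weighted higher because each appears in only one document.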

📦 Requirements

Install dependencies using:

pip install -r requirements.txt

Also download the NLTK tokenizer data:

import nltk
nltk.download('punkt')

`requirements.txt` includes:

pandas
numpy
nltk
scikit-learn
xgboost
matplotlib
seaborn
wordcloud

🚀 How to Run

Clone this repository:

git clone https://github.com/SatChittAnand/Twitter-Sentiment-Analysis.git
cd Twitter-Sentiment-Analysis

Place the datasets train_SentimentAnalysis.csv and test_SentimentAnalysis.csv in the root directory.
Run the script:
```
python sentimentanalysistwitter.py
```

The script will:

Preprocess and vectorize the data
Train and evaluate models
Generate visualizations
Save predictions to predictions.csv
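
The final step, writing predictions to `predictions.csv`, might look like the sketch below. The helper name `save_predictions` and the column names `id` and `label` are assumptions made for illustration; the actual output schema is not documented here.

```python
import csv
from pathlib import Path


def save_predictions(ids, labels, path="predictions.csv"):
    """Write one (id, label) row per tweet. Column names are assumed."""
    if len(ids) != len(labels):
        raise ValueError("ids and labels must have the same length")
    path = Path(path)
    with path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh)
        writer.writerow(["id", "label"])      # hypothetical header
        writer.writerows(zip(ids, labels))
    return path


if __name__ == "__main__":
    # Toy values only; real ids and labels would come from the test set.
    save_predictions([101, 102, 103], [0, 1, 0])
```

Checking the lengths up front avoids silently truncating the output, since `zip` stops at the shorter sequence.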

🤖 Models Implemented

| Model | Feature Technique | Evaluation Metric |
| --- | --- | --- |
| Logistic Regression | BoW, TF-IDF | F1-Score |
| XGBoost Classifier | BoW, TF-IDF | F1-Score |
| Decision Tree | BoW, TF-IDF | F1-Score |
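
Since every model is compared on F1-score, a short sketch of how that metric is computed for binary labels may help. The function name, the toy labels, and the model names `model_a` and `model_b` are made up for illustration and are not results from this project.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data, not project results.
y_true = [1, 0, 1, 1, 0]
toy_predictions = {
    "model_a": [1, 0, 0, 1, 1],
    "model_b": [1, 0, 1, 1, 1],
}
scores = {name: f1_score_binary(y_true, pred) for name, pred in toy_predictions.items()}
best = max(scores, key=scores.get)
print(best)  # model_b
```

F1 is a reasonable choice when the classes are imbalanced, because it ignores true negatives and penalizes a model that simply predicts the majority class.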

📊 Results

Logistic Regression with TF-IDF achieved the highest F1-score.
Visual comparisons via point plots highlight model-feature performance trade-offs.
Word clouds and hashtag frequency plots offer intuitive insights into tweet content.
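
The hashtag frequency plots start from a simple count of hashtags per tweet. The sketch below shows one way to produce those counts with the standard library; `hashtag_counts` and the sample tweets are hypothetical, and the resulting counts could then be passed to a plotting library such as matplotlib or seaborn.

```python
import re
from collections import Counter

HASHTAG_RE = re.compile(r"#(\w+)")


def hashtag_counts(tweets):
    """Count hashtags across tweets, ignoring case."""
    counts = Counter()
    for tweet in tweets:
        counts.update(tag.lower() for tag in HASHTAG_RE.findall(tweet))
    return counts


# Toy tweets for illustration only.
sample = ["#Love this #sunny day", "so much #love", "no tags here"]
print(hashtag_counts(sample).most_common(1))  # [('love', 2)]
```

Lowercasing merges variants such as `#Love` and `#love`, which keeps the frequency plot from splitting one hashtag across several bars.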

📁 Project Structure

twitter-sentiment-analysis/
│
├── sentimentanalysistwitter.py       # Main script
├── train_SentimentAnalysis.csv       # Training dataset
├── test_SentimentAnalysis.csv        # Test dataset
├── predictions.csv                   # Output predictions
├── requirements.txt                  # Dependencies
└── README.md                         # Project documentation

🌟 Highlights

✅ Modular design for easy extension
📈 Visual insights into tweet content and trends
🔁 Reproducible and scalable for larger datasets
🧩 Easy to integrate into real-time sentiment monitoring systems

🤝 Contributing

Pull requests are welcome! For major changes, please open an issue first to discuss what you’d like to modify.

📜 License

This project is licensed under the MIT License.
