SumitM0432/Quora-Insincere-Questions-Classification

Quora-Insincere-Questions-Classification

Insincere questions are questions that have an unethical or disparaging tone and are intended to make an inappropriate statement or comment rather than to seek a helpful answer. This project classifies such questions using machine learning and Transformers. The dataset comes from Quora, a question-and-answer forum, and contains questions asked by its users. Several models are trained: Naive Bayes and Logistic Regression represent traditional machine learning methods, while a Convolutional Neural Network and the BERT language model represent more advanced methods.
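To make the traditional baseline concrete, here is a minimal, dependency-free sketch of a multinomial Naive Bayes text classifier with add-one smoothing. It is not the code used in this repository (which presumably relies on a library implementation); the function names `train_nb` and `predict_nb` and the four toy questions are made up for illustration. Label 1 marks an insincere question and 0 a sincere one.

```python
import math
from collections import Counter, defaultdict


def train_nb(texts, labels):
    """Collect per-class document and word counts for multinomial Naive Bayes."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for text, label in zip(texts, labels):
        word_counts[label].update(text.lower().split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log-posterior under add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        log_prob = math.log(n_docs / total_docs)  # class prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                log_prob += math.log((word_counts[label][word] + 1) / denom)
        scores[label] = log_prob
    return max(scores, key=scores.get)


# Hypothetical toy data, not taken from the Quora dataset
texts = [
    "how do i learn python",
    "what is a good book on statistics",
    "why are those people so stupid",
    "why is everyone here so dumb",
]
labels = [0, 0, 1, 1]

model = train_nb(texts, labels)
prediction = predict_nb(model, "why are people so dumb")
```

On real data the same idea would be applied to the preprocessed question text, typically with a proper tokenizer and a held-out validation split.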

Exploratory Data Analysis is performed to gain insights that inform the methodology. The data is then preprocessed using various NLP techniques, and Stanford GloVe embeddings are used with increased vocabulary coverage to see the effect on model performance.
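As a rough illustration of what "vocabulary coverage" means here, the sketch below measures how much of a corpus is covered by an embedding vocabulary before and after cleaning. The helper names `build_vocab` and `check_coverage`, the toy embedding dictionary, and the cleaning steps (lowercasing, stripping question marks, expanding "don't") are assumptions for the example, not the repository's actual preprocessing; with real GloVe vectors, `embeddings` would be a dictionary mapping each word to its vector.

```python
from collections import Counter


def build_vocab(questions):
    """Count how often each whitespace-separated token appears."""
    vocab = Counter()
    for text in questions:
        vocab.update(text.split())
    return vocab


def check_coverage(vocab, embeddings):
    """Return (share of unique words covered, share of all tokens covered,
    out-of-vocabulary words sorted by frequency)."""
    known_words = 0
    known_tokens = 0
    oov = Counter()
    for word, count in vocab.items():
        if word in embeddings:
            known_words += 1
            known_tokens += count
        else:
            oov[word] = count
    total_tokens = sum(vocab.values())
    vocab_cov = known_words / len(vocab) if vocab else 0.0
    text_cov = known_tokens / total_tokens if total_tokens else 0.0
    return vocab_cov, text_cov, oov.most_common()


# Hypothetical stand-in for GloVe: only the keys matter for coverage
toy_embeddings = dict.fromkeys(["why", "do", "not", "people", "ask", "questions"])

raw = ["Why do people ask questions?", "Why don't people ask?"]
cleaned = [
    q.lower().replace("?", " ").replace("don't", "do not") for q in raw
]

before = check_coverage(build_vocab(raw), toy_embeddings)
after = check_coverage(build_vocab(cleaned), toy_embeddings)
```

Inspecting the out-of-vocabulary list returned for the raw text is usually what suggests the next cleaning step, since it shows which frequent tokens the embeddings are missing.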

The details are given below:

Dataset - Kaggle Quora Dataset

Published Research Paper - Insincere Questions Classification Using CNN with Increased Vocabulary Coverage of GloVe Embedding

Kaggle Notebook - Notebook

About

This project focuses on classifying insincere questions using machine learning and transformer models. Advanced NLP preprocessing techniques were applied to prepare the data, followed by the integration of Stanford GloVe embeddings to increase vocabulary coverage. The models were then trained and evaluated to ensure robust performance.
