Magenta91/Fraud_Classified_model
# Cyberfraud Classifier 🔒🤖

This project presents a Cyberfraud Classifier that detects fraudulent activity in online text using Natural Language Processing (NLP) and deep learning. It was developed as part of an ML project for SkillAI India to showcase the application of language models in cybersecurity.

## 🚀 Project Overview

Cyberfraud detection is a critical task in today's digital landscape, where scams, phishing, and other malicious activities are prevalent. Our goal was to build a model that analyzes text-based input, accurately determines whether it is fraudulent, and assigns it the correct label.

## 🧠 Model Used

We used BERT (Bidirectional Encoder Representations from Transformers), a state-of-the-art transformer-based model developed by Google. BERT understands the context of words in both directions, making it particularly effective at analyzing the complex and often deceptive patterns of fraudulent text.

### Why BERT?

- **Contextual understanding:** Unlike traditional models that read text in a single direction, BERT reads bidirectionally, allowing a deeper understanding of the subtle cues in fraud-related messages. This matters here because the model is designed to work on Hinglish text, which is prevalent in the subcontinent.
- **Pre-trained knowledge:** BERT is pre-trained on a massive corpus (Wikipedia + BooksCorpus), which helps it recognize general patterns and context even with a relatively small domain-specific dataset.
- **Transfer learning:** Fine-tuning BERT on our dataset gave better generalization and performance than traditional ML models and basic deep learning architectures such as RNNs or LSTMs.

## 📊 Performance

The model achieved 85% accuracy on the test set, demonstrating that it can effectively distinguish fraudulent text from legitimate communication.

## ⚠️ Limitations

Due to the large size of the fine-tuned BERT model, we were unable to upload the model files directly to the repository.
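To illustrate what fine-tuning adds on top of the encoder, the sketch below shows the classification head BERT-style models use: a single linear layer over the pooled `[CLS]` embedding followed by a softmax. This is a minimal illustration, not the notebook's code; the random vector stands in for a real BERT pooled output, and the two-label setup (fraud vs. legitimate) is an assumption about the label set.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_LABELS = 2     # e.g. fraud vs. legitimate (label set is an assumption)
HIDDEN_SIZE = 768  # pooled-output dimension of BERT-base

# Stand-in for the pooled [CLS] embedding the fine-tuned encoder would
# produce for one message; here it is just a random vector.
pooled = rng.standard_normal(HIDDEN_SIZE)

# Classification head: one linear layer, as in standard BERT fine-tuning.
W = rng.standard_normal((NUM_LABELS, HIDDEN_SIZE)) * 0.02
b = np.zeros(NUM_LABELS)

logits = W @ pooled + b

# Softmax over the logits gives a probability per label.
probs = np.exp(logits - logits.max())
probs /= probs.sum()

label = int(np.argmax(probs))
print(label, probs.round(3))
```

During fine-tuning, both this head and the encoder weights are updated by backpropagating a cross-entropy loss on the labeled messages.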
However, the code to load, train, and evaluate the model is fully included and documented for reproducibility.

## 📂 Contents

- `cyberfraud_classifier.ipynb`: Jupyter notebook containing the full data processing, model training, and evaluation pipeline.
- `requirements.txt`: All dependencies needed to run the notebook.
- `dataset/`: Reference to the data used (upload instructions or a sample, if applicable).

## 🛠️ How to Run

1. Clone the repo.
2. Install dependencies with `pip install -r requirements.txt`.
3. Run the notebook in a GPU-enabled environment for optimal performance.
4. Download a pre-trained BERT model via Hugging Face or TensorFlow Hub and fine-tune it using the notebook.
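The download-and-fine-tune step can be sketched with Hugging Face Transformers as below. The checkpoint name (`bert-base-multilingual-cased`, chosen because multilingual checkpoints tend to handle Hinglish better), the two-label setup, and the sample message are all assumptions for illustration; use the checkpoint and labels from the notebook.

```python
# Hedged sketch: loading a pre-trained BERT checkpoint for sequence
# classification. The checkpoint name and label count are assumptions.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=2)

# A hypothetical Hinglish message of the kind the classifier targets.
inputs = tokenizer("Aapka account block ho gaya hai, OTP bhejo",
                   return_tensors="pt", truncation=True, padding=True)
outputs = model(**inputs)
print(outputs.logits.shape)  # one row of logits, one column per label
```

The freshly added classification head is randomly initialized, so the logits are meaningless until the model is fine-tuned on the labeled dataset via the notebook's training loop.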