Skip to content

Prajwal4581/AskMyDocs

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 

Repository files navigation

🤖 AskMyDocs

An AI-powered chatbot that answers questions from your PDF documents using RAG (Retrieval Augmented Generation) architecture, powered by LangChain + Groq LLaMA3 + Streamlit.

Upload any PDF → Ask questions → Get answers strictly from your documents.


🏗️ Architecture (RAG Pipeline)

Upload PDF from UI
      ↓
PyPDFLoader (LangChain) extracts text
      ↓
PDF content → Context string
      ↓
PromptTemplate (context + chat history + question)
      ↓
Groq LLaMA3-70b (LLM)
      ↓
StrOutputParser → Answer displayed in Streamlit UI

✨ Features

  • Upload multiple PDFs directly from the UI sidebar
  • Context-bound answers — LLM only uses PDF content
  • Chat history maintained across conversation
  • Cached LLM instance — fast responses
  • Clean Streamlit chat UI
  • Clear chat history button
  • Graceful error handling

🛠️ Tech Stack

Layer Technology
LLM Groq API (LLaMA3-70b-versatile)
RAG Framework LangChain
Document Loading PyPDFLoader + tempfile
Prompt LangChain PromptTemplate
Chain LCEL (LangChain Expression Language)
Frontend Streamlit

🚀 Setup & Run

1. Clone the repo

git clone https://github.com/Prajwal4581/AskMyDocs.git
cd AskMyDocs

2. Create virtual environment

python -m venv venv
venv\Scripts\activate   # Windows
source venv/bin/activate  # Mac/Linux

3. Install dependencies

pip install -r requirements.txt

4. Set up API key

cp .env.example .env
# Add your Groq API key in .env
GROQ_API_KEY=gsk_xxxxxxxxxxxxxxx

Get free Groq key → https://console.groq.com

5. Run the app

streamlit run app.py

6. Open browser → http://localhost:8501

  • Upload PDFs from sidebar
  • Ask questions in chat
  • Get answers from your documents 🎯

📁 Project Structure

AskMyDocs/
├── app.py              # Main Streamlit app + RAG chain
├── requirements.txt
├── .env.example
├── .gitignore
└── README.md

📸 Sample Usage

Question Source
"What are the tables in Residentia?" Residentia PDF → lists all 12 tables
"What is Spring Boot?" Java notes PDF → explains concept
"Summarize this document" Any PDF → gives summary

🔮 Production Improvements

  • Replace PDF concatenation with FAISS/ChromaDB vector store
  • Add RecursiveCharacterTextSplitter for large documents
  • Add semantic embeddings for better retrieval accuracy
  • Support DOCX, TXT file formats
  • Deploy on Streamlit Cloud with shareable URL
  • Add source citation (which page answered the question)

About

PDF-based RAG chatbot using LangChain + Groq LLaMA3. Upload any PDF and ask questions — answers strictly from document context. Built with Python AI/ML stack.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages