law-library: Indian Law Q&A Chatbot

A question-answering chatbot specialized in Indian law, built on a Retrieval-Augmented Generation (RAG) architecture with LangChain and Streamlit.

🚀 Features

Comprehensive Legal Knowledge: Works with any Indian law documents you provide (PDFs)
RAG Architecture: Combines retrieval-based search with generative AI for accurate, context-aware responses
Interactive Web Interface: User-friendly Streamlit-based chat interface
Vector Database: Efficient document retrieval using FAISS vector store
Specialized Legal Prompting: Custom prompt engineering for the legal domain
Flexible Document Support: Add your own legal PDFs to customize the knowledge base

📚 Legal Document Coverage

The chatbot uses a Retrieval-Augmented Generation (RAG) approach to answer questions based on legal documents you provide.
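The retrieve-then-generate flow can be sketched in a few lines of plain Python. This is only an illustration, not the project's code: the word-overlap scoring stands in for the embedding search, the function names `tokenize`, `retrieve`, and `build_prompt` are hypothetical, and the chunks are made-up placeholders rather than statute text.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, chunks, k=2):
    """Return the k chunks sharing the most words with the question (toy scoring)."""
    q = tokenize(question)
    ranked = sorted(chunks, key=lambda c: len(q & tokenize(c)), reverse=True)
    return ranked[:k]

def build_prompt(question, context_chunks):
    """Place the retrieved text ahead of the question so the model answers from it."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Made-up placeholder chunks, not quotations from any statute
chunks = [
    "Placeholder text about cheating and dishonestly inducing delivery of property.",
    "Placeholder text about registration of companies.",
    "Placeholder text about punishment for cheating.",
]
question = "What is the punishment for cheating?"
top = retrieve(question, chunks, k=2)
prompt = build_prompt(question, top)
print(prompt)
```

The prompt produced here is what would be handed to the language model; the model call itself is left out of the sketch.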

How to Add Your Legal Documents

This project does not include pre-loaded legal documents due to copyright restrictions. You need to add your own legal PDFs to the law-library/dataset/ directory.

Recommended Document Types:

📜 Constitutional & Administrative Law

The Constitution of India
Judicial review and constitutional amendments
Administrative law and governance documents

⚖️ Criminal & Evidence Law

The Indian Penal Code (IPC)
Criminal Procedure Code (CrPC)
Indian Evidence Act, 1872

💼 Corporate & Business Law

Companies Act
Contract Act
Partnership and business regulations

👨‍💼 Labour & Employment Law

Industrial Disputes Act
Minimum Wages Act
Employee State Insurance Act
Shops and Establishments Act

🌐 Specialized Legal Areas

Information Technology Act (Cyber Laws)
Banking Regulation Act
Consumer Protection Act
Intellectual Property laws

Where to Find Legal Documents

You can obtain legal documents from these free and legal sources:

Government Websites:
- India Code - Official government portal for all Indian laws
- Ministry of Law and Justice - Official legal resources
- Law Commission of India - Reports and recommendations
Public Domain Sources:
- Bare Acts from India Code
- State government legal websites
- Public legal databases and repositories
Academic Institutions:
- Law school libraries (open access materials)
- University legal research repositories

⚖️ Copyright Notice

IMPORTANT:

Only use documents you have the legal right to use
Respect copyright and intellectual property laws
Use only public domain or openly licensed legal texts
For copyrighted materials, obtain proper permissions

💡 Note: Place all your legal PDF files in the law-library/dataset/ directory before running the ingestion script.

🛠️ Technology Stack

Language Model: Google FLAN-T5-Base (free, open-access model)
Framework: LangChain for RAG pipeline
Vector Store: FAISS for efficient similarity search
Embeddings: HuggingFace Sentence Transformers
Frontend: Streamlit for web interface
Document Processing: PyPDF for parsing legal PDF documents

📁 Project Structure

law-library/
├── README.md
├── requirements.txt
└── law-library/
    ├── app.py              # Streamlit web application
    ├── ingest.py           # Document processing and vectorization
    ├── utils.py            # Core RAG pipeline and utilities
    ├── dataset/            # Legal PDF documents (add your own)
    └── vectorstore/        # FAISS vector database
        ├── index.faiss
        └── index.pkl

🚀 Quick Start

Prerequisites

Before you begin, ensure you have the following:

✅ Python 3.8+ installed on your system
✅ pip (Python package manager)
✅ 8GB+ RAM available
✅ 10GB+ free disk space for models and documents
✅ (Optional) CUDA-compatible GPU for faster inference
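Two of these prerequisites can be checked from Python before installing anything. The following is a minimal sketch using only the standard library; the function name `check_environment` and the thresholds are taken from the list above as assumptions, and RAM is not checked because the standard library has no portable way to read it.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)   # from the prerequisites list
MIN_FREE_GB = 10      # from the prerequisites list

def check_environment(path="."):
    """Return a list of human-readable problems; an empty list means both checks passed."""
    problems = []
    if sys.version_info < MIN_PYTHON:
        found = sys.version.split()[0]
        problems.append(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, found {found}")
    free_gb = shutil.disk_usage(path).free / 1024**3
    if free_gb < MIN_FREE_GB:
        problems.append(f"Only {free_gb:.1f} GB free; {MIN_FREE_GB} GB recommended")
    return problems

if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```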

Step-by-Step Installation

Step 1: Clone the Repository

git clone https://github.com/bPavan16/law-Rag.git
cd law-Rag

Step 2: Create a Virtual Environment (Recommended)

# Create a virtual environment
python -m venv .venv

# Activate the virtual environment
# On Linux/macOS:
source .venv/bin/activate

# On Windows:
.venv\Scripts\activate

Step 3: Install Required Dependencies

# Install all required packages
pip install -r requirements.txt

# Install additional LangChain packages (if needed)
pip install -U langchain-huggingface langchain-text-splitters accelerate

Step 4: Add Your Legal Documents

IMPORTANT: This project does not include legal documents due to copyright restrictions.

Create the dataset directory (if it doesn't exist):
```
mkdir -p law-library/dataset
```
Download legal documents from authorized sources:
- Visit India Code for official Bare Acts
- Download PDFs of laws you want to query (e.g., IPC, Constitution, IT Act)
- Ensure documents are in PDF format

Place PDF files in the dataset directory:

# Copy your downloaded PDFs to the dataset folder
cp /path/to/your/legal-pdfs/*.pdf law-library/dataset/

Verify your documents:
```
cd law-library
ls -la dataset/  # On Linux/macOS
dir dataset\     # On Windows
```
You should see your PDF files listed. The system supports any number of PDF files.
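If you prefer a check that behaves the same on every platform, a small Python helper can list the PDFs instead. This is a sketch, not part of the project: `list_pdfs` is a hypothetical name, and it assumes you run it from the directory that contains dataset/.

```python
from pathlib import Path

def list_pdfs(dataset_dir="dataset"):
    """Return the PDF files in dataset_dir, sorted by name (extension match ignores case)."""
    folder = Path(dataset_dir)
    if not folder.is_dir():
        return []
    return sorted(p for p in folder.iterdir() if p.is_file() and p.suffix.lower() == ".pdf")

if __name__ == "__main__":
    pdfs = list_pdfs()
    print(f"Found {len(pdfs)} PDF file(s) in dataset/")
    for number, pdf in enumerate(pdfs, start=1):
        print(f"  {number}. {pdf.name}")
```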

Step 5: Verify Dataset

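Before building the vector database, confirm that every PDF you want to query is in the law-library/dataset/ directory:

cd law-library  # skip if you are already in this directory
ls -la dataset/  # On Linux/macOS
dir dataset\     # On Windows

You should see each of your PDF files listed (see the Legal Document Coverage section for recommended document types).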

Step 6: Build the Vector Database

Process all legal documents and create the FAISS vector store:

# Make sure you're in the law-library directory
cd law-library  # if not already there

# Run the ingestion script
python ingest.py

Expected output:

Starting document embedding process...
================================================================================
Found X PDF files in dataset/:
  1. your-legal-document-1.pdf
  2. your-legal-document-2.pdf
  ... (and more)
================================================================================

Loading PDFs from dataset/...
✓ Loaded XXXX document pages

Splitting documents into chunks...
✓ Created XXXX chunks

Loading embedding model (sentence-transformers/all-MiniLM-L6-v2)...
✓ Embedding model loaded

Creating FAISS vector store (this may take several minutes)...
✓ FAISS vector store created

Saving vector store to vectorstore/...
================================================================================
✅ Vector store created and saved successfully!
   - Total documents processed: X
   - Total pages: XXXX
   - Total chunks: XXXX
   - Saved to: vectorstore/
================================================================================

⏱️ Note: Processing time varies based on the number and size of documents (typically 5-15 minutes).
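For orientation, the stages in the output above (load, split, embed, index, save) could be written roughly as follows. This is a sketch under assumptions, not the contents of ingest.py: the function name `build_vectorstore` and its parameters are hypothetical, and the LangChain classes are assumed to come from the langchain-community, langchain-huggingface, and langchain-text-splitters packages.

```python
from pathlib import Path

def build_vectorstore(dataset_dir="dataset", out_dir="vectorstore",
                      chunk_size=800, chunk_overlap=200):
    """Load PDFs, split them into chunks, embed the chunks, and save a FAISS index."""
    pdf_paths = sorted(Path(dataset_dir).glob("*.pdf"))
    if not pdf_paths:
        raise FileNotFoundError(f"No PDF files found in {dataset_dir}/")

    # Heavy imports are deferred so the check above fails fast without loading them
    from langchain_community.document_loaders import PyPDFLoader
    from langchain_community.vectorstores import FAISS
    from langchain_huggingface import HuggingFaceEmbeddings
    from langchain_text_splitters import RecursiveCharacterTextSplitter

    pages = []
    for path in pdf_paths:
        pages.extend(PyPDFLoader(str(path)).load())  # one Document per page

    splitter = RecursiveCharacterTextSplitter(
        chunk_size=chunk_size, chunk_overlap=chunk_overlap
    )
    chunks = splitter.split_documents(pages)

    embeddings = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-MiniLM-L6-v2"
    )
    db = FAISS.from_documents(chunks, embeddings)
    db.save_local(out_dir)  # writes index.faiss and index.pkl
    return len(pdf_paths), len(pages), len(chunks)
```

Checking for PDFs before importing the heavy libraries means an empty dataset/ directory is reported immediately rather than after the embedding model has been loaded.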

Step 7: Launch the Application

# Start the Streamlit web interface
streamlit run app.py

Step 8: Access the Chatbot

Open your web browser and navigate to:

http://localhost:8501

🎉 Success! You can now start asking questions about Indian law.

Quick Commands Summary

# Complete setup in one go
git clone https://github.com/bPavan16/law-Rag.git
cd law-Rag
python -m venv .venv
source .venv/bin/activate  # On Linux/macOS
pip install -r requirements.txt
cd law-library
# Copy your legal PDFs into dataset/ before ingesting
python ingest.py
streamlit run app.py

💡 Usage Examples

Ask questions about Indian law such as:

"What are the fundamental rights guaranteed by the Indian Constitution?"
"Explain the provisions of Section 420 of the Indian Penal Code"
"What are the key features of the Companies Act?"
"What are the cyber crime laws in India?"
"Explain the concept of natural justice in Indian law"

⚙️ Configuration

Model Configuration

The default configuration uses Google FLAN-T5-Base, a free and open-access model. You can modify the model in utils.py:

repo_id = 'google/flan-t5-base'  # Change model here

Other recommended free models:

google/flan-t5-large - Larger version for better performance
google/flan-t5-small - Smaller, faster version
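mistralai/Mistral-7B-v0.1 - Open-source alternative (a much larger decoder-only model; expect memory use well above the minimum requirements, and the model-loading code may need to change)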

Retrieval Configuration

Adjust retrieval parameters in utils.py:

retriever=db.as_retriever(search_kwargs={'k': 2})  # Number of documents to retrieve
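To see what `k` controls, here is a toy top-k search in plain Python. It is an illustration only: the 3-dimensional vectors are hypothetical stand-ins for real sentence embeddings, and cosine similarity is used for clarity even though the actual FAISS index may rank by a different metric.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, chunk_vecs, k=2):
    """Return the ids of the k chunks whose vectors are most similar to the query."""
    ranked = sorted(
        chunk_vecs,
        key=lambda cid: cosine(query_vec, chunk_vecs[cid]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical 3-dimensional embeddings; real ones have hundreds of dimensions
chunk_vecs = {
    "chunk-a": [0.9, 0.1, 0.0],
    "chunk-b": [0.1, 0.9, 0.0],
    "chunk-c": [0.7, 0.3, 0.1],
    "chunk-d": [0.0, 0.2, 0.9],
}
query_vec = [1.0, 0.0, 0.0]

print(top_k(query_vec, chunk_vecs, k=2))  # ['chunk-a', 'chunk-c']
print(top_k(query_vec, chunk_vecs, k=3))  # ['chunk-a', 'chunk-c', 'chunk-b']
```

Raising `k` gives the model more context to draw on but also admits less relevant chunks, such as chunk-b here, and lengthens the prompt.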

Text Splitting

Modify chunk parameters in ingest.py:

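from langchain_text_splitters import RecursiveCharacterTextSplitter  # from the langchain-text-splitters package

splitter = RecursiveCharacterTextSplitter(
    chunk_size=800,    # Maximum chunk length, in characters
    chunk_overlap=200  # Characters shared between consecutive chunks
)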
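The interaction between the two parameters is easiest to see with a simplified splitter. The sketch below cuts fixed-size character windows; it is not how RecursiveCharacterTextSplitter works internally (that class prefers to break at separators such as paragraphs and sentences), and `split_with_overlap` is a hypothetical name.

```python
def split_with_overlap(text, chunk_size=800, chunk_overlap=200):
    """Cut text into character windows that each repeat the last chunk_overlap characters of the previous one."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks

# 2000 characters of repeating alphabet as stand-in text
text = "".join(chr(97 + i % 26) for i in range(2000))
chunks = split_with_overlap(text)
print(len(chunks), [len(c) for c in chunks])  # 3 [800, 800, 800]
```

With these defaults each window advances by 600 characters, so a larger overlap yields more chunks for the same text and a larger index, in exchange for less chance of a provision being cut in half at a chunk boundary.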

🔧 System Requirements

Minimum Requirements

RAM: 8GB
Storage: 10GB free space
GPU: 4GB VRAM (optional but recommended)

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

Contributing Guidelines

Fork the repository
Create a feature branch (git checkout -b feature/AmazingFeature)
Commit your changes (git commit -m 'Add some AmazingFeature')
Push to the branch (git push origin feature/AmazingFeature)
Open a Pull Request

📞 Support

For support, please open an issue on GitHub or contact the development team.

✨ Acknowledgments

Google for the FLAN-T5 model
LangChain team for the RAG framework
Indian legal system for comprehensive documentation
Open source community for various tools and libraries

Disclaimer: This chatbot is for educational and informational purposes only. It should not be considered legal advice. Always consult qualified legal professionals for legal matters.
