drod-96/rag_case_study

RAG Case Study: A Simple RAG Study over a Document Knowledge Base

This repository demonstrates a Retrieval-Augmented Generation (RAG) pipeline designed to extract and synthesize information from specific documents (e.g., information about cats).

It features both an interactive Jupyter Notebook for a step-by-step walkthrough of the RAG logic and a Streamlit app for interacting with the RAG system.


πŸ—οΈ Architecture

The system follows a standard RAG workflow as illustrated in images/rag_logic.png:

  1. Ingestion: Loading .docx files from the data/ directory
  2. Chunking: Splitting documents into smaller, semantically meaningful segments
  3. Embedding: Generating vector representations using Ollama embedding models
  4. Vector Store: Storing embeddings in a local vector database for similarity search
  5. Retrieval & Generation: Fetching the top 3 most relevant chunks and using them as context to augment the LLM's response
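
The chunking, embedding, and retrieval steps above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the notebook: `chunk_text`, `embed`, `retrieve`, and `build_prompt` are hypothetical names, the chunker is a plain overlapping word window rather than a semantic splitter, and `embed` is a toy bag-of-words stand-in for the place where the project would call an Ollama embedding model. An in-memory list plays the role of the vector store.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40, overlap=10):
    """Split text into overlapping word windows (a simple stand-in for semantic chunking)."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy bag-of-words vector. ASSUMPTION: the real pipeline would call an
    Ollama embedding model here and get back a dense vector instead."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, store, k=3):
    """Return the k chunks whose vectors are most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query, context_chunks):
    """Assemble the augmented prompt that would be sent to the LLM."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    # Hypothetical sample documents; the project loads .docx files from data/.
    documents = [
        "Cats sleep for many hours each day.",
        "Cats purr when they are content.",
        "Dogs enjoy long walks outside.",
        "Bananas are yellow when ripe.",
    ]
    # In-memory "vector store": a list of (chunk, vector) pairs.
    store = [(chunk, embed(chunk)) for doc in documents for chunk in chunk_text(doc)]
    question = "How long do cats sleep?"
    print(build_prompt(question, retrieve(question, store, k=3)))
```

Swapping `embed` for a real embedding call and the list for a vector database would leave the top-3 retrieval logic unchanged, which is why the sketch keeps those two pieces behind small functions.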

📂 Project Structure

  • rag_case_study.ipynb: The core notebook containing the experimental RAG pipeline.
  • src/app.py: A Streamlit-based user interface to interact with the RAG system.
  • data/: Contains source documents (PDFs and Word docs).
  • images/: Visual representation of the RAG logic.
  • requirements.txt: List of necessary Python libraries.

🚀 Next steps

This project is a test and is therefore very simplistic. Every part of the RAG system can be improved and extended to include more complex knowledge, such as energy system simulation results, to support informed decisions. This is the subject of future work, stay tuned :)
