This project demonstrates a Retrieval-Augmented Generation (RAG) system using Ollama for document-based question answering. The goal of the project is to ingest documents, retrieve relevant information, and generate concise answers using the Ollama Gemma 2B model.
- Loads and processes news articles from a specified directory (`news_articles/`).
- Splits documents into manageable chunks for efficient retrieval.
- Uses ChromaDB for vector storage and retrieval of document embeddings.
- Generates embeddings for documents using the Ollama Gemma 2B model.
- Queries the document database and generates answers using Ollama's conversational model.
- Clone the repository:

  ```bash
  git clone https://github.com/your-username/RAG-Based-Data-Retrieval.git
  cd RAG-Based-Data-Retrieval
  ```

- Create a virtual environment:

  ```bash
  python -m venv venv
  ```

- Activate the virtual environment:
  - On Windows:

    ```bash
    venv\Scripts\activate
    ```

  - On macOS/Linux:

    ```bash
    source venv/bin/activate
    ```

- Install the required packages:

  ```bash
  pip install -r requirements.txt
  ```

- Set up environment variables by creating a `.env` file in the root directory:

  ```
  CHROMA_PERSISTENT_STORAGE_PATH="chroma_persistent_storage"
  ```

- Make sure you have access to Ollama for the embeddings and model queries.
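Once the `.env` file is in place, the application can resolve the storage path from the environment. The helper below is a minimal sketch — the function name `get_chroma_path` and the fallback default are illustrative, not taken from the repository:

```python
import os

# Hypothetical helper: resolve the ChromaDB storage directory from the
# environment, falling back to the same default shown in the .env example.
def get_chroma_path() -> str:
    return os.environ.get("CHROMA_PERSISTENT_STORAGE_PATH", "chroma_persistent_storage")
```

In practice a loader such as python-dotenv would typically populate the environment from `.env` before this is called.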
- Prepare a folder `news_articles/` containing the `.txt` files of the articles you wish to process.
- Run the application:

  ```bash
  python app.py
  ```

- Query the system by providing a question. The system will fetch the relevant document chunks, generate embeddings, and produce a response based on the retrieved context.
- Loading Documents: The `load_documents_from_directory` function loads all `.txt` files from a specified directory and stores their content as documents.
- Text Chunking: The `split_text` function splits long documents into smaller, manageable chunks to allow for more accurate retrieval.
- Embedding Generation: The `get_ollama_embedding` function generates embeddings for document chunks using the Gemma 2B model from Ollama.
- Storage in ChromaDB: The embeddings are stored in ChromaDB for fast retrieval during queries.
- Query Processing: The `query_documents` function generates an embedding for the user's query and retrieves the most relevant document chunks. The system uses the retrieved chunks to generate an answer.
- Answer Generation: The `generate_response` function combines the relevant context and generates concise answers using Ollama's conversational model.
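The loading and chunking steps above can be sketched as follows. The actual implementations live in `app.py`; the exact signatures, chunk size, and overlap here are assumptions for illustration:

```python
import os

def load_documents_from_directory(directory: str) -> list[dict]:
    """Read every .txt file in `directory` into an id/text record."""
    documents = []
    for filename in sorted(os.listdir(directory)):
        if filename.endswith(".txt"):
            path = os.path.join(directory, filename)
            with open(path, encoding="utf-8") as f:
                documents.append({"id": filename, "text": f.read()})
    return documents

def split_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Split a long document into fixed-size chunks that overlap,
    so context is not lost at chunk boundaries."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - chunk_overlap  # step back to create the overlap
    return chunks
```

Each chunk would then be embedded with `get_ollama_embedding` and stored in ChromaDB alongside its document id.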
You can use this example to query the system:

```python
question = "What is human life expectancy in the US and Bangladesh?"
relevant_chunks = query_documents(question)
answer = generate_response(question, relevant_chunks)
print(answer)
```
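One plausible way `generate_response` combines the retrieved chunks with the question is to assemble a single grounded prompt before calling the model. The `build_prompt` helper and its wording below are illustrative assumptions, with the final call to Ollama's conversational model omitted:

```python
def build_prompt(question: str, relevant_chunks: list[str]) -> str:
    # Join the retrieved chunks into one context block, then instruct the
    # model to answer strictly from that context (wording is an assumption).
    context = "\n\n".join(relevant_chunks)
    return (
        "Use only the following context to answer the question concisely.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
```

`generate_response` would then send this prompt to the Gemma 2B model (e.g. via the Ollama Python client) and return the model's reply as the answer.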