A local RAG-based AI assistant that retrieves team workflow and process information from locally stored documents.
- This DOES NOT contain any actual data; it only has the code to use with your own data.
- This DOES NOT provide any trained or fine-tuned models; it uses out-of-the-box IBM Granite + Nomic Embed.
- This is generic software and can be used as a generic RAG-based LLM assistant. There is no company data in use.
- This is a proof-of-concept. None of the code is finalized, and it will probably change a lot.
- Don't expect any sort of magical working code, I'm not Linus Torvalds.
- Ollama: For the local LLM interface
- IBM Granite-3.3:2b: For the local LLM model (configurable)
- nomic-embed-text: For the local embedding model (configurable)
- ChromaDB: For the local vector database
- Docling: For the local document conversion and formatting
- Rich: For the CLI user experience
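
The pieces above compose as embed → store → retrieve → generate. A minimal sketch of that flow; the helper names, chunk sizes, and prompt wording here are my own assumptions, not the project's actual code:

```python
# Rough sketch of the RAG flow; chunk_text and answer are illustrative names.
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split a document into overlapping chunks for embedding."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def answer(question: str, documents: list[str]) -> str:
    # imported lazily so the sketch loads even without the packages installed
    import ollama, chromadb

    collection = chromadb.Client().get_or_create_collection("docs")
    for d, doc in enumerate(documents):
        for c, chunk in enumerate(chunk_text(doc)):
            emb = ollama.embeddings(model="nomic-embed-text", prompt=chunk)["embedding"]
            collection.add(ids=[f"{d}-{c}"], embeddings=[emb], documents=[chunk])

    # retrieve the chunks nearest to the question, then ground the LLM on them
    q = ollama.embeddings(model="nomic-embed-text", prompt=question)["embedding"]
    context = "\n".join(collection.query(query_embeddings=[q], n_results=3)["documents"][0])
    reply = ollama.chat(
        model="granite3.3:2b",
        messages=[
            {"role": "system", "content": f"Answer using only this context:\n{context}"},
            {"role": "user", "content": question},
        ],
    )
    return reply["message"]["content"]
```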
- Install required dependencies:

  ```shell
  pip install -r requirements.txt
  ```

- Create a docs folder to store your documents:

  ```shell
  cd <my_repo_clone>
  mkdir ./docs/
  ```

- Run ollama locally:

  ```shell
  ollama serve
  ```

- Pull the required models:

  ```shell
  ollama pull granite3.3:2b
  ollama pull nomic-embed-text:latest
  ```

- One-shot prompting, where the prompt is provided through command-line arguments:

  ```shell
  ./cta.py How do I format a PR commit message?
  ```

- Interactive prompting, where the AI can be continuously prompted:

  ```shell
  ./cta.py --interactive
  ```

- Prompting with streamed responses, where the AI response is generated as a stream rather than waiting for the entire response to generate:

  ```shell
  ./cta.py --stream How do I format a PR commit message?
  ```

- Set a custom directory to pull documentation from (uses `./docs/` by default):

  ```shell
  ./cta.py --docs my/path/to/docs/
  ```

Refer to the CLI help instructions with `./cta.py --help`:
```
A local RAG-based AI assistant to retrieve team workflows and processes information.

Positional Arguments:
  prompt             prompt for the AI assistant

Options:
  -h, --help         show this help message and exit
  -s, --stream       enable streaming generated responses
  -i, --interactive  enable interactive prompting
  -d, --docs DOCS    path to local directory of documents to use for rag prompting
```
- You can change the system prompt, the LLM/embedding models, and the default docs directory in the `config.py` file.
- Supported document types:
- DOCX
- XLSX
- HTML
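
A hypothetical shape for that `config.py`; the variable names here are guesses for illustration, so check the actual file:

```python
# Hypothetical config.py layout -- the real variable names may differ.
SYSTEM_PROMPT = (
    "You are a team assistant. Answer only from the provided documents; "
    "if the answer is not in them, say so."
)
LLM_MODEL = "granite3.3:2b"              # any model pulled into ollama works
EMBED_MODEL = "nomic-embed-text:latest"  # embedding model for chunks/queries
DOCS_DIR = "./docs/"                     # default; overridden by --docs
```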
- Code Cleanup: This code needs to be cleaned up... a lot.
- Integrations: A Slack bot integration would make this much easier to use.
- Prompt Improvements: This still hallucinates sometimes; improved prompt engineering may help, as the current prompt is rather simple.
- Model Improvements: There's probably a better model I could use that fits within the data privacy policy.
- Automated Document Upload: A way to automatically hook into new/updated documents, instead of manually putting documents into a folder, would be helpful.
- Stop Generating the Vector Embeddings Every Time: This could be sped up a lot by storing the ChromaDB vector database persistently instead of regenerating it on every run.
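
ChromaDB can already keep its index on disk between runs via `PersistentClient`. A sketch of how that might look here; the storage path, collection name, and id scheme are assumptions:

```python
def get_collection(path: str = "./chroma_store"):
    # chromadb imported lazily; PersistentClient writes the index to disk,
    # so embeddings survive between runs instead of being rebuilt each time
    import chromadb
    client = chromadb.PersistentClient(path=path)
    return client.get_or_create_collection("team_docs")

def add_if_missing(collection, doc_id: str, embedding: list[float], text: str) -> bool:
    """Embed-and-store a chunk only if its id is not already present."""
    if collection.get(ids=[doc_id])["ids"]:
        return False  # already stored; skip re-embedding
    collection.add(ids=[doc_id], embeddings=[embedding], documents=[text])
    return True
```

Keying chunks by a stable id (e.g. a hash of file path plus chunk index) would also make it easy to re-embed only documents that changed.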
