- Project: MindSpring
- Start Date: 2024-11-19
- Status: Active
- Lead: Kurt Overmier
- Repository: https://github.com/kovermier/MindSpring.git
- Docs: (Not yet available)
- Boards: (Not yet available)
- Chat: (Not yet available)
- 🟢 On Track
- 🟡 At Risk
- 🔴 Blocked
- ⭐ Milestone
- 📝 Needs Review
(See memlog/memlog.md for detailed daily logs)
| ID | Task | Status | Owner | Due |
|---|---|---|---|---|
| 001 | Implement advanced topic analysis | 🟢 | Kover | - |
| 002 | Add usage statistics dashboard | 🟢 | Kover | - |
| 003 | Enhance knowledge graph visualization | 🟢 | Kover | - |
(See memlog/memlog.md for completed tasks)
MindSpring is a tracking and analysis system for GPT and Claude conversation JSON exports. It provides tools for processing, visualizing, and analyzing large conversation datasets, with a focus on memory efficiency, data privacy, and interactive exploration.
- Semantic Search & Analysis: Vector-based semantic search, relevance-based filtering, similar conversation discovery, topic modeling and clustering.
- Analysis Tools: Conversation pattern analysis, topic distribution insights, interactive topic mindmap, knowledge graph visualization.
- Privacy & Security: Local vector storage with Qdrant, local embedding generation with Ollama, PII protection, configurable data exclusion.
- Visualization: Streamlit-based interactive UI, knowledge graph with physics-based interactions, clickable nodes, similar conversation recommendations.
- Python 3.8+
- Ollama installed and running locally
- Dependencies listed in requirements.txt
- Clone the repository:
git clone https://github.com/kovermier/MindSpring.git && cd MindSpring
- Create a virtual environment:
python -m venv venv && source venv/bin/activate (On Windows: venv\Scripts\activate)
- Install dependencies:
pip install -r requirements.txt
- Install and start Ollama: Download from ollama.ai, pull the mxbai-embed-large model, and ensure Ollama is running.
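Once Ollama is running, a quick way to confirm the embedding model responds is to request a single embedding. The sketch below is a minimal check, not the project's own code. It assumes Ollama's default local address (http://localhost:11434) and its /api/embeddings endpoint; the helper names build_embedding_request and embed are hypothetical.

```python
import json
import urllib.request
from typing import List

# Assumption: Ollama listens on its default local port.
OLLAMA_URL = "http://localhost:11434/api/embeddings"
MODEL = "mxbai-embed-large"


def build_embedding_request(text: str, model: str = MODEL) -> urllib.request.Request:
    """Build (but do not send) a POST request asking Ollama to embed `text`."""
    body = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(text: str, timeout: float = 30.0) -> List[float]:
    """Send the request and return the embedding vector.

    Raises urllib.error.URLError if Ollama is not reachable.
    """
    with urllib.request.urlopen(build_embedding_request(text), timeout=timeout) as resp:
        return json.loads(resp.read())["embedding"]
```

Calling embed("hello") should return a list of floats if the model has been pulled; a connection error usually means Ollama is not running.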
- GPT: conversations.json, conversations_export.json, conversations_YYYY-MM-DD.json
- Claude: claude_conversations.json, claude_conversations_export.json, claude_conversations_YYYY-MM-DD.json
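The naming patterns above can be checked mechanically. The following is an illustrative sketch only; detect_source is a hypothetical helper, and the way load_conversations.py actually recognizes files may differ.

```python
import re
from typing import Optional

# Optional suffix: "_export" or a "_YYYY-MM-DD" date, followed by ".json".
_SUFFIX = r"(?:_export|_\d{4}-\d{2}-\d{2})?\.json"
_CLAUDE = re.compile(r"claude_conversations" + _SUFFIX)
_GPT = re.compile(r"conversations" + _SUFFIX)


def detect_source(filename: str) -> Optional[str]:
    """Return 'claude', 'gpt', or None if the filename matches neither pattern."""
    if _CLAUDE.fullmatch(filename):
        return "claude"
    if _GPT.fullmatch(filename):
        return "gpt"
    return None
```

For example, detect_source("conversations_2024-11-19.json") returns "gpt", while a file such as notes.json is not recognized and yields None.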
- Place JSON files in the project root.
- Load the conversations:
python load_conversations.py
- Start the web interface (served at http://localhost:8501):
streamlit run Home.py
- Search Conversations: Semantic search bar, relevance threshold, conversation details, similar conversations.
- Topic Map: Visualize relationships, explore conversations, physics-based layout.
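To make the relevance threshold concrete, here is a minimal pure-Python sketch of threshold-filtered similarity search. It is a stand-in for illustration: in MindSpring the search itself is handled by Qdrant, and the function names, default threshold, and result limit below are assumptions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(
    query_vec: Sequence[float],
    vectors: Dict[str, Sequence[float]],
    threshold: float = 0.5,  # hypothetical default
    limit: int = 5,          # hypothetical default
) -> List[Tuple[str, float]]:
    """Return (conversation_id, score) pairs at or above `threshold`, best first."""
    scored = [(cid, cosine_similarity(query_vec, vec)) for cid, vec in vectors.items()]
    hits = [(cid, score) for cid, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)[:limit]
```

Raising the threshold returns fewer but closer matches; the same scoring applied to a stored conversation's own vector is one way to find similar conversations.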
(See memlog/memlog.md for project structure)
- conversation_vector_store.py: Manages vector embeddings and search.
- load_conversations.py: Processes and loads conversations.
- Home.py: Main Streamlit interface.
- 1_Topic_Map.py: Topic visualization.
- Batch processing, efficient embedding generation, memory-conscious chunk processing, Qdrant search, progress tracking, error handling.
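The batch-processing pattern can be sketched as follows. This is a generic illustration under assumed names (batched, process_in_batches) and an assumed batch size, not the project's implementation: items are consumed lazily so only one batch is held in memory, a failed batch is reported and skipped, and progress is printed after each batch.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Tuple, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield successive lists of up to `size` items without loading everything at once."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def process_in_batches(
    items: Iterable[T],
    handler: Callable[[List[T]], object],
    size: int = 32,  # hypothetical default
) -> Tuple[int, int]:
    """Run `handler` on each batch; return (items processed, items failed)."""
    done = failed = 0
    for batch in batched(items, size):
        try:
            handler(batch)
            done += len(batch)
        except Exception as exc:  # keep going so one bad batch does not stop the run
            failed += len(batch)
            print(f"batch failed ({exc}); continuing")
        print(f"processed {done}, failed {failed}")
    return done, failed
```

In practice the handler would embed a batch of texts and store the resulting vectors; any callable that accepts a list works here.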
(See memlog/memlog.md for recent changes)
- Data Ingestion: Raw JSON files are split into chunks.
- Vector Processing: Text extraction, embedding generation, vector storage.
- Search & Visualization: Semantic search, topic visualization, similar conversation discovery.
- Detailed diagram: memlog/data_flow.md
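The text-extraction and chunking stages of this flow might look roughly like the sketch below. Everything in it is an assumption made for illustration: the simplified conversation layout (a "messages" list of dicts with "role" and "text" keys) is hypothetical and does not reproduce the real GPT or Claude export schemas, and the chunk size and overlap are arbitrary. For simplicity it chunks the extracted text, which may not match the order of steps in the project.

```python
from typing import Dict, Iterator


def conversation_text(conversation: Dict) -> str:
    """Flatten a conversation (hypothetical schema) into 'role: text' lines."""
    lines = [
        f"{message['role']}: {message['text']}"
        for message in conversation.get("messages", [])
        if message.get("text")  # skip empty messages
    ]
    return "\n".join(lines)


def split_into_chunks(text: str, max_chars: int = 1000, overlap: int = 100) -> Iterator[str]:
    """Yield overlapping character windows so each chunk stays small enough to embed."""
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must be non-negative and smaller than max_chars")
    step = max_chars - overlap
    for start in range(0, len(text), step):
        yield text[start:start + max_chars]
        if start + max_chars >= len(text):
            break  # the last window already reached the end of the text
```

Each chunk would then be embedded and stored with a reference back to its conversation, so that a search hit can be traced to the original conversation.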
(See memlog/memlog.md)
- Fork the repository.
- Create a feature branch:
git checkout -b feature/AmazingFeature
- Commit changes:
git commit -m 'Add some AmazingFeature'
- Push to the branch:
git push origin feature/AmazingFeature
- Open a Pull Request.
MIT License
Kurt Overmier (kurt@kurtovermier.com, www.smartbrandstrategies.com)
- Ollama, Qdrant, Streamlit, Python Data Science Ecosystem
- Design decisions and rationale for key features.
- Architectural choices and their justifications.
- Key constraints and limitations.
- Links to relevant documentation, libraries, and tools used in the project.
See memlog/project_improvements.md for details.