An intelligent document processing platform powered by AI
Documentation • Quick Start • API Reference • Contributing
Dara is an AI-powered document processing platform developed as part of the ALX Software Engineering program. It uses LangChain and Cohere AI to provide document analysis, summarization, question generation, and conversational AI, all built on free services and open-source tooling.
- AI-Powered: Advanced language models using entirely free alternatives (Cohere AI)
- Multi-Format Support: Process PDF, DOCX, PPTX, and TXT files seamlessly
- RESTful API: Clean, well-documented API endpoints
- Enterprise-Ready: Built-in security, rate limiting, and error handling
- Scalable Architecture: Modular design for easy deployment and scaling
- Production-Ready: Comprehensive monitoring, logging, and deployment options
- Cost-Effective: Uses free AI services with no usage-based billing
- Multi-Format Parser: Native support for `.pdf`, `.docx`, `.pptx`, and `.txt` files
- Intelligent Text Extraction: Advanced parsing with context preservation
- Content Validation: Robust file type and size validation
- Smart Summarization: Generate concise, contextual summaries using Cohere AI
- Question Generation: Create relevant questions from document content
- Conversational AI: Interactive chat interface for document Q&A powered by Cohere
- Answer Generation: Provide AI-powered answers based on document context
- Confidence Scoring: Quality metrics for AI-generated content
- Memory Management: In-memory conversation history (no external database required)
- Rate Limiting: Intelligent request throttling (100 req/15min)
- Security Headers: Comprehensive protection with Helmet.js
- CORS Protection: Configurable cross-origin resource sharing
- Input Validation: Robust validation for all endpoints
- Error Handling: Graceful error responses and logging
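
To make the rate limit above concrete, here is a minimal sketch of a fixed-window limiter that allows 100 requests per 15 minutes per IP address. It is an illustration only: `createRateLimiter` and its parameters are hypothetical names, and Dara's actual middleware may use a different algorithm or library.

```javascript
// Hypothetical fixed-window limiter; not taken from Dara's source.
const WINDOW_MS = 15 * 60 * 1000; // 15-minute window
const MAX_REQUESTS = 100;         // allowed requests per window per IP

function createRateLimiter(maxRequests = MAX_REQUESTS, windowMs = WINDOW_MS) {
  const hits = new Map(); // ip -> { count, windowStart }

  // Returns true when the request is allowed, false when it should be rejected.
  return function allow(ip, now = Date.now()) {
    const entry = hits.get(ip);
    if (!entry || now - entry.windowStart >= windowMs) {
      // First request from this IP, or the previous window has expired.
      hits.set(ip, { count: 1, windowStart: now });
      return true;
    }
    entry.count += 1;
    return entry.count <= maxRequests;
  };
}

// Simulate 101 requests from one IP at the same instant.
const allow = createRateLimiter();
let allowed = 0;
for (let i = 0; i < 101; i++) {
  if (allow('203.0.113.7', 0)) allowed += 1;
}
console.log(allowed); // 100
```

A fixed window is the simplest scheme to reason about; a sliding window would smooth out bursts at window boundaries at the cost of more bookkeeping.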
- Node.js v20.0.0 or higher
- npm v10.0.0 or higher
- Cohere AI API key
- Clone the repository:
    git clone https://github.com/oovaa/dara.git
    cd dara
- Install dependencies:
    npm install
- Configure environment:
    # Create environment file
    cp .env.example .env
    # Edit with your configuration
    nano .env
  Required environment variables:
    # AI Service Configuration
    API_KEY=your_cohere_api_key_here
    # Server Configuration
    PORT=3000
    FRONT_DOMAIN=http://localhost:3000
    # File Upload Configuration
    MAX_FILE_SIZE=10485760 # 10MB
- Start the application:
    # Development mode (with hot reload)
    npm run dev
    # Production mode
    npm start
- Verify installation:
    curl http://localhost:3000/
  You should see the Dara welcome page!
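
As a rough illustration of how the environment variables from the setup steps might be read at startup, the sketch below validates them and applies defaults. The variable names come from the steps above; the `loadConfig` helper, its defaults, and its error messages are assumptions, not Dara's actual code.

```javascript
// Sketch only: loadConfig is a hypothetical helper, not part of Dara.
function loadConfig(env = process.env) {
  if (!env.API_KEY) {
    throw new Error('API_KEY is required (your Cohere API key)');
  }
  const port = Number.parseInt(env.PORT ?? '3000', 10);
  const maxFileSize = Number.parseInt(env.MAX_FILE_SIZE ?? '10485760', 10);
  if (Number.isNaN(port) || Number.isNaN(maxFileSize)) {
    throw new Error('PORT and MAX_FILE_SIZE must be integers');
  }
  return {
    apiKey: env.API_KEY,
    port,
    // Assumed default: fall back to the local server address.
    frontDomain: env.FRONT_DOMAIN ?? `http://localhost:${port}`,
    maxFileSize, // bytes; 10485760 = 10 * 1024 * 1024
  };
}

const config = loadConfig({ API_KEY: 'example-key', PORT: '4000' });
console.log(config.port, config.maxFileSize); // 4000 10485760
```

Failing fast on a missing `API_KEY` surfaces configuration mistakes at startup rather than on the first AI request.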
Navigate to http://localhost:3000 to access the intuitive web interface for document upload and processing.
Generate intelligent summaries from uploaded documents.
curl -X POST http://localhost:3000/api/sum \
  -F 'file=@document.pdf'

Response:

{
  "success": true,
  "data": {
    "summary": "This document discusses...",
    "filename": "document.pdf",
    "processingTime": 3.2
  }
}

Create relevant questions based on document content.
curl -X POST http://localhost:3000/api/qs \
  -F 'file=@presentation.pptx'

Response:

{
  "success": true,
  "data": {
    "questions": [
      {
        "id": 1,
        "question": "What are the main topics covered?",
        "type": "factual",
        "difficulty": "easy"
      }
    ],
    "questionCount": 5
  }
}

Engage with Dara's AI assistant for interactive conversations.
curl -X POST http://localhost:3000/api/chat \
  -H 'Content-Type: application/json' \
  -d '{
    "question": "What is artificial intelligence?",
    "session": "user-session-123"
  }'

Response:

{
  "answer": "Artificial intelligence (AI) is a field of computer science that aims to create systems capable of performing tasks that typically require human intelligence..."
}

| Format | Extension | Description |
|---|---|---|
| PDF | .pdf | Portable Document Format |
| Word | .docx | Microsoft Word documents |
| PowerPoint | .pptx | Microsoft PowerPoint presentations |
| Text | .txt | Plain text files |
- 100 requests per 15-minute window per IP address
- 10MB maximum file size
- 50,000 characters maximum document length
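
The supported formats and limits above can be expressed as a single validation step. The following is a minimal sketch under stated assumptions: `validateUpload` and its input shape are hypothetical, and the character limit is checked with `String.length`, which counts UTF-16 code units rather than user-visible characters.

```javascript
// Hypothetical validation helper; limits are taken from the lists above.
const ALLOWED_EXTENSIONS = ['.pdf', '.docx', '.pptx', '.txt'];
const MAX_FILE_SIZE = 10 * 1024 * 1024; // 10MB in bytes
const MAX_CHARACTERS = 50000;

function validateUpload({ filename, sizeBytes, text }) {
  const errors = [];
  const dot = filename.lastIndexOf('.');
  const ext = dot === -1 ? '' : filename.slice(dot).toLowerCase();

  if (!ALLOWED_EXTENSIONS.includes(ext)) {
    errors.push(`unsupported file type: ${ext || '(none)'}`);
  }
  if (sizeBytes > MAX_FILE_SIZE) {
    errors.push('file exceeds 10MB');
  }
  if (text.length > MAX_CHARACTERS) {
    errors.push('document exceeds 50,000 characters');
  }
  return { ok: errors.length === 0, errors };
}

console.log(validateUpload({ filename: 'report.PDF', sizeBytes: 2048, text: 'hello' }).ok); // true
console.log(validateUpload({ filename: 'notes.md', sizeBytes: 2048, text: 'hello' }).errors);
// [ 'unsupported file type: .md' ]
```

Collecting all errors instead of stopping at the first lets a client fix every problem in one round trip.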
Dara follows a modern, scalable architecture built on proven technologies:
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│    Frontend     │    │     Backend     │    │   AI Services   │
│  (Handlebars)   │◄──►│  (Express.js)   │◄──►│   (Cohere AI)   │
│                 │    │                 │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │
                                ▼
                       ┌─────────────────┐
                       │   File System   │
                       │  (Processing)   │
                       └─────────────────┘
| Layer | Technologies |
|---|---|
| Backend | Node.js, Express.js, ES6 Modules |
| AI/ML | LangChain, Cohere AI, Document Loaders |
| Security | Helmet.js, CORS, Rate Limiting, Input Validation |
| File Processing | Multer, PDF-Parse, Office Parser |
| Development | Nodemon, Prettier, PM2 |
| Deployment | Docker, PM2, Nginx |
dara/
├── server/               # Backend application
│   ├── controllers/      # Request handlers
│   ├── middlewares/      # Express middlewares
│   ├── routes/           # API route definitions
│   └── utils/            # Server utilities
├── utils/                # Shared utilities
│   ├── model.js          # AI model configuration
│   └── parser.js         # Document parsers
├── tools/                # Processing scripts
├── views/                # Handlebars templates
├── uploads/              # File upload directory
├── downloads/            # Processed files
└── docs/                 # Comprehensive documentation
- File Upload → User uploads document via API/Web
- Validation → File type, size, and content validation
- Parsing → Extract text using appropriate document loader
- AI Processing → Send to Cohere AI for analysis
- Response → Return structured results to client
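
The data flow above can be sketched as a small pipeline. Every function here is a hypothetical stand-in: `validate` only accepts `.txt`, `parse` decodes a buffer, and `analyze` fakes the Cohere AI call by returning the first sentence. The success shape mirrors the summarization response shown earlier; the error shape is an assumption.

```javascript
// Each stage is a stub standing in for Dara's real code (hypothetical names).
function validate(file) {
  if (!file.name.endsWith('.txt')) {
    throw new Error('only .txt is handled in this sketch');
  }
  return file;
}

function parse(file) {
  return file.buffer.toString('utf8');
}

async function analyze(text) {
  // Stand-in for the AI request; returns the first sentence as a "summary".
  return text.split('. ')[0];
}

async function processDocument(file) {
  try {
    const text = parse(validate(file));   // Validation, then Parsing
    const summary = await analyze(text);  // AI Processing
    return { success: true, data: { summary, filename: file.name } }; // Response
  } catch (err) {
    return { success: false, error: err.message };
  }
}

processDocument({
  name: 'notes.txt',
  buffer: Buffer.from('Dara parses files. Then it summarizes.'),
}).then((result) => console.log(result));
```

Keeping each stage a separate function means a failure in any step is caught in one place and turned into a structured error response.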
For detailed architecture information, see Architecture Documentation.
# Build and run with Docker
docker build -t dara .
docker run -p 3000:3000 -e API_KEY=your_key dara

- AWS: ECS, EC2, Lambda
- Google Cloud: Cloud Run, Compute Engine
- Azure: Container Instances, App Service
- Heroku: One-click deployment
# Production deployment with PM2
npm install -g pm2
pm2 start ecosystem.config.cjs
pm2 save && pm2 startup

For comprehensive deployment instructions, see Deployment Guide.
- API Reference - Complete REST API documentation
- Architecture Guide - System design and components
- Development Setup - Local development environment
- Deployment Guide - Production deployment
- Environment Variables - Configuration reference
- Troubleshooting - Common issues and solutions
- Node.js v20+
- npm v10+
- Cohere AI API key
# Start development server
npm run dev
# Format code
npm run lint
# Production mode
npm start

- Create route in `server/routes/`
- Add controller in `server/controllers/`
- Implement business logic
- Update documentation
- Test thoroughly
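
As an illustration of the steps above, here is a sketch of a controller written as a plain `(req, res)` function so it can be exercised without starting a server. The endpoint, the `wordCountController` name, and the file locations in the comments are hypothetical; only `res.status()` and `res.json()` are standard Express response methods.

```javascript
// Hypothetical controller, e.g. server/controllers/wordCount.js
function wordCountController(req, res) {
  const text = req.body?.text;
  if (typeof text !== 'string' || text.trim() === '') {
    return res.status(400).json({ success: false, error: 'text is required' });
  }
  const wordCount = text.trim().split(/\s+/).length;
  return res.status(200).json({ success: true, data: { wordCount } });
}

// Hypothetical route wiring in server/routes/, assuming an Express router:
//   router.post('/api/word-count', wordCountController);

// Minimal stand-in for Express's response object, for direct testing.
function mockResponse() {
  const res = { statusCode: null, body: null };
  res.status = (code) => { res.statusCode = code; return res; };
  res.json = (payload) => { res.body = payload; return res; };
  return res;
}

const res = wordCountController({ body: { text: 'hello from Dara' } }, mockResponse());
console.log(res.statusCode, res.body.data.wordCount); // 200 3
```

Keeping business logic inside a function that only touches `req` and `res` makes the "Test thoroughly" step straightforward, since no HTTP server is needed.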
See Development Guide for detailed instructions.
- Port in use: Change `PORT` in `.env`
- API key errors: Verify `API_KEY` in environment
- File upload issues: Check file format and size
- Memory errors: Reduce file size or increase the Node.js heap limit with `--max-old-space-size`
For comprehensive troubleshooting, see Troubleshooting Guide.
Dara implements multiple security layers:
- Rate Limiting: 100 requests per 15 minutes
- Input Validation: Comprehensive request validation
- Security Headers: Helmet.js protection
- CORS Protection: Configurable origin restrictions
- File Validation: Type and size checking
We welcome contributions from the community! Dara is an open-source project that benefits from diverse perspectives and expertise.
- Bug Reports: Report issues and bugs
- Feature Requests: Suggest new functionality
- Documentation: Improve or expand documentation
- Code Contributions: Submit pull requests
- Testing: Help test new features and fixes
- Design: Improve UI/UX and visual design
- Fork the repository
- Create a feature branch: `git checkout -b feature/amazing-feature`
- Commit your changes: `git commit -m 'Add amazing feature'`
- Push to the branch: `git push origin feature/amazing-feature`
- Open a Pull Request
- Follow existing code style and conventions
- Write clear, descriptive commit messages
- Include tests for new features
- Update documentation as needed
- Be respectful and constructive in discussions
For detailed guidelines, see CONTRIBUTING.md.
This project is licensed under the GNU General Public License v3.0.
- ✅ Commercial use - Use for commercial purposes
- ✅ Modification - Modify the software
- ✅ Distribution - Distribute the software
- ✅ Patent use - Grant of patent rights
- ✅ Private use - Use for private purposes
- ❌ Liability - No liability protection
- ❌ Warranty - No warranty provided
- License and copyright notice - Must include
- State changes - Must document changes
- Disclose source - Must provide source code
See the LICENSE file for full details.
If you find Dara useful, please consider giving it a star! ⭐
Made with ❤️ by the ALX Software Engineering Community
Homepage • Documentation • Issues • Discussions