Dara

An intelligent document processing platform powered by AI

📖 Documentation • 🚀 Quick Start • 🔧 API Reference • 🤝 Contributing


🎯 Overview

Dara is a modern, AI-powered document processing platform developed as part of the ALX Software Engineering program. It leverages cutting-edge technologies like LangChain and Cohere AI to provide intelligent document analysis, summarization, question generation, and conversational AI capabilities - all using free and open-source alternatives.

🌟 Key Highlights

  • 🤖 AI-Powered: Advanced language models using entirely free alternatives (Cohere AI)
  • 📄 Multi-Format Support: Process PDF, DOCX, PPTX, and TXT files seamlessly
  • 🔄 RESTful API: Clean, well-documented API endpoints
  • 🛡️ Enterprise-Ready: Built-in security, rate limiting, and error handling
  • 🚀 Scalable Architecture: Modular design for easy deployment and scaling
  • 📊 Production-Ready: Comprehensive monitoring, logging, and deployment options
  • 💰 Cost-Effective: Uses free AI services with no usage-based billing

✨ Features

📁 Document Processing

  • Multi-Format Parser: Native support for .pdf, .docx, .pptx, and .txt files
  • Intelligent Text Extraction: Advanced parsing with context preservation
  • Content Validation: Robust file type and size validation

🧠 AI-Powered Analysis

  • Smart Summarization: Generate concise, contextual summaries using Cohere AI
  • Question Generation: Create relevant questions from document content
  • Conversational AI: Interactive chat interface for document Q&A powered by Cohere
  • Answer Generation: Provide AI-powered answers based on document context
  • Confidence Scoring: Quality metrics for AI-generated content
  • Memory Management: In-memory conversation history (no external database required)

🔒 Security & Performance

  • Rate Limiting: Intelligent request throttling (100 req/15min)
  • Security Headers: Comprehensive protection with Helmet.js
  • CORS Protection: Configurable cross-origin resource sharing
  • Input Validation: Robust validation for all endpoints
  • Error Handling: Graceful error responses and logging

🚀 Quick Start

Prerequisites

  • Node.js v20.0.0 or higher
  • npm v10.0.0 or higher
  • Cohere AI API key

Installation

  1. Clone the repository:

    git clone https://github.com/oovaa/dara.git
    cd dara
  2. Install dependencies:

    npm install
  3. Configure environment:

    # Create environment file
    cp .env.example .env
    
    # Edit with your configuration
    nano .env

    Required environment variables:

    # AI Service Configuration
    API_KEY=your_cohere_api_key_here
    
    # Server Configuration
    PORT=3000
    FRONT_DOMAIN=http://localhost:3000
    
    # File Upload Configuration
    MAX_FILE_SIZE=10485760  # 10MB
  4. Start the application:

    # Development mode (with hot reload)
    npm run dev
    
    # Production mode
    npm start
  5. Verify installation:

    curl http://localhost:3000/

    You should see the Dara welcome page! 🎉
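The environment variables from step 3 could be read and validated at startup along these lines. This is a minimal sketch, not Dara's actual configuration code: `loadConfig` and the returned field names are hypothetical, and only the variable names and default values come from the example above.

```javascript
// Hypothetical config loader. The variable names (API_KEY, PORT, FRONT_DOMAIN,
// MAX_FILE_SIZE) match the .env example; the function itself is illustrative.
function loadConfig(env = process.env) {
  if (!env.API_KEY) {
    throw new Error('API_KEY is required (your Cohere API key)');
  }
  // Fall back to the values shown in the .env example when a variable is unset.
  const port = Number.parseInt(env.PORT ?? '3000', 10);
  const maxFileSize = Number.parseInt(env.MAX_FILE_SIZE ?? '10485760', 10);
  if (Number.isNaN(port) || Number.isNaN(maxFileSize)) {
    throw new Error('PORT and MAX_FILE_SIZE must be integers');
  }
  return {
    apiKey: env.API_KEY,
    port,
    frontDomain: env.FRONT_DOMAIN ?? `http://localhost:${port}`,
    maxFileSize,
  };
}
```

Failing fast on a missing API key surfaces configuration mistakes at startup rather than on the first AI request.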

🔧 Usage

Web Interface

Navigate to http://localhost:3000 to access the intuitive web interface for document upload and processing.

API Endpoints

📄 Document Summarization

Generate intelligent summaries from uploaded documents.

curl -X POST http://localhost:3000/api/sum \
  -F 'file=@document.pdf'

Response:

{
  "success": true,
  "data": {
    "summary": "This document discusses...",
    "filename": "document.pdf",
    "processingTime": 3.2
  }
}
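
The same request can be issued from Node.js. The sketch below assumes Node.js 18 or later (for the global `fetch`, `FormData`, and `Blob`) and the response shape shown above; `buildSummaryRequest` and `summarize` are hypothetical helper names, not part of Dara.

```javascript
// Build (but do not send) a multipart upload request for /api/sum.
function buildSummaryRequest(fileBytes, filename, baseUrl = 'http://localhost:3000') {
  const form = new FormData();
  // 'file' is the form field name used in the curl example.
  form.append('file', new Blob([fileBytes]), filename);
  return { url: `${baseUrl}/api/sum`, options: { method: 'POST', body: form } };
}

// Send the request and return the summary text from the JSON response.
async function summarize(fileBytes, filename) {
  const { url, options } = buildSummaryRequest(fileBytes, filename);
  const res = await fetch(url, options);
  if (!res.ok) {
    throw new Error(`Request failed with status ${res.status}`);
  }
  const body = await res.json();
  return body.data.summary;
}
```

A caller would typically read the file with `fs.promises.readFile` and pass the resulting buffer as `fileBytes`.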

❓ Question Generation

Create relevant questions based on document content.

curl -X POST http://localhost:3000/api/qs \
  -F 'file=@presentation.pptx'

Response:

{
  "success": true,
  "data": {
    "questions": [
      {
        "id": 1,
        "question": "What are the main topics covered?",
        "type": "factual",
        "difficulty": "easy"
      }
    ],
    "questionCount": 5
  }
}
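
A client might organize the returned questions before displaying them. The following sketch assumes the response shape shown above; `groupByDifficulty` is a hypothetical helper, and the set of possible `difficulty` values is not specified here.

```javascript
// Group question texts by their "difficulty" field.
function groupByDifficulty(response) {
  if (!response.success) {
    throw new Error('Question generation failed');
  }
  const groups = {};
  for (const q of response.data.questions) {
    // Create the bucket on first use, then append the question text.
    (groups[q.difficulty] ??= []).push(q.question);
  }
  return groups;
}
```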

💬 Conversational Chat

Engage with Dara's AI assistant for interactive conversations.

curl -X POST http://localhost:3000/api/chat \
  -H 'Content-Type: application/json' \
  -d '{
    "question": "What is artificial intelligence?",
    "session": "user-session-123"
  }'

Response:

{
  "answer": "Artificial intelligence (AI) is a field of computer science that aims to create systems capable of performing tasks that typically require human intelligence..."
}
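
The `session` field suggests that conversation history is kept per session, and the feature list mentions in-memory storage. One way such a store could work is sketched below. `SessionMemory`, its turn limit, and the `{ role, text }` turn shape are assumptions for illustration, not Dara's actual implementation.

```javascript
// Hypothetical in-memory store: session id -> array of { role, text } turns.
class SessionMemory {
  constructor(maxTurns = 20) {
    this.maxTurns = maxTurns;
    this.sessions = new Map();
  }

  // Record one turn and keep only the most recent maxTurns entries.
  append(session, role, text) {
    const history = this.sessions.get(session) ?? [];
    history.push({ role, text });
    this.sessions.set(session, history.slice(-this.maxTurns));
  }

  // Return the stored turns for a session (empty if the session is unknown).
  history(session) {
    return this.sessions.get(session) ?? [];
  }
}
```

Because the store lives in process memory, history is lost on restart and is not shared between multiple server instances.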

Supported File Formats

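| Format     | Extension | Description                        |
|------------|-----------|------------------------------------|
| PDF        | .pdf      | Portable Document Format           |
| Word       | .docx     | Microsoft Word documents           |
| PowerPoint | .pptx     | Microsoft PowerPoint presentations |
| Text       | .txt      | Plain text files                   |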
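
A file type and size check matching the supported formats and the 10MB limit might look like the sketch below. `validateUpload` and its return shape are hypothetical; a real check would likely also inspect the file content, since extensions can be spoofed.

```javascript
const ALLOWED_EXTENSIONS = new Set(['.pdf', '.docx', '.pptx', '.txt']);
const MAX_FILE_SIZE = 10 * 1024 * 1024; // 10MB, i.e. 10485760 bytes

// Check a file name and size against the supported formats and size limit.
function validateUpload(filename, sizeBytes) {
  const dot = filename.lastIndexOf('.');
  const ext = dot === -1 ? '' : filename.slice(dot).toLowerCase();
  if (!ALLOWED_EXTENSIONS.has(ext)) {
    return { ok: false, reason: `Unsupported file type: ${ext || '(none)'}` };
  }
  if (sizeBytes > MAX_FILE_SIZE) {
    return { ok: false, reason: 'File exceeds the 10MB limit' };
  }
  return { ok: true };
}
```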

Rate Limits

  • 100 requests per 15-minute window per IP address
  • 10MB maximum file size
  • 50,000 characters maximum document length
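
To illustrate what a limit of 100 requests per 15-minute window per IP means in practice, here is a dependency-free fixed-window counter. It is a sketch only: `createRateLimiter` is a hypothetical name, and Dara's server may well use an Express middleware library instead.

```javascript
const WINDOW_MS = 15 * 60 * 1000; // 15 minutes
const MAX_REQUESTS = 100;

// Returns allow(ip, now): true while the IP is under its limit for the current window.
function createRateLimiter(limit = MAX_REQUESTS, windowMs = WINDOW_MS) {
  const windows = new Map(); // ip -> { start, count }
  return function allow(ip, now = Date.now()) {
    const entry = windows.get(ip);
    // Start a fresh window on the first request or once the old window has expired.
    if (!entry || now - entry.start >= windowMs) {
      windows.set(ip, { start: now, count: 1 });
      return true;
    }
    entry.count += 1;
    return entry.count <= limit;
  };
}
```

A fixed window is the simplest scheme; it allows short bursts around window boundaries, which a sliding window would smooth out.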

🏗️ Architecture

Dara follows a modern, scalable architecture built on proven technologies:

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Frontend      │    │   Backend       │    │   AI Services   │
│   (Handlebars)  │◄──►│   (Express.js)  │◄──►│   (Cohere AI)   │
│                 │    │                 │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │
                                ▼
                       ┌─────────────────┐
                       │   File System   │
                       │   (Processing)  │
                       └─────────────────┘

🔧 Technology Stack

| Layer           | Technologies                                     |
|-----------------|--------------------------------------------------|
| Backend         | Node.js, Express.js, ES6 Modules                 |
| AI/ML           | LangChain, Cohere AI, Document Loaders           |
| Security        | Helmet.js, CORS, Rate Limiting, Input Validation |
| File Processing | Multer, PDF-Parse, Office Parser                 |
| Development     | Nodemon, Prettier, PM2                           |
| Deployment      | Docker, PM2, Nginx                               |

📁 Project Structure

dara/
├── 📂 server/              # Backend application
│   ├── 🎮 controllers/     # Request handlers
│   ├── 🔧 middlewares/     # Express middlewares
│   ├── 🛣️  routes/         # API route definitions
│   └── 🔨 utils/           # Server utilities
├── 🛠️  utils/              # Shared utilities
│   ├── 🤖 model.js         # AI model configuration
│   └── 📄 parser.js        # Document parsers
├── 🔧 tools/               # Processing scripts
├── 🖼️  views/              # Handlebars templates
├── 📤 uploads/             # File upload directory
├── 📥 downloads/           # Processed files
└── 📚 docs/                # Comprehensive documentation

🔄 Data Flow

  1. File Upload → User uploads document via API/Web
  2. Validation → File type, size, and content validation
  3. Parsing → Extract text using appropriate document loader
  4. AI Processing → Send to Cohere AI for analysis
  5. Response → Return structured results to client
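
The five steps above can be sketched as a single async function. Everything here is illustrative: `processDocument` is a hypothetical name, the `parse` and `analyze` callbacks stand in for the document loader and the Cohere call, and the error response shape is an assumption. The success shape follows the summarization example in the Usage section.

```javascript
// Steps 2-5 of the data flow; step 1 (upload) is assumed to have produced `file`.
async function processDocument(file, { parse, analyze, maxBytes = 10485760 }) {
  const started = Date.now();

  // 2. Validation: reject oversized files before doing any work.
  if (file.size > maxBytes) {
    return { success: false, error: 'File too large' }; // assumed error shape
  }

  // 3. Parsing: extract plain text (stand-in for the real document loader).
  const text = await parse(file);

  // 4. AI processing: stand-in for the request to Cohere AI.
  const summary = await analyze(text);

  // 5. Response: structured result for the client.
  return {
    success: true,
    data: {
      summary,
      filename: file.name,
      processingTime: (Date.now() - started) / 1000,
    },
  };
}
```

Passing the parser and the AI call in as parameters keeps the flow testable without network access or real files.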

For detailed architecture information, see Architecture Documentation.

🚀 Deployment

Quick Deployment Options

🐳 Docker (Recommended)

# Build and run with Docker
docker build -t dara .
docker run -p 3000:3000 -e API_KEY=your_key dara

☁️ Cloud Platforms

  • AWS: ECS, EC2, Lambda
  • Google Cloud: Cloud Run, Compute Engine
  • Azure: Container Instances, App Service
  • Heroku: One-click deployment

🖥️ Traditional Servers

# Production deployment with PM2
npm install -g pm2
pm2 start ecosystem.config.cjs
pm2 save && pm2 startup

For comprehensive deployment instructions, see Deployment Guide.

📖 Documentation

📚 Complete Documentation

🔗 Quick Links

🔧 Development

Prerequisites

  • Node.js v20+
  • npm v10+
  • Cohere AI API key

Development Commands

# Start development server
npm run dev

# Format code
npm run lint

# Production mode
npm start

Adding New Features

  1. Create route in server/routes/
  2. Add controller in server/controllers/
  3. Implement business logic
  4. Update documentation
  5. Test thoroughly
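
As an illustration of steps 1 to 3, a controller for a made-up word-count feature could look like the sketch below. The feature, the file name, the `/wordcount` path, and the response shape are all hypothetical; only the `(req, res)` handler signature and the `res.status().json()` calls are standard Express.

```javascript
// server/controllers/wordCount.js (hypothetical file and feature)
function wordCountController(req, res) {
  const text = req.body?.text;
  // Input validation: require a non-empty string.
  if (typeof text !== 'string' || text.trim() === '') {
    return res.status(400).json({ success: false, error: 'text is required' });
  }
  // Business logic: count whitespace-separated words.
  const wordCount = text.trim().split(/\s+/).length;
  return res.status(200).json({ success: true, data: { wordCount } });
}

// A route file in server/routes/ would then register the handler, for example:
//   router.post('/wordcount', wordCountController);
```

Keeping the handler free of routing details makes it easy to exercise with a stub `res` object in tests.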

See Development Guide for detailed instructions.

🛠️ Troubleshooting

Common Issues

  • Port in use: Change PORT in .env
  • API key errors: Verify API_KEY in environment
  • File upload issues: Check file format and size
  • Memory errors: Reduce file size or increase Node.js memory

For comprehensive troubleshooting, see Troubleshooting Guide.

🔐 Security

Dara implements multiple security layers:

  • Rate Limiting: 100 requests per 15 minutes
  • Input Validation: Comprehensive request validation
  • Security Headers: Helmet.js protection
  • CORS Protection: Configurable origin restrictions
  • File Validation: Type and size checking

🤝 Contributing

We welcome contributions from the community! Dara is an open-source project that benefits from diverse perspectives and expertise.

🚀 Ways to Contribute

  • 🐛 Bug Reports: Report issues and bugs
  • 💡 Feature Requests: Suggest new functionality
  • 📖 Documentation: Improve or expand documentation
  • 🔧 Code Contributions: Submit pull requests
  • 🧪 Testing: Help test new features and fixes
  • 🎨 Design: Improve UI/UX and visual design

📋 Contribution Process

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Commit your changes: git commit -m 'Add amazing feature'
  4. Push to the branch: git push origin feature/amazing-feature
  5. Open a Pull Request

📝 Guidelines

  • Follow existing code style and conventions
  • Write clear, descriptive commit messages
  • Include tests for new features
  • Update documentation as needed
  • Be respectful and constructive in discussions

For detailed guidelines, see CONTRIBUTING.md.

📄 License

This project is licensed under the GNU General Public License v3.0.

📜 License Summary

  • ✅ Commercial use - Use for commercial purposes
  • ✅ Modification - Modify the software
  • ✅ Distribution - Distribute the software
  • ✅ Patent use - Grant of patent rights
  • ✅ Private use - Use for private purposes
  • ❗ Liability - No liability protection
  • ❗ Warranty - No warranty provided
  • 📋 License and copyright notice - Must include
  • 📋 State changes - Must document changes
  • 📋 Disclose source - Must provide source code

See the LICENSE file for full details.


🌟 Star the Project

If you find Dara useful, please consider giving it a star! ⭐

Made with ❤️ by the ALX Software Engineering Community

🏠 Homepage • 📖 Documentation • 🐛 Issues • 💬 Discussions
