🎥 Multimodal Video-Analysis System

An AI-powered multimodal video-analysis system that allows users to chat with YouTube videos, navigate through timestamped sections, and run visual content searches. This system helps users quickly find and reference specific parts of long videos.

✨ Features

  • Video Upload & Chat: Paste a YouTube link and ask questions about the video in natural language.
  • Timestamped Section Breakdown: Automatically generates a structured video outline with hyperlinked timestamps.
  • Contextual Citations: Chat responses include timestamp hyperlinks, redirecting users to the exact referenced moment.
  • Visual Search: Accepts natural-language queries about what appears on screen and returns matching video clips.
  • Time-Saving Navigation: Helps users skip directly to relevant parts of long-form video content.
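
The timestamp hyperlinks described above can be sketched as follows. This is a minimal illustration, not the repository's implementation: the function names are hypothetical, and it assumes chat responses cite moments as `[mm:ss]` or `[hh:mm:ss]`. It relies on YouTube watch URLs accepting a `t` parameter to start playback at a given offset.

```typescript
// Hypothetical helpers; names and citation format are assumptions, not taken from the repository.

/** Convert "mm:ss" or "hh:mm:ss" into a number of seconds. */
function parseTimestamp(stamp: string): number {
  if (!/^\d{1,2}(:\d{2}){1,2}$/.test(stamp)) {
    throw new Error(`Invalid timestamp: ${stamp}`);
  }
  // Fold left to right: each new part shifts the running total by a factor of 60.
  return stamp
    .split(":")
    .map(Number)
    .reduce((total, part) => total * 60 + part, 0);
}

/** Build a YouTube watch URL that starts playback at the given offset. */
function timestampLink(videoId: string, seconds: number): string {
  const url = new URL("https://www.youtube.com/watch");
  url.searchParams.set("v", videoId);
  url.searchParams.set("t", `${Math.floor(seconds)}s`);
  return url.toString();
}

/** Turn bracketed citations such as "[1:30]" into markdown links. */
function linkifyCitations(text: string, videoId: string): string {
  return text.replace(
    /\[(\d{1,2}(?::\d{2}){1,2})\]/g,
    (_match, stamp: string) =>
      `[${stamp}](${timestampLink(videoId, parseTimestamp(stamp))})`,
  );
}
```

The same `timestampLink` helper could serve both the section outline and the chat citations, so every timestamp in the UI resolves to a link in one place.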

🛠 Tech Stack

  • Frontend: React, TypeScript
  • AI Backend: Gemini API (multimodal video + text analysis)
  • Data Source: YouTube video links
  • UI Enhancements: TailwindCSS, shadcn/ui components (optional)

🚀 Setup & Installation

1. Clone the repository

git clone https://github.com/YoshaM09/MultimodalVideoAnalysis.git
cd MultimodalVideoAnalysis

2. Install dependencies

npm install

3. Configure environment variables

  • Create a .env file in the project root with your API key:
GEMINI_API_KEY=your_gemini_api_key
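
A small fail-fast check can make a missing key obvious at startup rather than at the first API call. The sketch below is an assumption about how such a check might look; `requireEnv` is a hypothetical helper, and the README does not say how the application actually loads its `.env` file.

```typescript
// Hypothetical helper; the repository may load its configuration differently.
type Env = Record<string, string | undefined>;

/** Return the named variable, or throw a descriptive error if it is missing or blank. */
function requireEnv(env: Env, name: string): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing ${name}. Add it to the .env file in the project root.`);
  }
  return value;
}

// Example with an in-memory object standing in for the loaded environment.
const apiKey = requireEnv({ GEMINI_API_KEY: "your_gemini_api_key" }, "GEMINI_API_KEY");
```

In a Node process the first argument would typically be `process.env`; how the `.env` file gets loaded into it depends on the project's tooling.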

4. Run the application

npm run dev

🎬 Usage

  1. Open the web app in your browser.
  2. Enter a YouTube video link.
  3. Explore:
    • View the section breakdown with timestamp hyperlinks.
    • Chat with the video to ask questions about its content.
    • Get timestamped answers pointing to the exact moment.
    • Run a visual content search with natural language queries to retrieve relevant clips.
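
Step 2 accepts a YouTube link, which can arrive in several shapes (`youtube.com/watch?v=...`, `youtu.be/...`, and so on). The sketch below shows one way to normalize such links to a video ID before further processing. `extractVideoId` is a hypothetical name, and the 11-character ID pattern reflects common practice rather than a documented guarantee.

```typescript
// Hypothetical helper; not taken from the repository.

/** Extract the video ID from common YouTube URL shapes, or return null if none is found. */
function extractVideoId(link: string): string | null {
  let url: URL;
  try {
    url = new URL(link.trim());
  } catch {
    return null; // Not a parseable absolute URL.
  }

  const host = url.hostname.replace(/^(www\.|m\.)/, "");
  let candidate: string | null = null;

  if (host === "youtu.be") {
    // Short links carry the ID as the first path segment.
    candidate = url.pathname.slice(1).split("/")[0] ?? null;
  } else if (host === "youtube.com") {
    if (url.pathname === "/watch") {
      candidate = url.searchParams.get("v");
    } else {
      const match = url.pathname.match(/^\/(embed|shorts|live)\/([^/]+)/);
      candidate = match?.[2] ?? null;
    }
  }

  // Assumption: IDs are 11 characters drawn from letters, digits, "-" and "_".
  return candidate !== null && /^[A-Za-z0-9_-]{11}$/.test(candidate) ? candidate : null;
}
```

Returning `null` instead of throwing lets the input form show a validation message without a try/catch at every call site.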

🤝 Contributing

  • Contributions are welcome! Please submit a pull request or open an issue for suggestions.

📄 License

  • This project is licensed under the MIT License.
