Fission AI is a multimodal AI platform that combines several models and tools to turn text into images, videos, research reports, and more. It is designed for creators, educators, developers, and researchers who want AI-powered content generation from a single interface.
- 🖼️ Text to Image
  Generate up to 5 high-quality images using Stable Diffusion.
  Captions generated with BLIP.
- 🎬 Text to Video
  Automatically stitches images into a video with generated narration using gTTS.
- 💬 Text to Text
  Uses Gemma 3B (via Ollama) to generate or rewrite text with LLM capabilities.
- 📚 Text to Deep Research
  - Extracts keywords
  - Searches the web using DuckDuckGo API
  - Summarizes findings using Gemma 3B
  - Generates a downloadable PDF report
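
The Deep Research steps listed above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the function names `extract_keywords` and `deep_research`, the tiny stopword list, and the frequency-based keyword heuristic are hypothetical and not taken from the repository. The search and summarization steps are passed in as plain callables so that no particular DuckDuckGo or Ollama client API is assumed, and PDF generation is left out.

```python
import re
from collections import Counter
from typing import Callable, List

# Hypothetical, deliberately tiny stopword list (illustration only).
STOPWORDS = {
    "the", "a", "an", "of", "and", "or", "in", "on", "for", "to",
    "is", "are", "with", "how", "what", "does",
}


def extract_keywords(query: str, top_k: int = 5) -> List[str]:
    """Return up to top_k of the most frequent non-stopword tokens."""
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return [word for word, _ in counts.most_common(top_k)]


def deep_research(
    query: str,
    search: Callable[[str], List[str]],
    summarize: Callable[[str, List[str]], str],
    top_k: int = 5,
) -> str:
    """Chain the steps: keywords -> web search -> summary text."""
    keywords = extract_keywords(query, top_k)
    # `search` stands in for the DuckDuckGo lookup (assumption).
    snippets = search(" ".join(keywords))
    # `summarize` stands in for the Gemma call made through Ollama (assumption).
    return summarize(query, snippets)
```

In this sketch the returned summary string is what a later step would write into the PDF report.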
| Category | Tools / Frameworks |
|---|---|
| Backend | Flask, Python, ngrok, Google Colab |
| Frontend | Streamlit |
| Machine Learning | PyTorch, BLIP, Stable Diffusion, Gemma 3B |
| Text-to-Speech | gTTS |
| LLM Integration | Ollama (Gemma 3B) |
| Web Search | DuckDuckGo API |
| Video & Audio Tools | FFmpeg, MoviePy |
| IDE / Development | VS Code |
| Hosting / Tunnel | ngrok |
                    User (Streamlit App)
                              │
                              ▼
                    [Frontend: Streamlit]
                              │
                              ▼
           [Backend: Flask server (ngrok tunnel)]
                              │
        ┌─────────────┬───────┼─────────┬──────────────┐
        ▼             ▼       ▼         ▼              ▼
[Stable Diffusion]  [BLIP]  [gTTS]  [Gemma 3B]  [DuckDuckGo API]
                              │
                    [Google Colab Runtime]
                              │
                  [Returns ZIP / PDF / MP4]
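
To make the frontend-to-backend hop in the diagram above concrete, here is a minimal sketch of how a client could prepare a request for the Flask server behind the ngrok tunnel. It is an assumption-heavy illustration, not the project's actual code: the `/generate` route, the JSON field names `prompt` and `num_images`, and the helper `build_image_request` are all hypothetical. Only the limit of 5 images comes from the feature list. The request is built with the standard library and not sent.

```python
import json
from urllib.request import Request

MAX_IMAGES = 5  # the feature list states "up to 5" images per prompt


def build_image_request(ngrok_url: str, prompt: str, num_images: int = 1) -> Request:
    """Build (but do not send) a JSON POST request for the backend."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    # Clamp the requested count into the documented 1..5 range.
    num_images = max(1, min(num_images, MAX_IMAGES))
    payload = json.dumps({"prompt": prompt, "num_images": num_images}).encode("utf-8")
    # "/generate" is a hypothetical route name used only for this sketch.
    url = ngrok_url.rstrip("/") + "/generate"
    return Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the resulting object to `urllib.request.urlopen` would send it; in this architecture the response body would be whatever file the backend returns (ZIP, PDF, or MP4).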
⚠️ Note: This project uses Google Colab as the backend runtime, tunneled using ngrok.
git clone https://github.com/avarshvir/fission_ai.git
cd fission_ai
pip install -r requirements.txt
- Create an ngrok account and copy your auth token.
- Add the token in main_model.ipynb (Google Colab):
  - NGROK_AUTH_TOKEN = "your ngrok token"
- Run all cells of main_model.ipynb.
- Copy the ngrok URL from the Colab output and set it in main.py as the backend address:
  - NGROK_URL = "ngrok_url"
- Create a folder named "GeneratedImages" inside My Drive in Google Drive.
- Make the "GeneratedImages" folder public and copy its link.
- Paste the public link into main.py as gdrive_link = "public drive link". This assignment appears twice in the file, so update both.
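
Because the setup steps above rely on pasting two values by hand, a quick sanity check before launching the app can catch a forgotten placeholder. The helper below is a hypothetical sketch and not part of the repository: `check_settings` is an invented name, and the check that the Drive link points at `drive.google.com` is an assumption about how the shared folder link looks.

```python
from typing import List
from urllib.parse import urlparse


def check_settings(ngrok_url: str, gdrive_link: str) -> List[str]:
    """Return a list of problems; an empty list means both values look usable."""
    problems = []

    tunnel = urlparse(ngrok_url)
    if tunnel.scheme not in ("http", "https") or not tunnel.netloc:
        problems.append("NGROK_URL is not a full http(s) URL")

    drive = urlparse(gdrive_link)
    # Assumption: public folder links are served from drive.google.com.
    if drive.scheme != "https" or drive.netloc != "drive.google.com":
        problems.append("gdrive_link is not a https://drive.google.com link")

    return problems
```

Calling it with the untouched placeholders ("ngrok_url" and "public drive link") reports both problems, while real values yield an empty list.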
Then, from inside the fission_ai directory, start the frontend:
cd frontend
streamlit run main.py
Text-to-Image
Text-to-Video
Text-to-Text
Text-to-Deep-Research
Make sure you have installed the dependencies, set up your ngrok account,
made your "GeneratedImages" Google Drive folder public,
and selected the T4 GPU runtime in Google Colab.
fission_ai/
│
├── assets
│   ├── output
│   ├── final_narrated_video   # Stitch images/audio to final video
│   └── output.zip             # Stitch images/audio to final video
│
├── backend/
│   └── backend.txt
│
├── data/
│   └── dataset_links.txt
│
├── frontend/
│   ├── images
│   ├── resized_images
│   ├── videos
│   ├── keyword_search.py
│   ├── main.py                # Actual Main file
│   ├── summarizer.py
│   └── text_query.py
│
├── model/
│   ├── main_model.ipynb       # main file for google colab
│   ├── model.txt
│   └── ok8.ipynb
│
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE
├── NOTICE
├── README.md
└── requirements.txt
- Add user authentication & plans (Free, Pro, AI+, Enterprise)
- Provide cloud-based persistent storage
- Rate-limited access to premium APIs
- Launch SaaS version under Jaiho Labs (Subsidiary of Jaiho Digital)
Contributions are welcome! Please follow these steps:
- Fork the repo
- Create your feature branch (git checkout -b feature-name)
- Commit your changes (git commit -am 'Add new feature')
- Push to the branch (git push origin feature-name)
- Open a pull request
- Author: Arshvir