Phoenix Agent

English | 中文 (Chinese)

Phoenix Agent is a general-purpose AI Agent system that supports safe execution of various tools and operations in a sandboxed environment.

Overview

Phoenix Agent is a Hierarchical Multi-Flow Agent System that moves beyond generic single-flow architectures by implementing task-specific workflows. Our system achieves 73.6% accuracy on GAIA Level 1 tasks through specialized flows optimized for distinct task domains.

Core Architecture

Phoenix Agent implements a hierarchical architecture where a global Orchestrator coordinates specialized sub-flows:

Search Flow: Recursive "Knowledge Gap" decomposition for application-oriented search
Code Flow: Dual-layer memory system with remote sandbox for isolated execution
Analysis Flow: Specialized reasoning workflows for complex problem-solving
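
As a rough illustration of how a global Orchestrator might hand subtasks to specialized sub-flows, here is a minimal Python sketch. The names `Subtask`, `Orchestrator`, `register`, `dispatch`, and the three stand-in flow functions are hypothetical and are not taken from the Phoenix Agent codebase; the real flows would call models, tools, and the sandbox rather than return tagged strings.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Subtask:
    kind: str    # hypothetical label: "search", "code", or "analysis"
    prompt: str  # natural-language description of the subtask


class Orchestrator:
    """Routes each subtask to the flow registered for its kind (illustrative only)."""

    def __init__(self) -> None:
        self._flows: Dict[str, Callable[[str], str]] = {}

    def register(self, kind: str, flow: Callable[[str], str]) -> None:
        self._flows[kind] = flow

    def dispatch(self, task: Subtask) -> str:
        flow = self._flows.get(task.kind)
        if flow is None:
            raise ValueError(f"no flow registered for kind {task.kind!r}")
        return flow(task.prompt)


# Stand-in flows that only tag the prompt they receive.
def search_flow(prompt: str) -> str:
    return f"[search] {prompt}"


def code_flow(prompt: str) -> str:
    return f"[code] {prompt}"


def analysis_flow(prompt: str) -> str:
    return f"[analysis] {prompt}"


def build_orchestrator() -> Orchestrator:
    orch = Orchestrator()
    orch.register("search", search_flow)
    orch.register("code", code_flow)
    orch.register("analysis", analysis_flow)
    return orch
```

A caller would build the orchestrator once and call `dispatch` per subtask; an unknown `kind` raises `ValueError` instead of silently falling back to a generic pipeline.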

Key Features

Hierarchical Multi-Flow Design: Task-specific workflows instead of generic pipelines
Intelligent Orchestration: Central coordinator routes subtasks to specialized flows
Dual-Layer Memory System: "Cold and Hot" memory for efficient context management
Secure Sandbox Environment: Isolated execution environment based on Docker/Kubernetes
Rich Tool Integration: Browser automation, Shell execution, file operations, search engines, etc.
Real-time Communication: SSE-based streaming event transmission
Visualization: VNC remote desktop support
Flexible Deployment: Supports Docker Compose and Kubernetes/Helm deployment
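
The "Cold and Hot" memory feature is not specified in detail here, so the following is only one plausible reading, assuming that a small hot layer holds the most recent entries for the model's context and that evicted entries move to a searchable cold layer. The class `DualLayerMemory` and its methods are hypothetical names for illustration.

```python
from collections import deque
from typing import Deque, List


class DualLayerMemory:
    """Bounded hot layer plus an append-only cold layer (assumed design)."""

    def __init__(self, hot_capacity: int = 3) -> None:
        if hot_capacity < 1:
            raise ValueError("hot_capacity must be at least 1")
        self.hot_capacity = hot_capacity
        self.hot: Deque[str] = deque()
        self.cold: List[str] = []

    def add(self, entry: str) -> None:
        # New entries always enter the hot layer; the oldest spill into cold.
        self.hot.append(entry)
        while len(self.hot) > self.hot_capacity:
            self.cold.append(self.hot.popleft())

    def context(self) -> List[str]:
        # Entries that would be placed directly in the prompt.
        return list(self.hot)

    def recall(self, keyword: str) -> List[str]:
        # Case-insensitive substring lookup over the cold layer.
        needle = keyword.lower()
        return [entry for entry in self.cold if needle in entry.lower()]
```

Keeping the hot layer small bounds prompt size, while `recall` lets older entries be pulled back in on demand; a real system would likely use summarization or embedding search instead of substring matching.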

📄 Technical Report: For detailed architecture, methodology, and experimental results, see our technical report:
Beyond Generic Agents: A Hierarchical Multi-Flow Agent System with Task-Specific Workflows

Overall Design

Architecture Overview

Figure 1: Overview of the Hierarchical Multi-Flow Architecture with Task-Specific Workflows

System Workflow

When a user initiates a conversation:

1. The Web client asks the Server to create an Agent. The Server creates a Sandbox through /var/run/docker.sock and returns a session ID.
2. The Sandbox is an Ubuntu Docker container that starts the Chrome browser and the API services for tools such as File and Shell.
3. The Web client sends user messages to the session ID, and the Server forwards each message to the PlanAct Agent for processing.
4. During processing, the PlanAct Agent calls the relevant tools to complete the task.
5. All events generated during Agent processing are sent back to the Web client via SSE.
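
The last step above relies on Server-Sent Events. As a minimal sketch of the wire format only, the helper below serializes one agent event into an SSE frame. The function names `format_sse` and `stream_events` and the example event names are assumptions for illustration, not the project's actual API.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Tuple


def format_sse(event: str, data: Dict[str, Any]) -> str:
    """Serialize one event as an SSE frame: an event line, a data line, a blank line."""
    # json.dumps without indent produces a single line, so one data field suffices.
    payload = json.dumps(data, ensure_ascii=False)
    return f"event: {event}\ndata: {payload}\n\n"


def stream_events(events: Iterable[Tuple[str, Dict[str, Any]]]) -> Iterator[str]:
    """Yield SSE frames lazily so a web framework can stream them as they occur."""
    for name, data in events:
        yield format_sse(name, data)


if __name__ == "__main__":
    # Hypothetical event names; the real event schema is defined by the Server.
    demo = [("tool", {"name": "shell"}), ("done", {})]
    for frame in stream_events(demo):
        print(frame, end="")
```

On the browser side, an `EventSource` (or an equivalent fetch-based reader) would receive each frame and dispatch on the event name.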

When users browse tools:

Browser:
1. The Sandbox runs the browser on an xvfb virtual display, exposes that display as a VNC service through x11vnc, and converts VNC to WebSocket through websockify.
2. Web's NoVNC component connects to the Sandbox through the Server's WebSocket Forward, enabling browser viewing.
Other tools: These work on similar principles.
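
The browser-viewing chain above (xvfb, x11vnc, websockify) can be sketched as three commands started in order. The display number, ports, and screen geometry below are assumptions chosen for illustration, not values taken from the project's Sandbox image, and the helper `vnc_pipeline` is hypothetical.

```python
from typing import List


def vnc_pipeline(
    display: str = ":1",              # assumed X display number
    vnc_port: int = 5900,             # assumed VNC (RFB) port
    ws_port: int = 6080,              # assumed WebSocket port for noVNC
    geometry: str = "1280x1024x24",   # assumed width x height x depth
) -> List[List[str]]:
    """Return argv lists for the three processes, in start order."""
    return [
        # 1. Virtual framebuffer X server that the browser renders into.
        ["Xvfb", display, "-screen", "0", geometry],
        # 2. VNC server that exposes the X display over the RFB protocol.
        ["x11vnc", "-display", display, "-rfbport", str(vnc_port),
         "-forever", "-shared", "-nopw"],
        # 3. WebSocket-to-TCP bridge so a browser client can reach the VNC port.
        ["websockify", str(ws_port), f"localhost:{vnc_port}"],
    ]


if __name__ == "__main__":
    for argv in vnc_pipeline():
        print(" ".join(argv))
```

Each list could be passed to `subprocess.Popen` inside the container; the sketch only builds the commands so that it stays runnable on machines without these tools installed.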

Performance & Evaluation

GAIA Benchmark Results

Phoenix Agent has been evaluated on the General AI Assistants (GAIA) benchmark, demonstrating strong performance across multiple AI models. Our hierarchical multi-flow architecture achieves:

Metric	Score
Level 1	73.6%
Level 2	62.37%
Level 3	34.59%
Overall	22.45%

These results suggest that a hierarchical multi-flow design with task-specific workflows can perform well on GAIA-style tasks without task-specific fine-tuning, which supports moving beyond generic agent architectures.

Evaluated Models

Phoenix Agent supports and has been tested with the following AI models:

GLM 4.5
GLM 4.5 Air
Gemini 2.5 Pro
GPT 4.1
Sense Voice Small

For detailed experimental setup, methodology, and analysis, please refer to our technical report.

Documentation

Research & Architecture

Technical Report - Complete technical report on hierarchical multi-flow architecture, methodology, and experimental results
Architecture Design - Detailed system architecture and workflow design

Setup & Configuration

Deployment Guide - Step-by-step deployment instructions for production and development environments
Environment Variables Guide - Complete reference for all configuration options

Authors

Phoenix Agent is developed by:

Yufeng Lin
Yuzhong Zhang
Liwei Liu
Yimeng Teng
Wentao Lin
Yao Li
Ming Wen
Xuhuan Shen

Contact: {yufenglin, yuzhongzhang, liweiliu, yimengteng, wentaolin, yaoli, mingwen, xuhuanshen}@link.cuhk.edu.cn

Acknowledgments

ai-manus - Original project inspiration
