AgenticX-GUIAgent: Autonomous Mobile GUI Agent System on AgenticX

AgenticX-GUIAgent is a multi-agent system built on agenticx (v0.2.1). It automates complex Android GUI operations from natural-language instructions, combining multimodal reasoning, knowledge management, and a learning loop for continuous improvement.

Key Features

Multi-agent collaboration: Manager, Executor, ActionReflector, and Notetaker agents work together.
Knowledge-driven: A shared knowledge pool improves success rates over time.
Learning loop: A data flywheel enables continuous optimization.
Multimodal understanding: Reasoning over screenshots and UI context.
Extensible: Easy to add agents, tools, and capabilities.

Architecture

The system is layered and modular. Core components are:

graph TD
    subgraph "User/Developer"
        A[User/Developer]
    end

    subgraph "AgenticX-GUIAgent System (core.system.AgenticXGUIAgentSystem)"
        B(AgenticXGUIAgentSystem)
        C(AgenticX Agent)
        D(Workflow Engine)
        E(Event Bus)
        F(InfoPool)

        B -- "manage" --> C
        B -- "orchestrate" --> D
        B -- "use" --> E
        B -- "use" --> F
    end

    subgraph "Agents (agents)"
        G(ManagerAgent)
        H(ExecutorAgent)
        I(ActionReflectorAgent)
        J(NotetakerAgent)

        C -- "contains" --> G
        C -- "contains" --> H
        C -- "contains" --> I
        C -- "contains" --> J
    end

    subgraph "Core Coordination (core)"
        K(AgentCoordinator)
        L(BaseAgenticXGUIAgentAgent)

        D -- "execute" --> K
        K -- "coordinate" --> G
        K -- "coordinate" --> H
        K -- "coordinate" --> I
        K -- "coordinate" --> J
        C -- "inherit" --> L
    end

    subgraph "Knowledge (knowledge)"
        M(KnowledgeManager)
        N(KnowledgePool)
        O(HybridEmbeddingManager)

        F -- "contain" --> M
        F -- "contain" --> N
        M -- "use" --> O
        J -- "write" --> N
        G -- "read" --> N
        H -- "read" --> N
        I -- "read" --> N
    end

    subgraph "Learning (learning)"
        P(RLEnhancedLearningEngine)
        Q(LearningCoordinator)
        R(RL Core)
        S(Data Flywheel)

        P -- "include" --> Q
        P -- "include" --> R
        R -- "drive" --> S

        subgraph "RL Core Components"
            R1(MobileGUIEnvironment)
            R2(MultimodalStateEncoder)
            R3(Policy Networks)
            R4(ExperienceReplayBuffer)
            R5(RewardCalculator)
            R6(PolicyUpdater)
        end

        R -- "contain" --> R1
        R -- "contain" --> R2
        R -- "contain" --> R3
        R -- "contain" --> R4
        R -- "contain" --> R5
        R -- "contain" --> R6

        H -- "interact" --> R1
        I -- "feedback" --> R5
        J -- "knowledge" --> R2
        S -- "optimize" --> R3
    end

    A -- "request" --> B
    G -- "decompose" --> H
    H -- "execute" --> I
    I -- "reflect" --> G
    I -- "generate experience" --> R4
    J -- "record" --> M
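
The Manager → Executor → ActionReflector → Notetaker cycle in the diagram can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `KnowledgePool` class here is a hypothetical in-memory stand-in, and `manager_decompose`, `executor_run`, `reflector_check`, and `run_task` are invented names that replace LLM calls and device actions with placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgePool:
    """Hypothetical in-memory stand-in for the shared knowledge pool."""
    notes: list = field(default_factory=list)

    def write(self, note: str) -> None:
        self.notes.append(note)

    def read(self, keyword: str) -> list:
        # Case-insensitive substring match; a real pool would use embeddings.
        return [n for n in self.notes if keyword.lower() in n.lower()]


def manager_decompose(task: str) -> list:
    # Placeholder planner: split on " and " instead of calling an LLM.
    return [step.strip() for step in task.split(" and ") if step.strip()]


def executor_run(step: str) -> dict:
    # Placeholder executor: pretend every GUI action succeeds.
    return {"step": step, "success": True}


def reflector_check(result: dict) -> bool:
    # Placeholder reflection: trust the executor's own success flag.
    return bool(result["success"])


def run_task(task: str, pool: KnowledgePool) -> list:
    results = []
    for step in manager_decompose(task):
        result = executor_run(step)
        result["verified"] = reflector_check(result)
        # Notetaker role: record the outcome so later tasks can read it.
        pool.write(f"{step}: {'ok' if result['verified'] else 'failed'}")
        results.append(result)
    return results


pool = KnowledgePool()
results = run_task("Open WeChat and send a message to Alice", pool)
```

After the run, `pool.read("wechat")` returns the note recorded for the first step, which is the sense in which the other agents can later read what the Notetaker wrote.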

Directory Structure

This section covers the root files and key subdirectories under AgenticX-GUIAgent/.

Root Files

.gitignore: Git ignore rules.
LICENSE: License.
README.md: Project documentation (this file).
README_zn.md: Chinese documentation.
requirements.txt: Python dependencies.
setup.sh: Automated environment setup script.
config.yaml: Default runtime configuration (LLM, knowledge, learning, evaluation).
config.py: Configuration data models and validation.
main.py: System entry point (initialization, execution, interactive mode).
utils.py: Common utilities (logging, config loading, retry, JSON).
check_adb.py: ADB diagnostics for device connectivity.
cli_knowledge_manager.py: Knowledge base CLI (status/query/export).

Key Directories

agents/: Core agent implementations (Manager/Executor/Reflector/Notetaker).
core/: Core components (base agent, InfoPool, context, coordinator).
tools/: GUI tools and executor (ADB/basic/smart tools).
knowledge/: Knowledge management (storage, retrieval, embeddings).
learning/: Learning engine (five-stage learning + RL core).
evaluation/: Evaluation framework (metrics, benchmarks, reports).
workflows/: Multi-agent collaboration workflow orchestration.
docker/: Docker/Compose configs and optional services.
tests/: Test cases and test resources.

Requirements

Hardware

CPU: 4+ cores
Memory: 8GB+ (16GB recommended)
Storage: 10GB free
Android device: Android 8.0+ with ADB debugging enabled

Software

Python 3.9+
Conda (Anaconda/Miniconda)
ADB (Android Debug Bridge)
Git

Setup

We provide both automated and manual setup.

1. Automated Setup (Recommended)

Run setup.sh to prepare the environment and dependencies:

bash setup.sh

This script will:

Check that Conda, ADB, and Python are available.
Create the agenticx-guiagent Conda environment.
Install dependencies and AgenticX (editable mode).
Create a run.sh launcher.
Create .env from .env.example if that file exists; otherwise, create .env manually.

2. Manual Setup

If you prefer manual setup, follow the steps below.

Step 1: Create environment

conda create -n agenticx-guiagent python=3.9 -y
conda activate agenticx-guiagent

Step 2: Install dependencies

pip install --upgrade pip
pip install -r requirements.txt

# Install AgenticX (editable)
# cd /path/to/AgenticX
# pip install -e .

# Optional mobile control tools
pip install adbutils pure-python-adb

Step 3: Configure environment variables

Create a .env file (or use your own environment variable manager). The default config.yaml uses Bailian; adjust as needed.

nano .env

Example .env:

# Bailian (default in config.yaml)
BAILIAN_API_KEY=your_bailian_api_key
BAILIAN_CHAT_MODEL=qwen-vl-max
BAILIAN_API_BASE=https://dashscope.aliyuncs.com/compatible-mode/v1
BAILIAN_EMBEDDING_MODEL=text-embedding-v4

# Optional app settings
DEBUG=true
LOG_LEVEL=INFO
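
Before starting the system, it can help to confirm that the .env file defines the variables shown above. The following is a minimal sketch using only the standard library. The helper names `parse_env` and `missing_keys` are hypothetical, and treating all four BAILIAN_* variables as required is an assumption, not something the project is known to enforce.

```python
# Assumption: these are the keys the default config.yaml expects.
REQUIRED_KEYS = (
    "BAILIAN_API_KEY",
    "BAILIAN_CHAT_MODEL",
    "BAILIAN_API_BASE",
    "BAILIAN_EMBEDDING_MODEL",
)


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict, required=REQUIRED_KEYS) -> list:
    """Return required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


sample = """
# Bailian (default in config.yaml)
BAILIAN_API_KEY=your_bailian_api_key
BAILIAN_CHAT_MODEL=qwen-vl-max
BAILIAN_API_BASE=https://dashscope.aliyuncs.com/compatible-mode/v1
LOG_LEVEL=INFO
"""
env = parse_env(sample)
missing = missing_keys(env)
```

To check a real file, pass its contents instead of `sample`, for example `parse_env(open(".env", encoding="utf-8").read())`.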

Step 4: Prepare Android device and ADB

Enable Developer Options:
- Settings → About phone → tap "Build number" 7 times
Enable USB debugging:
- Settings → Developer options → USB debugging
- Settings → Developer options → USB installation
Connect device:
- Connect via USB
- Authorize USB debugging on device

Verify ADB:

adb version
adb start-server
adb devices
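
The output of adb devices can also be checked from Python. The sketch below is not the implementation of check_adb.py. It only illustrates one way to parse the listing, and the function names `parse_adb_devices` and `connected_devices` are hypothetical.

```python
import subprocess


def parse_adb_devices(output: str) -> dict:
    """Map each device serial to its state (e.g. device, unauthorized, offline)."""
    devices = {}
    for line in output.splitlines():
        line = line.strip()
        # Skip the header line and adb daemon status messages.
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices[parts[0]] = parts[1]
    return devices


def connected_devices() -> dict:
    """Run `adb devices` and parse its output; requires adb on PATH."""
    result = subprocess.run(
        ["adb", "devices"], capture_output=True, text=True, check=True
    )
    return parse_adb_devices(result.stdout)


sample_output = "List of devices attached\nemulator-5554\tdevice\nR58M123ABC\tunauthorized\n\n"
parsed = parse_adb_devices(sample_output)
ready = [serial for serial, state in parsed.items() if state == "device"]
```

A device in the unauthorized state usually means the USB debugging prompt on the phone has not been accepted yet.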

Run

1. Start AgenticX-GUIAgent

Make sure your Android device is connected.

Interactive mode:

./run.sh --interactive
# or
python main.py --interactive

Single-task mode:

./run.sh --task "Open WeChat and send a message to Alice"
# or
python main.py --task "Open WeChat and send a message to Alice"

2. Other options

# Enable evaluation
python main.py --task "Open Settings" --evaluate

# Use custom config
python main.py --config custom_config.yaml

# Set log level
python main.py --log-level DEBUG
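
The flags above could be declared with argparse along these lines. This is a sketch of the command-line surface only, not the actual contents of main.py. The default values, the help strings, and the list of log levels are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--interactive", action="store_true",
                        help="start an interactive session")
    parser.add_argument("--task", default=None,
                        help="natural-language task to run once")
    parser.add_argument("--evaluate", action="store_true",
                        help="enable evaluation for the run")
    # Assumed default: the config.yaml listed under Root Files.
    parser.add_argument("--config", default="config.yaml",
                        help="path to a configuration file")
    # Assumed default and choices: standard Python logging levels.
    parser.add_argument("--log-level", default="INFO",
                        choices=["DEBUG", "INFO", "WARNING", "ERROR"],
                        help="logging verbosity")
    return parser


args = build_parser().parse_args(["--task", "Open Settings", "--evaluate"])
```

Note that argparse exposes `--log-level` as `args.log_level`, with the hyphen converted to an underscore.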

Examples

Example 1: Send a WeChat message

"Send a WeChat message to Jennifer: I will be home for dinner tonight."

Typical flow:

Manager decomposes the task
Executor performs GUI actions
ActionReflector validates results
Notetaker records knowledge

Example 2: Set an alarm

"Set an alarm for 8:00 AM tomorrow with note: meeting"

Typical flow: Open Clock → Create alarm → Set time → Add note → Save

Example 3: Multi-step app task

"Open TikTok, search for food videos, like the top 3."

Typical flow: Launch app → Search → Browse results → Like top 3

Docker Deployment

We provide Docker and Docker Compose configs for containerized environments.

Enter the docker directory:
```
cd docker
```
Configure environment variables:
```
cp env.example .env
nano .env
```
Start services:
```
docker-compose up --build
```
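Note: To reach a USB-connected device from inside a container, you may need to run it with --privileged and mount the USB bus with -v /dev/bus/usb:/dev/bus/usb (in a Compose file, the equivalent settings are privileged: true and a volumes entry).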

See docker/README.md for details.

Troubleshooting

Common issues

ADB connection fails:

adb kill-server
adb start-server
adb devices

Dependency install fails:

conda activate agenticx-guiagent
pip install --upgrade pip
pip cache purge
pip install -r requirements.txt --force-reinstall

LLM API failures:
- Check API keys in .env.
- Ensure network access.
- Verify account quota.
Device actions fail:
- Ensure device is unlocked.
- Ensure target app is installed.
- Ensure USB debugging is enabled.
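
Transient LLM API failures (timeouts, rate limits) are often handled by retrying with exponential backoff. The sketch below shows the general technique. It is not the retry helper from utils.py, and the `retry` signature, the attempt count, and the delays are assumptions chosen for illustration.

```python
import time


def retry(func, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call func(), retrying on any exception with exponential backoff."""
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except Exception as exc:  # broad on purpose for this sketch
            last_error = exc
            if attempt < attempts - 1:
                # Delay doubles each time: base_delay, 2*base_delay, ...
                sleep(base_delay * (2 ** attempt))
    raise last_error


# Usage: a fake call that fails twice, then succeeds.
calls = {"n": 0}
delays = []


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


# Record delays in a list instead of actually sleeping.
outcome = retry(flaky, sleep=delays.append)
```

In real use, it is usually better to catch only the exception types that indicate a transient error, so that invalid API keys or exhausted quota fail immediately.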

Logs

python main.py --log-level DEBUG

# Tail file logs if configured:
# tail -f logs/agenticx-guiagent.log

Development & Testing

Run tests:
```
pytest
```

Code style:

pre-commit install
pre-commit run --all-files

Support & Feedback

Repo: https://github.com/DemonDamon/AgenticX-GUIAgent (replace with actual)
Issues: open a GitHub issue

Note: Make sure environment variables and device connections are configured before running. Start with a simple task first.
