PentestAgent

Overview

PentestAgent is a novel LLM-driven penetration testing framework that automates the intelligence gathering, vulnerability analysis, and exploitation stages of a penetration test, reducing manual intervention.

The framework is modular and consists of the following components:

Reconnaissance Agent: Gathers intelligence about the target system.
Planning Agent: Identifies and prioritizes vulnerabilities and potential exploits.
Execution Agent: Attempts to execute selected exploits in a controlled environment.

For further information, please refer to our paper.

🔧 Installation & Setup

Note: We recommend deploying this project on a Kali Linux environment for better compatibility with penetration testing tools and workflows.

1. Clone the Repository

git clone https://github.com/nbshenxm/pentest-agent.git
cd pentest-agent

2. Set Environment Variables

Several environment variables must be set before running the agents. If you are not familiar with environment variables, you can define them in the .env file instead.

Required:

PDCP_API_KEY: ProjectDiscovery API key for accessing CVE data and vulnerability information.
GITLAB_TOKEN: GitLab token for ExploitDB access.
GITHUB_KEY: GitHub token for searching repositories and issues.
INDEX_STORAGE_DIR: Directory to store vector indexes for RAG.
PLANNING_OUTPUT_DIR: Directory to save planning results.
LOG_DIR: Directory to store logs.

Optional:

http_proxy, https_proxy: If using a proxy or VPN.
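
Before launching an agent, it can help to confirm that nothing in the required list is missing. The following is a minimal sketch, not part of the project itself: the helper name `missing_vars` is hypothetical, and the variable names are copied from the list above.

```python
import os
from typing import Mapping

# Names copied from the "Required" list above.
REQUIRED_VARS = (
    "PDCP_API_KEY",
    "GITLAB_TOKEN",
    "GITHUB_KEY",
    "INDEX_STORAGE_DIR",
    "PLANNING_OUTPUT_DIR",
    "LOG_DIR",
)


def missing_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variables that are unset or blank in `env`.

    Hypothetical helper for illustration; PentestAgent may perform
    its own checks at startup.
    """
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

Calling `missing_vars()` with no arguments inspects the current process environment; an empty list means every required variable has a non-blank value.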

3. Install Python Dependencies

Python version: 3.12

Use a virtual environment:

python3 -m venv .venv
source .venv/bin/activate
python -m pip install -r requirements.txt

or with Conda:

conda create -n pentest python=3.12
conda activate pentest
python -m pip install -r requirements.txt

4. Install CVEMAP

CVEMAP is needed to fetch CVE-related information. Follow the installation instructions provided by the CVEMAP project.

⚙️ Configuration

File: `pentest_agent/configs/config.yaml`

(1) models

Specify the LLM provider, model name, temperature, and API key.

(2) cve

Set the model used for parsing CVE entries and its generation temperature.

(3) cve_scoring

Scoring criteria for evaluating CVEs:

Vulnerability type
Exploit maturity
Remote exploitability
Attack complexity
Source weighting (ExploitDB, GitHub, Google)
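
To illustrate how criteria like these could be combined into a single ranking value, here is a minimal weighted-sum sketch. It is an assumption for illustration only: the criterion keys, the weights, and the `weighted_score` function are hypothetical, and the actual scoring logic and values are whatever `config.yaml` and the Planning Agent define.

```python
# Hypothetical criterion names and weights, for illustration only.
# The real keys and values live in the cve_scoring section of config.yaml.
EXAMPLE_WEIGHTS = {
    "vuln_type": 0.25,
    "exploit_maturity": 0.25,
    "remote_exploitability": 0.20,
    "attack_complexity": 0.15,
    "source": 0.15,
}


def weighted_score(
    ratings: dict[str, float],
    weights: dict[str, float] = EXAMPLE_WEIGHTS,
) -> float:
    """Combine per-criterion ratings in [0, 1] into one score in [0, 1].

    Criteria missing from `ratings` count as 0. The result is normalized
    by the total weight, so the weights need not sum to exactly 1.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    score = sum(w * ratings.get(name, 0.0) for name, w in weights.items())
    return score / total_weight
```

A CVE rated highly on every criterion scores close to 1, while one with no supporting evidence scores 0, so candidates can be sorted by this value in descending order.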

(4) runtime

Reconnaissance Agent:

current_topic: Topic identifier for current CVE task.
target_ip: IP address of the target host.

Planning Agent:

model: LLM model used for searching exploits and analyzing vulnerability data.
keyword, app, version: Target application details.
vuln_type: Type of vulnerability to focus on.
cvemap_fuzzy_search: Enable fuzzy search for CVE matching.
output_dir: Directory to save analysis results.

Execution Agent:

current_topic: Task/topic identifier.
doc_dir: Directory containing exploit scripts or documents.
target_ip, target_port: IP and port of target host.
attacker_ip: IP of attacker's machine.
command_to_execute: Payload to validate exploitation.
model: LLM model used for exploit execution guidance.
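
A typo in `target_ip`, `target_port`, or `attacker_ip` is easy to make and only shows up once an agent is already running. The sketch below shows one way to sanity-check these values beforehand using Python's standard `ipaddress` module. The function name `validate_target` is hypothetical and not part of the project.

```python
import ipaddress


def validate_target(target_ip: str, target_port: int, attacker_ip: str) -> list[str]:
    """Return a list of problems; an empty list means the values look sane.

    Hypothetical helper for illustration. It only checks syntax and
    range; it does not test whether the hosts are reachable.
    """
    problems = []
    for label, value in (("target_ip", target_ip), ("attacker_ip", attacker_ip)):
        try:
            ipaddress.ip_address(value)  # accepts IPv4 and IPv6 literals
        except ValueError:
            problems.append(f"{label} is not a valid IP address: {value!r}")
    if not isinstance(target_port, int) or not 1 <= target_port <= 65535:
        problems.append(f"target_port must be an integer in 1..65535, got {target_port!r}")
    return problems
```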

🚀 Running the Agents (Manual)

Reconnaissance Agent

File: pentest_agent/agents/recon_agent.py
Function: Given a target IP, gathers system and service info.
Usage: Set the topic, LLM model, and IP, then run the script.

python pentest_agent/agents/recon_agent.py

Planning Agent

File: pentest_agent/agents/planning_agent.py
Function: Identifies relevant CVEs and associated exploits from multiple sources.
Sources:
- GitHub repositories and issues
- ExploitDB entries
- Google search results
Features: Multi-source intelligence aggregation with configurable LLM backends
Usage: Set the model and application information.

python pentest_agent/agents/planning_agent.py

Execution Agent

File: pentest_agent/agents/execution_agent.py
Function: Executes selected exploits based on previous analysis and collected context.
Usage: Set the topic, exploit document path, and target info.

python pentest_agent/agents/execution_agent.py
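
If you prefer to launch the three scripts from a single entry point, a small wrapper along these lines may be convenient. This is a sketch under assumptions: it presumes `config.yaml` is already filled in for all three agents, that it is run from the repository root, and that running the agents back to back suits your setup (you may need to review the planning output before executing anything). The names `build_commands` and `run_all` are hypothetical.

```python
import subprocess
import sys

# Script paths as listed in the sections above.
AGENT_SCRIPTS = [
    "pentest_agent/agents/recon_agent.py",
    "pentest_agent/agents/planning_agent.py",
    "pentest_agent/agents/execution_agent.py",
]


def build_commands(python: str = sys.executable) -> list[list[str]]:
    """Return one argv list per agent, in recon -> planning -> execution order."""
    return [[python, script] for script in AGENT_SCRIPTS]


def run_all() -> None:
    """Run each agent in turn, stopping at the first failure."""
    for cmd in build_commands():
        # check=True raises CalledProcessError if an agent exits non-zero,
        # so later stages never run on top of a failed earlier stage.
        subprocess.run(cmd, check=True)
```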

🐳 Docker Deployment

PentestAgent provides Docker support for isolated execution of each agent.

0. Pre-Configuration

Step 1: Edit `pentest_agent/configs/config.yaml`

Configure all agent parameters under the models, cve, cve_scoring, and runtime sections.

Step 2: Configure `.env` in `pentest_agent/docker`

Example .env content:

GITHUB_KEY=your_github_token
OPENAI_API_KEY=your_openai_key
HUGGING_FACE_TOKEN=your_hf_token
INDEX_STORAGE_DIR=/path/to/indexes
PLANNING_OUTPUT_DIR=/path/to/output
LOG_DIR=/path/to/logs
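
Because the example above uses placeholder values, it is easy to start the containers with one of them still in place. The sketch below parses simple `KEY=VALUE` lines and flags entries that are empty or still look like the placeholders shown above. Both function names are hypothetical, and the parser deliberately handles only the plain format used in this example, not every `.env` syntax that Docker Compose accepts.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of optional quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def unfilled_keys(values: dict[str, str]) -> list[str]:
    """Return keys whose value is empty or still matches the example placeholders."""
    return [
        key
        for key, value in values.items()
        if not value or value.startswith("your_") or value.startswith("/path/to")
    ]
```

For example, `unfilled_keys(parse_env(open(".env").read()))` returns the keys that still need real values.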

1. Start and Run Reconnaissance Agent

cd pentest_agent/docker
docker-compose up --build -d recon

2. Start and Run Planning Agent

cd pentest_agent/docker
docker-compose up --build -d planning

3. Start and Run Execution Agent

cd pentest_agent/docker
docker-compose up --build -d execution

📊 Benchmark & Evaluation

Infrastructure

We use Vulhub to evaluate the system. Vulhub provides Docker-based vulnerable environments built around real-world CVEs.

Target Selection

We select vulnerabilities based on the following criteria:

Must have a valid CVE ID
Must include a CVSS v3.x score
Additional labels include:
- CWE ID
- Exploitability sub-score
- Difficulty levels derived from the CVSS vector
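
The first two criteria can be checked mechanically. The sketch below validates the CVE ID format and splits a CVSS v3.x vector string into its metrics. The function names are hypothetical, and the sketch deliberately stops short of assigning a difficulty level, since the mapping from vector to difficulty is not specified here.

```python
import re

# CVE IDs have the form CVE-<4-digit year>-<4 or more digits>.
CVE_ID_RE = re.compile(r"CVE-\d{4}-\d{4,}")
CVSS3_PREFIXES = ("CVSS:3.0/", "CVSS:3.1/")


def is_valid_cve_id(cve_id: str) -> bool:
    """Check the syntactic form of a CVE ID (not whether it exists)."""
    return CVE_ID_RE.fullmatch(cve_id) is not None


def parse_cvss3_vector(vector: str) -> dict[str, str]:
    """Split a CVSS v3.x vector into {metric: value}, e.g. {"AV": "N", ...}.

    Raises ValueError for anything that is not a v3.0/v3.1 vector.
    """
    if not vector.startswith(CVSS3_PREFIXES):
        raise ValueError(f"not a CVSS v3.x vector: {vector!r}")
    metrics: dict[str, str] = {}
    for part in vector.split("/")[1:]:  # skip the "CVSS:3.x" prefix
        name, sep, value = part.partition(":")
        if not sep or not name or not value:
            raise ValueError(f"malformed metric: {part!r}")
        metrics[name] = value
    return metrics
```

Metrics such as `AC` (attack complexity), `PR` (privileges required), and `UI` (user interaction) from the parsed vector are the natural inputs for a difficulty label.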

Our results

It has been a while since we performed our evaluation. We are working on adding new scenarios to the benchmark beyond Vulhub, and on evaluating PentestAgent with a variety of advanced LLM backbones. We will publish our benchmark results once this work is finished.

🤝 Contribution

Feel free to open an issue if you:

Encounter any bugs
Have suggestions for improvement
Would like to contribute features or benchmarks

We welcome community contributions!
