PentestAgent

Overview

PentestAgent is a novel LLM-driven penetration testing framework that automates the intelligence gathering, vulnerability analysis, and exploitation stages, reducing manual intervention.

The framework is modular and consists of the following components:

  • Reconnaissance Agent: Gathers intelligence about the target system.
  • Planning Agent: Identifies and prioritizes vulnerabilities and potential exploits.
  • Execution Agent: Attempts to execute selected exploits in a controlled environment.

For further information, please refer to our paper.


🔧 Installation & Setup

Note: We recommend deploying this project on a Kali Linux environment for better compatibility with penetration testing tools and workflows.

1. Clone the Repository

git clone https://github.com/nbshenxm/pentest-agent.git
cd pentest-agent

2. Set Environment Variables

Several environment variables must be set. If you are not familiar with environment variables, set them in the .env file.

Required:

  • PDCP_API_KEY: ProjectDiscovery API key for accessing CVE data and vulnerability information.
  • GITLAB_TOKEN: GitLab token for ExploitDB access.
  • GITHUB_KEY: GitHub token for searching repositories and issues.
  • INDEX_STORAGE_DIR: Directory to store vector indexes for RAG.
  • PLANNING_OUTPUT_DIR: Directory to save planning results.
  • LOG_DIR: Directory to store logs.

Optional:

  • http_proxy, https_proxy: If using a proxy or VPN.
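
Before running any agent, it can help to confirm that the required variables are actually set. The following is a minimal sketch, not part of PentestAgent itself: the helper name missing_env_vars is hypothetical, and it only inspects the process environment (it does not read the .env file for you).

```python
import os

# Variable names are taken from the "Required" list above.
REQUIRED_VARS = [
    "PDCP_API_KEY",
    "GITLAB_TOKEN",
    "GITHUB_KEY",
    "INDEX_STORAGE_DIR",
    "PLANNING_OUTPUT_DIR",
    "LOG_DIR",
]


def missing_env_vars(env=None):
    """Return the required variable names that are unset or blank.

    `env` defaults to the current process environment; passing a dict
    makes the check easy to try out without touching real settings.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Running the script after activating your environment lists any variable that still needs a value.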

3. Install Python Dependencies

Python version: 3.12

Use a virtual environment:

python3 -m venv .venv
source .venv/bin/activate
python -m pip install -r requirements.txt

or with Conda:

conda create -n pentest python=3.12
conda activate pentest
python -m pip install -r requirements.txt

4. Install CVEMAP

CVEMAP is needed to fetch CVE-related information. Follow its installation instructions.


βš™οΈ Configuration

File: pentest_agent/configs/config.yaml

(1) models

Specify the LLM provider, model name, temperature, and API key.

(2) cve

Set the model used for parsing CVE entries and its generation temperature.

(3) cve_scoring

Scoring criteria for evaluating CVEs:

  • Vulnerability type
  • Exploit maturity
  • Remote exploitability
  • Attack complexity
  • Source weighting (ExploitDB, GitHub, Google)
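
To illustrate how criteria like these could combine into a single ranking score, here is a rough sketch using a weighted sum. Everything in it is an assumption made for illustration: the weight values, the key names, and the score_cve function are hypothetical and do not describe how PentestAgent actually scores CVEs. The real criteria and weights are whatever config.yaml defines.

```python
# Hypothetical weights for illustration only; the real values live in
# config.yaml and may be structured differently.
CRITERIA_WEIGHTS = {
    "vuln_type": 0.25,
    "exploit_maturity": 0.25,
    "remote_exploitability": 0.2,
    "attack_complexity": 0.1,
    "source": 0.2,
}

# Hypothetical trust level per exploit source.
SOURCE_WEIGHTS = {"exploitdb": 1.0, "github": 0.7, "google": 0.4}


def score_cve(ratings, sources):
    """Combine per-criterion ratings in [0, 1] into one score in [0, 1].

    `ratings` maps the first four criteria to a rating; `sources` lists
    where exploits were found. The source rating is the highest weight
    among the listed sources (0.0 if none are known).
    """
    source_rating = max(
        (SOURCE_WEIGHTS.get(name, 0.0) for name in sources), default=0.0
    )
    total = CRITERIA_WEIGHTS["source"] * source_rating
    for name, weight in CRITERIA_WEIGHTS.items():
        if name != "source":
            total += weight * ratings.get(name, 0.0)
    return total
```

With this scheme, two CVEs with identical ratings rank differently depending on where their exploits were found, which is the effect source weighting is meant to capture.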

(4) runtime

Reconnaissance Agent:

  • current_topic: Topic identifier for current CVE task.
  • target_ip: IP address of the target host.

Planning Agent:

  • model: LLM model used for searching exploits and analyzing vulnerability data.
  • keyword, app, version: Target application details.
  • vuln_type: Type of vulnerability to focus on.
  • cvemap_fuzzy_search: Enable fuzzy search for CVE matching.
  • output_dir: Directory to save analysis results.

Execution Agent:

  • current_topic: Task/topic identifier.
  • doc_dir: Directory containing exploit scripts or documents.
  • target_ip, target_port: IP and port of target host.
  • attacker_ip: IP of attacker's machine.
  • command_to_execute: Payload to validate exploitation.
  • model: LLM model used for exploit execution guidance.
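
A quick way to catch an incomplete runtime section is to compare it against the keys listed above. The sketch below assumes the runtime settings have already been loaded into a dictionary. The key names come from the lists above, but the grouping names ("recon", "planning", "execution") and the helper missing_runtime_keys are hypothetical and may not match the actual layout of config.yaml.

```python
# Key names come from the lists above; the grouping names are
# hypothetical and may not match the real config.yaml layout.
EXPECTED_RUNTIME_KEYS = {
    "recon": ["current_topic", "target_ip"],
    "planning": [
        "model", "keyword", "app", "version",
        "vuln_type", "cvemap_fuzzy_search", "output_dir",
    ],
    "execution": [
        "current_topic", "doc_dir", "target_ip", "target_port",
        "attacker_ip", "command_to_execute", "model",
    ],
}


def missing_runtime_keys(runtime):
    """Return {agent: [missing keys]} for agents with incomplete settings."""
    problems = {}
    for agent, keys in EXPECTED_RUNTIME_KEYS.items():
        section = runtime.get(agent) or {}
        missing = [key for key in keys if key not in section]
        if missing:
            problems[agent] = missing
    return problems
```

An empty result means every listed key is present for all three agents; it does not check that the values themselves are sensible.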

🚀 Running the Agents (Manual)

Reconnaissance Agent

  • File: pentest_agent/agents/recon_agent.py
  • Function: Given a target IP, gathers system and service info.
  • Usage: Set the topic, LLM model, and IP, then run the script.

python pentest_agent/agents/recon_agent.py

Planning Agent

  • File: pentest_agent/agents/planning_agent.py
  • Function: Identifies relevant CVEs and associated exploits from multiple sources.
  • Sources:
    • GitHub repositories and issues
    • ExploitDB entries
    • Google search results
  • Features: Multi-source intelligence aggregation with configurable LLM backends
  • Usage: Set the model and application information, then run the script.

python pentest_agent/agents/planning_agent.py

Execution Agent

  • File: pentest_agent/agents/execution_agent.py
  • Function: Executes selected exploits based on previous analysis and collected context.
  • Usage: Set the topic, exploit document path, and target info, then run the script.

python pentest_agent/agents/execution_agent.py

🐳 Docker Deployment

PentestAgent provides Docker support for isolated execution of each agent.

0. Pre-Configuration

Step 1: Edit pentest_agent/configs/config.yaml

Configure all agent parameters under the models, cve, cve_scoring, and runtime sections.

Step 2: Configure .env in pentest_agent/docker

Example .env content:

GITHUB_KEY=your_github_token
OPENAI_API_KEY=your_openai_key
HUGGING_FACE_TOKEN=your_hf_token
INDEX_STORAGE_DIR=/path/to/indexes
PLANNING_OUTPUT_DIR=/path/to/output
LOG_DIR=/path/to/logs

1. Start and Run Reconnaissance Agent

cd pentest_agent/docker
docker-compose up --build -d recon

2. Start and Run Planning Agent

cd pentest_agent/docker
docker-compose up --build -d planning

3. Start and Run Execution Agent

cd pentest_agent/docker
docker-compose up --build -d execution

📊 Benchmark & Evaluation

Infrastructure

We adopt Vulhub for evaluating the system. Vulhub provides Docker-based vulnerable environments with real-world CVEs.

Target Selection

We select vulnerabilities based on the following criteria:

  • Must have a valid CVE ID
  • Must include a CVSS v3.x score
  • Additional labels include:
    • CWE ID
    • Exploitability sub-score
    • Difficulty levels derived from the CVSS vector
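
The two mandatory criteria can be checked mechanically. The sketch below is illustrative rather than the benchmark's actual selection code: the entry fields (cve_id, cvss_vector) and both function names are hypothetical, it treats the presence of a CVSS v3.0 or v3.1 vector string as a stand-in for "has a CVSS v3.x score", and the rule mapping Attack Complexity to a difficulty label is an assumption about how difficulty might be derived from the vector.

```python
import re

# CVE IDs have the form CVE-<year>-<sequence of 4 or more digits>.
CVE_ID = re.compile(r"^CVE-\d{4}-\d{4,}$")

# CVSS v3.x vector strings start with one of these prefixes.
CVSS_V3_PREFIXES = ("CVSS:3.0/", "CVSS:3.1/")


def is_eligible(entry):
    """Check for a well-formed CVE ID and a CVSS v3.x vector string."""
    cve_id = entry.get("cve_id", "")
    vector = entry.get("cvss_vector", "")
    return bool(CVE_ID.match(cve_id)) and vector.startswith(CVSS_V3_PREFIXES)


def difficulty(vector):
    """Hypothetical rule: map Attack Complexity (AC) to a difficulty label."""
    metrics = dict(
        part.split(":", 1) for part in vector.split("/")[1:] if ":" in part
    )
    return {"L": "easy", "H": "hard"}.get(metrics.get("AC"), "unknown")
```

A real labelling scheme would likely consider more of the vector than Attack Complexity alone, for example Privileges Required and User Interaction.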

Our results

It has been a while since we performed our evaluation. We are working on adding new scenarios to the benchmark in addition to Vulhub, and on evaluating PentestAgent with a variety of advanced LLM backbones. We will publish our benchmark results once this work is finished.


🤝 Contribution

Feel free to open an issue if you:

  • Encounter any bugs
  • Have suggestions for improvement
  • Would like to contribute features or benchmarks

We welcome community contributions!
