This repository contains the code and simulation environments used in the paper:
"Learning Policies for Dynamic Coalition Formation in Multi-Robot Task Allocation"
Lucas C. D. Bezerra, Ataíde M. G. dos Santos, and Shinkyu Park
Accepted: IEEE Robotics and Automation Letters, 2025
This repository contains:
- GyMRT²A Environment: a Gym environment for discrete-space, discrete-time MRTA with multi-robot tasks.
- MARL-MRT²A: a MAPPO-based algorithm that enables learning of decentralized, low-communication, generalizable task allocation policies for MRT²A; this implementation builds upon the EPyMARL codebase.
- PCFA Baseline: our implementation of the decentralized market-based approach PCFA.
To get started, please see the Installation section and run the provided examples.
We propose a decentralized, learning-based framework for dynamic coalition formation in Multi-Robot Task Allocation (MRTA) under partial observability. Our method extends MAPPO with multiple integrated components that allow robots to coordinate and revise task assignments in dynamic, partially observable environments.
- Spatial Action Maps: Agents select task locations in spatial coordinates, enabling long-horizon task planning.
- Robot Motion Planning: Each robot computes a collision-free A* path to the selected task.
- Intention Sharing: Robots share decayed path-based intention maps with nearby agents to support coordination.
- Custom Policy Architecture: We propose a U-Net as the policy architecture, but our code also supports custom architectures that can be implemented as a `torch.nn.Sequential` (see `nn_utils.py` for the currently available modules).
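As a minimal sketch of what such a custom policy could look like (the channel counts, grid size, and layer choices here are illustrative assumptions, not the repository's actual configuration), a `torch.nn.Sequential` can map a multi-channel spatial observation to one action logit per grid cell, i.e., a spatial action map:

```python
import torch
import torch.nn as nn

# Hypothetical shapes: 4 observation channels over an 11x11 local grid.
obs_channels, grid = 4, 11

policy = nn.Sequential(
    nn.Conv2d(obs_channels, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(32, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(32, 1, kernel_size=1),  # one action logit per grid cell
    nn.Flatten(),                     # -> (batch, grid * grid) logits
)

obs = torch.randn(2, obs_channels, grid, grid)  # batch of 2 observations
logits = policy(obs)
print(logits.shape)  # torch.Size([2, 121])
```

Because the padding preserves the spatial resolution, each output logit corresponds to one cell of the observed grid, which is what allows agents to select task locations in spatial coordinates.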
We implement our experiments in a custom Gym environment called GyMRT²A, which simulates:
- Grid-world task allocation with dynamic task spawns (Bernoulli or instant respawn)
- Partial observability (limited view and communication ranges)
- Multi-level tasks requiring varying coalition sizes
- Motion planning with obstacles and other agents
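The motion-planning component above can be pictured with a small A* sketch on a 4-connected grid (a simplification for illustration; the repository adapts the Red Blob Games implementation in `astar.py`, and the grid layout and helper names below are assumptions):

```python
import heapq

def astar(grid, start, goal):
    """Shortest 4-connected path on a grid where True marks an obstacle."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan heuristic
    frontier = [(h(start), 0, start, [start])]  # (f-score, cost, node, path)
    seen = set()
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and not grid[nr][nc]:
                heapq.heappush(
                    frontier,
                    (cost + 1 + h((nr, nc)), cost + 1, (nr, nc), path + [(nr, nc)]),
                )
    return None  # goal unreachable

# A 3x3 grid with a wall in the middle column forces a detour around it.
grid = [[False, True,  False],
        [False, True,  False],
        [False, False, False]]
path = astar(grid, (0, 0), (0, 2))
print(len(path) - 1)  # number of steps, prints 6
```

In the actual environment the planner additionally treats other robots as obstacles, so paths are recomputed as the world changes.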
marl_mrt2a/
├── marl_mrt2a/
│   ├── env/                  # GyMRT²A environment
│   ├── PCFA/                 # Baseline implementation
│   ├── marl/                 # Our method's implementation
│   └── examples/             # Reproducible experiments
│       └── main_comparison/  # Comparison with baseline and ablation studies
├── LICENSE
└── README.md
- Python 3.10 or higher
- PyTorch
- NumPy
- OpenAI Gym
- Clone the repository:
git clone https://github.com/lcdbezerra/marl_mrt2a.git
cd marl_mrt2a
- Create a conda environment:
conda create -n marl_mrt2a python=3.10 -y
conda activate marl_mrt2a
conda install pip -y
- Install the base environment and the baseline (development mode):
cd marl_mrt2a/env
pip install -e .
pip install -U pygame --user
conda install -c conda-forge libstdcxx-ng -y
cd ../PCFA
pip install -e .
cd ../
- Set up Weights & Biases for experiment tracking:
pip install wandb
wandb login
- Install MARL dependencies:
cd marl
pip install -r requirements.txt
Experiments are reproducible through the examples in the examples/ directory:
Compare the proposed method against baseline approaches:
- Traditional task allocation methods
- Standard MAPPO
- Other multi-agent learning approaches
# Run main comparison experiments
python examples/main_comparison/run_comparison.py
If you use this code, please cite:
@article{bezerra2025learningdcfmrta,
author={Lucas C. D. Bezerra and Ataíde M. G. dos Santos and Shinkyu Park},
journal={IEEE Robotics and Automation Letters},
title={Learning Policies for Dynamic Coalition Formation in Multi-Robot Task Allocation},
year={2025},
volume={10},
number={9},
pages={9216-9223},
doi={10.1109/LRA.2025.3592080}}
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
For information about derivative works and third-party components, see the NOTICE file.
This repository includes code under the Apache License 2.0:
- Multi-agent reinforcement learning framework based on EPyMARL and PyMARL
- `astar.py` – A* pathfinding implementation from Red Blob Games, Copyright 2014 Red Blob Games, licensed under the Apache License 2.0. Adapted by Lucas C. D. Bezerra.