Zhi Wang, Li Zhang, Wenhao Wu, Yuanheng Zhu, Dongbin Zhao, Chunlin Chen*
A link to our paper can be found on arXiv
Official codebase for Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement
Experiments require MuJoCo and D4RL. Follow the installation instructions in the [MuJoCo] and [D4RL] repositories.
Create a virtual environment using conda, and install the dependencies listed in requirements.txt:
conda create -n meta_dt python=3.8.18 -y
conda activate meta_dt
pip install -r requirements.txt
Note that we set done = False in all environments, so done = False must also be set manually for the walker and hopper environments in the rand_param_envs package.
We also share our datasets below.
We use SAC to train agents on different environments and collect datasets.
Train agents on different tasks in AntDir:
python train_data_collection.py --env_type ant_dir --save_freq 4000 --task_id_start 0 --task_id_end 5
Here, task_id_start and task_id_end specify the training tasks as the half-open range [task_id_start, task_id_end).
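The half-open convention means the end index itself is excluded. A minimal sketch of how the two flags map to task ids (the helper name `training_task_ids` is hypothetical and not part of the codebase):

```python
from typing import List


def training_task_ids(task_id_start: int, task_id_end: int) -> List[int]:
    """Return the task ids in the half-open range [task_id_start, task_id_end)."""
    if task_id_end < task_id_start:
        raise ValueError("task_id_end must be >= task_id_start")
    return list(range(task_id_start, task_id_end))


# With --task_id_start 0 --task_id_end 5, tasks 0 through 4 are trained.
print(training_task_ids(0, 5))  # [0, 1, 2, 3, 4]
```

Under this convention, consecutive runs such as [0, 5) and [5, 10) cover tasks 0 to 9 without overlap.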
We use checkpoints from the training process to generate datasets.
For medium and expert datasets, use:
python get_datassets.py --env_type ant_dir --data_type medium --task_id_start 0 --task_id_end 5 --capacity 20000
After obtaining the datasets for all tasks, manually merge all task_info_{task_id}.json files into a single file named task_info.json.
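The merge step could be scripted along the following lines. This is only a sketch under stated assumptions: it assumes each task_info_{task_id}.json holds a JSON object whose keys do not collide across tasks, which may not match the actual file layout. The function name `merge_task_info` and its parameters are hypothetical.

```python
import json
from pathlib import Path
from typing import Dict, Iterable


def merge_task_info(data_dir: str, task_ids: Iterable[int],
                    out_name: str = "task_info.json") -> Dict:
    """Merge per-task task_info_{task_id}.json files into one JSON file.

    Assumption: every per-task file contains a JSON object (dict) and the
    keys are unique across tasks. Adapt the merge if the real layout differs.
    """
    root = Path(data_dir)
    merged: Dict = {}
    for task_id in task_ids:
        with open(root / f"task_info_{task_id}.json", "r") as f:
            info = json.load(f)
        if not isinstance(info, dict):
            raise TypeError(f"task_info_{task_id}.json is not a JSON object")
        # Refuse to silently overwrite entries that appear in two files.
        duplicates = merged.keys() & info.keys()
        if duplicates:
            raise KeyError(f"duplicate keys across tasks: {sorted(duplicates)}")
        merged.update(info)
    with open(root / out_name, "w") as f:
        json.dump(merged, f, indent=2)
    return merged
```

For example, `merge_task_info("path/to/datasets", range(0, 5))` would combine the files for tasks 0 to 4, where the directory path is a placeholder.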
For medium-expert datasets, we use a mix of 70% medium and 30% expert datasets.
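As an illustration of the 70/30 split, the sketch below samples from two collections and shuffles the result. It is not the repository's implementation: it assumes each dataset is a sequence of items (for example trajectories), and the function `mix_medium_expert`, its parameters, and the sampling strategy are all hypothetical.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def mix_medium_expert(medium: Sequence[T], expert: Sequence[T], total: int,
                      medium_frac: float = 0.7, seed: int = 0) -> List[T]:
    """Build a mixed dataset with medium_frac medium items and the rest expert.

    Assumption: items are sampled without replacement and then shuffled; the
    actual codebase may mix at a different granularity.
    """
    n_medium = round(total * medium_frac)
    n_expert = total - n_medium
    if n_medium > len(medium) or n_expert > len(expert):
        raise ValueError("not enough items to build the requested mix")
    rng = random.Random(seed)  # local generator keeps the mix reproducible
    mixed = rng.sample(list(medium), n_medium) + rng.sample(list(expert), n_expert)
    rng.shuffle(mixed)
    return mixed
```

With `total=20000`, matching the `--capacity 20000` value used above, this would draw 14000 medium items and 6000 expert items.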
- Our datasets are available via this link: datasets
- Our pretrained world model is available via this link: world_model
Train the context encoder using the world model:
python train_context.py --env_name AntDir-v0
Train the Meta Decision Transformer for few-shot Meta-DT:
python train_meta_dt.py --env_name AntDir-v0 --zero_shot False --data_quality medium
Train the Meta Decision Transformer for zero-shot Meta-DT:
python train_meta_dt.py --env_name AntDir-v0 --zero_shot True --data_quality medium
Please cite our paper as:
@inproceedings{
wang2024metadt,
title={Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement},
author={Zhi Wang and Li Zhang and Wenhao Wu and Yuanheng Zhu and Dongbin Zhao and Chunlin Chen},
booktitle={Advances in Neural Information Processing Systems},
year={2024},
}