🎯 Marco-o1 not only focuses on disciplines with standard answers, such as mathematics, physics, and coding, which are well suited to Reinforcement Learning, but also emphasizes open-ended solutions. Our goal is to build a general model for agentic applications, incorporating comprehensive planning and function-calling capabilities.
-
[Coming Soon] 🏃 Marco-o1 Agentic: A more powerful agentic model is coming soon...
-
[2025/02/09] 🔥 EDPO (Difficulty-Estimated Policy Optimization): We propose an optimization algorithm based on an online data-difficulty selector. To our knowledge, this is the first work on online data selection. Experiments show that, compared with GRPO, EDPO better resists the noise caused by zero-advantage samples, achieving an average performance improvement of 2.4%. The same online selector can also provide multi-scale routing based on prompt difficulty in large-scale online services.
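The paper's exact algorithm is not reproduced here, but the core idea of filtering zero-advantage prompts can be illustrated with a minimal sketch. In GRPO-style training, when every rollout for a prompt receives the same reward, the group-normalized advantage is zero for all of them, so the prompt contributes only noise to the update. All function and variable names below are hypothetical, not from the EDPO codebase:

```python
def select_prompts(rollout_rewards, low=0.0, high=1.0):
    """Keep prompts whose sampled rollouts show mixed outcomes.

    rollout_rewards maps a prompt id to the list of 0/1 rewards from its
    G sampled rollouts. A pass rate of exactly 0 or 1 means all rewards
    are identical, so the GRPO-style group advantage is zero; an online
    difficulty selector can drop such prompts before the policy update.
    Returns the selected prompts with their estimated difficulty.
    """
    selected = {}
    for prompt, rewards in rollout_rewards.items():
        pass_rate = sum(rewards) / len(rewards)  # estimated difficulty
        if low < pass_rate < high:               # mixed outcomes only
            selected[prompt] = pass_rate
    return selected

# Three prompts with G = 4 rollouts each: too easy, useful, too hard.
batch = {
    "p_easy": [1, 1, 1, 1],
    "p_mid":  [1, 0, 1, 0],
    "p_hard": [0, 0, 0, 0],
}
print(select_prompts(batch))  # → {'p_mid': 0.5}
```

The same pass-rate estimate could plausibly drive the multi-scale routing mentioned above, e.g. sending easy prompts to a smaller model in serving.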
-
[2025/02/09] 🔥 The paper A State-Transition Framework for Efficient LLM Reasoning has been accepted by ICLR 2026.
-
[2025/02/09] 🔥 We released Marco-o1 v3. By training a pluggable linear component, MAM (Mixed Attention Module), on top of the existing dense model, we can dynamically compress the model to save context tokens. We also introduced TTT (Test-Time Training), ultimately achieving a 20% reduction in inference cost alongside an average performance improvement of 4.7%.
-
[2025/05/15] 🔥 The paper Marco-o1 v2: Towards Widening The Distillation Bottleneck for Reasoning Models has been accepted by ACL 2025.
-
[2025/02/14] 🔥 We released Marco-o1 v2, which relies entirely on self-built data and has undergone DPO. It has been more comprehensively optimized for mathematical problem-solving, planning, and instruction-following capabilities. 🍬 This time, our model's ability to count letters is quite impressive! 😁
-
[2024/11/13] 🔥 We released Marco-o1 v1, a step towards open reasoning models for open-ended solutions. This release includes our reasoning model, optimized for complex problem-solving and versatile applications across various domains.
To install Marco-o1, follow these steps:
# Clone the repository
git clone https://github.com/AIDC-AI/Marco-o1
# Change to the Marco-o1 directory
cd Marco-o1
# Install required packages
pip install -r requirements.txt
-
Load Marco-o1-CoT model:
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("AIDC-AI/Marco-o1")
model = AutoModelForCausalLM.from_pretrained("AIDC-AI/Marco-o1")
Inference:
Execute the inference script (you can give any customized inputs inside):
./src/output/talk_with_model.py

# Use vLLM
./src/output/talk_with_model_vllm.py
Deploy using FastAPI:
Check the README.md file in the examples folder.
From MarcoPolo Team, AI Business, Alibaba International Digital Commerce:
If you find Marco-o1 useful for your research and applications, please cite:
@misc{zhao2024marcoo1openreasoningmodels,
title={Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions},
author={Yu Zhao and Huifeng Yin and Bo Zeng and Hao Wang and Tianqi Shi and Chenyang Lyu and Longyue Wang and Weihua Luo and Kaifu Zhang},
year={2024},
eprint={2411.14405},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2411.14405},
}
@misc{yin2025wideningdistillationbottleneckreasoning,
title={Marco-o1 v2: Towards Widening The Distillation Bottleneck for Reasoning Models},
author={Huifeng Yin and Yu Zhao and Minghao Wu and Xuanfan Ni and Bo Zeng and Hao Wang and Tianqi Shi and Liangying Shao and Chenyang Lyu and Longyue Wang and Weihua Luo and Kaifu Zhang},
year={2025},
eprint={2503.01461},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2503.01461},
}
@misc{zhang2026statetransitionframeworkefficientllm,
title={A State-Transition Framework for Efficient LLM Reasoning},
author={Liang Zhang and Yu Zhao and Longyue Wang and Tianqi Shi and Weihua Luo and Kaifu Zhang and Jinsong Su},
year={2026},
eprint={2602.01198},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2602.01198},
}
@misc{zhao2026difficultyestimatedpolicyoptimization,
title={Difficulty-Estimated Policy Optimization},
author={Yu Zhao and Fan Jiang and Tianle Liu and Bo Zeng and Yu Liu and Longyue Wang and Weihua Luo},
year={2026},
eprint={2602.06375},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2602.06375},
}
This project is licensed under the Apache License, Version 2.0 (SPDX-License-Identifier: Apache-2.0).
We used compliance-checking algorithms during the training process to ensure, to the best of our ability, the compliance of the trained model and dataset. Due to the complexity of the data and the diversity of language-model usage scenarios, we cannot guarantee that the model is completely free of copyright issues or improper content. If you believe anything infringes on your rights or generates improper content, please contact us, and we will promptly address the matter.

