An MCP Server for competitive programming problem creation, implementing the Validator-Generator-Checker framework from the AutoCode paper.
AutoCode MCP Server provides 14 atomic tools that enable AI assistants to create, validate, and test competitive programming problems. It handles compilation, execution, stress testing, and test data generation—letting the AI focus on problem design and solution logic.
- Validator-Generator-Checker Framework — Automated validation of input correctness, multi-strategy test generation, and output verification based on the AutoCode paper
- 14 Atomic Tools — File operations, solution building, stress testing, validator/generator/checker construction, and more
- testlib.h Support — Full integration with the competitive programming standard library for validators, generators, and checkers
- Multi-Strategy Generation — Four generation strategies: tiny (exhaustive), random, extreme (edge cases), and TLE-inducing
- Stress Testing — Automated comparison between optimal and brute-force solutions with configurable trial counts
- MCP Protocol — Native support for Claude Code, Cursor, and other MCP-compatible AI tools
- Safe Execution — Timeout control, memory limits (Linux), and temporary directory isolation
- Polygon Packaging — Export problems in Polygon format for Codeforces-style platforms
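The Safe Execution feature above can be approximated with standard subprocess timeouts. A minimal Python sketch (illustrative only, not the server's actual implementation; the envelope mirrors the `{success, error, data}` format described later):

```python
import subprocess
import sys
import tempfile

def run_sandboxed(cmd, input_data, timeout_s=2.0):
    """Run a command in a throwaway working directory with a wall-clock timeout."""
    with tempfile.TemporaryDirectory() as workdir:
        try:
            proc = subprocess.run(
                cmd, input=input_data, capture_output=True, text=True,
                timeout=timeout_s, cwd=workdir,
            )
            return {"success": proc.returncode == 0,
                    "error": proc.stderr or None,
                    "data": {"stdout": proc.stdout}}
        except subprocess.TimeoutExpired:
            return {"success": False, "error": "time limit exceeded", "data": None}

# Example: run a tiny "solution" that doubles its input
result = run_sandboxed([sys.executable, "-c", "print(int(input()) * 2)"], "21\n")
```

The temporary directory guarantees that a misbehaving solution cannot leave files behind; memory limits additionally require OS support such as `prlimit` on Linux.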
Install from PyPI:

```bash
pip install autocode-mcp
```

or as a uv tool:

```bash
uv tool install autocode-mcp
```

or from source:

```bash
git clone https://github.com/your-repo/autocode-mcp.git
cd autocode-mcp
uv sync
```

Requirements:

- Python 3.14+
- g++ compiler with C++20 support (GCC 10+ recommended)
- testlib.h (included in templates/)
Verify your setup:

```bash
# Check Python version
python --version

# Check g++ version
g++ --version

# Run tests
uv run pytest tests/ -v
```

Add to your Claude Code configuration (`~/.config/claude-code/config.json`):

```json
{
  "mcpServers": {
    "autocode": {
      "command": "autocode-mcp"
    }
  }
}
```

In Claude Code, simply ask:
"Create a competitive programming problem: Given two integers A and B, output their sum."
Claude will use AutoCode tools to:
- Generate problem statement
- Implement solutions (optimal + brute force)
- Build validator and generator
- Run stress tests
- Generate final test data
You can also call tools directly:

```python
# Build a solution
solution_build(
    problem_dir="problems/ab",
    solution_type="sol",
    code="#include <iostream>\nint main() { int a, b; std::cin >> a >> b; std::cout << a + b; }"
)

# Run stress test
stress_test_run(problem_dir="problems/ab", trials=100)
```

Edit `~/.config/claude-code/config.json`:

```json
{
  "mcpServers": {
    "autocode": {
      "command": "autocode-mcp"
    }
  }
}
```

Add to your Cursor settings (Settings → MCP):
```json
{
  "mcp": {
    "servers": {
      "autocode": {
        "command": "autocode-mcp"
      }
    }
  }
}
```

Edit `~/.config/opencode/opencode.json`:
```json
{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "autocode": {
      "type": "local",
      "command": ["autocode-mcp"],
      "enabled": true
    }
  }
}
```

Or use uvx without pre-installation:
```json
{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "autocode": {
      "type": "local",
      "command": ["uvx", "autocode-mcp"],
      "enabled": true
    }
  }
}
```

For development or custom installations:
```json
{
  "mcpServers": {
    "autocode": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/autocode-mcp", "autocode-mcp"]
    }
  }
}
```

After configuration, restart your MCP client and check that the tools are available. You should see 14 tools prefixed with `autocode_`.
AutoCode provides 14 atomic tools organized into functional groups. All tools return a unified format:

```json
{
  "success": true,
  "error": null,
  "data": { ... }
}
```

**File Operations**

| Tool | Description | Key Parameters |
|---|---|---|
| `file_save` | Save content to a file | `path`, `content` |
| `file_read` | Read file content | `path` |
**Solutions**

| Tool | Description | Key Parameters |
|---|---|---|
| `solution_build` | Compile solution code | `problem_dir`, `solution_type` (`"sol"`/`"brute"`), `code` |
| `solution_run` | Execute compiled solution | `problem_dir`, `solution_type`, `input_data`, `timeout` |
**Validators**

| Tool | Description | Key Parameters |
|---|---|---|
| `validator_build` | Build and test validator | `problem_dir`, `code`, `test_cases` |
| `validator_select` | Select best validator from candidates | `candidates` |
**Generators**

| Tool | Description | Key Parameters |
|---|---|---|
| `generator_build` | Compile generator | `problem_dir`, `code` |
| `generator_run` | Generate test inputs | `problem_dir`, `strategies`, `test_count`, `validator_path` |
**Checkers**

| Tool | Description | Key Parameters |
|---|---|---|
| `checker_build` | Build output checker | `problem_dir`, `code`, `test_scenarios` |
**Interactors**

| Tool | Description | Key Parameters |
|---|---|---|
| `interactor_build` | Build interactor for interactive problems | `problem_dir`, `code`, `test_scenarios` |
**Stress Testing**

| Tool | Description | Key Parameters |
|---|---|---|
| `stress_test_run` | Compare sol vs brute outputs | `problem_dir`, `trials`, `n_max`, `timeout` |
**Problem Management**

| Tool | Description | Key Parameters |
|---|---|---|
| `problem_create` | Initialize problem directory | `problem_dir`, `title`, `time_limit`, `memory_limit` |
| `problem_generate_tests` | Generate final test data | `problem_dir`, `test_count` |
| `problem_pack_polygon` | Package for Polygon platform | `problem_dir`, `output_dir` |
This tutorial walks through creating a simple A+B problem using AutoCode tools.
Initialize the problem:

```python
problem_create(
    problem_dir="problems/ab",
    title="A + B",
    time_limit=1000,
    memory_limit=256
)
```

Optimal Solution (`sol.cpp`):
```cpp
#include <iostream>
int main() {
    int a, b;
    std::cin >> a >> b;
    std::cout << a + b << std::endl;
    return 0;
}
```

Brute Force (`brute.cpp`):
```cpp
#include <iostream>
int main() {
    int a, b;
    std::cin >> a >> b;
    // Same as optimal for A+B, but could be slower for complex problems
    std::cout << a + b << std::endl;
    return 0;
}
```

Build both:

```python
solution_build(problem_dir="problems/ab", solution_type="sol", code="...")
solution_build(problem_dir="problems/ab", solution_type="brute", code="...")
```

Input validator (`val.cpp`):

```cpp
#include "testlib.h"

int main(int argc, char* argv[]) {
    registerValidation(argc, argv);
    int a = inf.readInt(-1000, 1000, "a");
    inf.readSpace();
    int b = inf.readInt(-1000, 1000, "b");
    inf.readEoln();
    inf.readEof();
    return 0;
}
```

Build with test cases:
```python
validator_build(
    problem_dir="problems/ab",
    code="...",
    test_cases=[
        {"input": "1 2\n", "expected_valid": True},
        {"input": "0 0\n", "expected_valid": True},
        {"input": "-1000 1000\n", "expected_valid": True},
        {"input": "1001 0\n", "expected_valid": False},  # out of range
        {"input": "1 2 3\n", "expected_valid": False},  # extra number
    ]
)
```

Test generator (`gen.cpp`):

```cpp
#include "testlib.h"
#include <iostream>

int main(int argc, char* argv[]) {
    // registerGen already seeds rnd from the command-line arguments,
    // so no manual setSeed call is needed.
    registerGen(argc, argv, 1);
    int a = rnd.next(-1000, 1000);
    int b = rnd.next(-1000, 1000);
    std::cout << a << " " << b << std::endl;
    return 0;
}
```

Build and run:
```python
generator_build(problem_dir="problems/ab", code="...")
generator_run(
    problem_dir="problems/ab",
    strategies=["random", "extreme"],
    test_count=20,
    validator_path="problems/ab/val.exe"
)
```

Run the stress test:

```python
stress_test_run(
    problem_dir="problems/ab",
    trials=1000,
    n_max=100,
    timeout=30
)
```

Expected output:

```
All 1000 rounds passed
```
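Conceptually, a stress test is just a loop that feeds the same random input to both solutions and compares their outputs. A minimal Python sketch, using in-process functions in place of the compiled `sol` and `brute` binaries (the server actually runs the compiled executables):

```python
import random

def sol(a, b):
    # Stands in for the compiled optimal solution.
    return a + b

def brute(a, b):
    # Stands in for the compiled brute-force solution.
    return a + b

def stress_test(trials=1000, n_max=1000, seed=0):
    """Run both solutions on random inputs; report the first mismatch."""
    rng = random.Random(seed)
    for trial in range(1, trials + 1):
        a, b = rng.randint(-n_max, n_max), rng.randint(-n_max, n_max)
        if sol(a, b) != brute(a, b):
            return {"success": False,
                    "error": f"mismatch on trial {trial}: input '{a} {b}'",
                    "data": None}
    return {"success": True, "error": None, "data": {"trials": trials}}
```

A fixed seed makes failures reproducible, which is why `stress_test_run` reports the failing input rather than just a pass/fail flag.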
Generate final test data:

```python
problem_generate_tests(
    problem_dir="problems/ab",
    test_count=50
)
```

Package for Polygon:

```python
problem_pack_polygon(
    problem_dir="problems/ab",
    output_dir="polygon/ab"
)
```

The overall workflow:

```
┌─────────────────┐
│ Problem Design  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐      ┌──────────────┐
│ Solution Build  │─────►│  Validator   │  Verify input constraints
│  (sol + brute)  │      └──────────────┘
└────────┬────────┘
         │
         ▼
┌─────────────────┐      ┌──────────────┐
│    Generator    │─────►│ Stress Test  │  Compare sol vs brute
│ Multi-strategy  │      └──────────────┘
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│     Checker     │  Verify output correctness
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Polygon Pack   │  Export for platforms
└─────────────────┘
```
- Tool-Only, No LLM — The server provides compilation, execution, and validation; all code generation is done by the client LLM.
- Stateless — Each tool call is independent. State is managed via the `problem_dir` parameter.
- Unified Return Format — All tools return `{success, error, data}` for consistent error handling.
- Safe Execution — Timeout control, memory limits (Linux, via `prlimit`), and temporary-directory isolation.
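Because every tool returns the same `{success, error, data}` envelope, a client can funnel all results through a single helper. A sketch (`unwrap` is a hypothetical client-side function, not part of the server):

```python
def unwrap(result):
    """Return the payload of a successful tool call, or raise on failure."""
    if not result.get("success"):
        raise RuntimeError(result.get("error") or "unknown tool error")
    return result["data"]

# Example envelopes in the unified format
ok = {"success": True, "error": None, "data": {"binary": "problems/ab/sol.exe"}}
bad = {"success": False, "error": "compilation failed", "data": None}
```

This is the payoff of the unified format: error handling lives in one place instead of being repeated per tool.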
| Strategy | Type Code | Purpose |
|---|---|---|
| `tiny` | 1 | Small exhaustive tests (N ≤ 10) |
| `random` | 2 | Random data within constraints |
| `extreme` | 3 | Edge cases: overflow, precision, hash collisions |
| `tle` | 4 | TLE-inducing data for performance testing |
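For the A+B problem, the four strategies might translate into inputs like the following. This is an illustrative Python sketch only; the real generators are C++ programs built on testlib.h, and the exact inputs each strategy produces are the generator author's choice:

```python
import itertools
import random

def generate(strategy, rng, lo=-1000, hi=1000):
    """Emit a list of test-input strings for A+B, one per line, per strategy."""
    if strategy == "tiny":      # type 1: exhaustive over a small sub-range
        return [f"{a} {b}\n" for a, b in itertools.product(range(-2, 3), repeat=2)]
    if strategy == "random":    # type 2: uniform values within the constraints
        return [f"{rng.randint(lo, hi)} {rng.randint(lo, hi)}\n" for _ in range(5)]
    if strategy == "extreme":   # type 3: boundary values
        return [f"{a} {b}\n" for a, b in ((lo, lo), (lo, hi), (hi, lo), (hi, hi))]
    if strategy == "tle":       # type 4: worst-case size (trivial for A+B)
        return [f"{hi} {hi}\n"]
    raise ValueError(f"unknown strategy: {strategy}")
```

For problems with nontrivial structure (graphs, arrays), the `extreme` and `tle` branches are where most of the generator's effort goes.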
```
problems/your-problem/
├── sol.cpp          # Optimal solution
├── brute.cpp        # Brute force (for validation)
├── val.cpp          # Input validator
├── gen.cpp          # Test generator
├── chk.cpp          # Output checker (optional)
├── interactor.cpp   # Interactor (for interactive problems)
├── statements/
│   └── README.md    # Problem statement
├── tests/
│   ├── input/
│   └── output/
└── config.json      # Problem configuration
```
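The `config.json` presumably records the parameters passed to `problem_create`. An illustrative example for the A+B problem (the field names are assumptions based on those parameters, not a documented schema):

```json
{
  "title": "A + B",
  "time_limit": 1000,
  "memory_limit": 256
}
```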
Set up a development environment:

```bash
git clone https://github.com/your-repo/autocode-mcp.git
cd autocode-mcp
uv sync
```

Run the test suite:

```bash
# Run all tests
uv run pytest tests/ -v

# Run with coverage
uv run pytest tests/ --cov=src/autocode_mcp --cov-report=html

# Run specific test file
uv run pytest tests/test_compiler.py -v
```

Code quality checks:

```bash
# Linting
uv run ruff check .

# Type checking
uv run mypy src/

# Format
uv run ruff format .
```

Project layout:

```
autocode-mcp/
├── src/autocode_mcp/
│   ├── tools/               # MCP tool implementations
│   │   ├── base.py          # Tool base class
│   │   ├── solution.py      # Solution tools
│   │   ├── validator.py     # Validator tools
│   │   ├── generator.py     # Generator tools
│   │   ├── checker.py       # Checker tools
│   │   ├── stress_test.py
│   │   └── ...
│   ├── utils/
│   │   ├── compiler.py      # C++ compilation utilities
│   │   └── platform.py      # Platform-specific helpers
│   ├── prompts/             # Workflow prompt templates
│   ├── resources/           # Template resources
│   └── server.py            # MCP server entry point
├── templates/               # C++ templates (testlib.h, etc.)
├── tests/                   # Test suite
└── pyproject.toml
```
To add a new tool:

- Create a new file in `src/autocode_mcp/tools/`
- Inherit from the `Tool` base class
- Implement `name`, `description`, `input_schema`, and `execute()`
- Register it in `server.py`
- Add tests in `tests/`
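The steps above might look like the following. This is a hypothetical sketch: the attribute and method names follow the steps listed, not the actual `Tool` base class in `src/autocode_mcp/tools/base.py`, and `FileDeleteTool` is an invented example tool:

```python
import os
import tempfile

class Tool:
    # Stand-in for the real base class; assumed interface only.
    name: str = ""
    description: str = ""
    input_schema: dict = {}

    def execute(self, **kwargs) -> dict:
        raise NotImplementedError

class FileDeleteTool(Tool):
    name = "file_delete"
    description = "Delete a file inside a problem directory"
    input_schema = {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    }

    def execute(self, path: str) -> dict:
        # Return the unified {success, error, data} envelope in both branches.
        try:
            os.remove(path)
            return {"success": True, "error": None, "data": {"deleted": path}}
        except OSError as exc:
            return {"success": False, "error": str(exc), "data": None}

# Demo: create a scratch file, then delete it through the tool
fd, scratch = tempfile.mkstemp()
os.close(fd)
result = FileDeleteTool().execute(path=scratch)
```

Whatever the real base class looks like, the key contract is that `execute()` never raises: failures are reported through the unified return envelope.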
See CONTRIBUTING.md for guidelines.
See TROUBLESHOOTING.md for common issues and solutions.
MIT License - see LICENSE for details.
- Based on the paper "AutoCode: LLMs as Problem Setters for Competitive Programming"
- Uses testlib.h for competitive programming utilities
- Built on the Model Context Protocol