Context
The paired benchmark (agent vs baseline) requires reproducible initial conditions via named save files. Each scenario needs a RimWorld save at a specific colony state. rle_crashlanded_v1 exists — 5 more needed.
Save Files to Create
All saves start from a Crashlanded scenario, Cassandra Classic, Adventure Story difficulty. Use dev mode to advance time and trigger events.
1. rle_crashlanded_v1 — DONE
- Day 1, 3 colonists, default Crashlanded start
- Already created and tested (paired benchmark delta: -0.029)
2. rle_first_winter_v1
- Advance to day 30 (approaching fall/winter)
- Colony should have basic shelter, some food stored, a few research projects done
- Dev mode: Development > Date > Set day, or just fast-forward
- Save when the season is about to change
3. rle_toxic_fallout_v1
- Advance to day 10, stable colony
- Trigger toxic fallout: Dev mode > Debug actions > Incidents > Execute incident > ToxicFallout
- Save immediately after the fallout starts (green overlay visible)
- Tests: can agents keep colonists indoors, manage food, survive the event?
4. rle_raid_defense_v1
- Advance to day 15, build some walls/sandbags
- Trigger a raid: Dev mode > Debug actions > Incidents > Execute incident > RaidEnemy
- Save right before or as the raid spawns
- Tests: can DefenseCommander draft colonists and position them?
5. rle_plague_response_v1
- Advance to day 10, have some medicine stockpiled
- Trigger plague: Dev mode > Debug actions > Incidents > Execute incident > Plague
- Save immediately after plague hits (colonists should show "plague" hediff)
- Tests: can MedicalOfficer triage, assign bed rest, administer medicine?
6. rle_ship_launch_v1
- Advance to day 60 with significant research progress
- Complete several research projects via dev mode: Debug actions > Research > Finish project
- Have 5+ colonists (use Debug actions > Spawn pawn > Colonist)
- Save with a mid-game colony that has resources + tech to attempt ship building
- Tests: long-horizon planning, research prioritization, resource management at scale
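The six scenarios above can also be written down as a small manifest, so the exact save names, target days, and incidents live in one place. This is only a sketch: `SaveSpec`, `SAVES`, and `pending` are hypothetical names, not part of the existing codebase, and the values are copied from the list above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SaveSpec:
    """One benchmark save, as described in the scenario list (hypothetical type)."""
    name: str                            # exact save file name
    day: int                             # target in-game day
    incident: Optional[str] = None       # incident to trigger via dev mode, if any
    min_colonists: Optional[int] = None  # colonist count called out in the list, if any
    done: bool = False                   # True once the save exists and is tested


SAVES: List[SaveSpec] = [
    SaveSpec("rle_crashlanded_v1", day=1, min_colonists=3, done=True),
    SaveSpec("rle_first_winter_v1", day=30),
    SaveSpec("rle_toxic_fallout_v1", day=10, incident="ToxicFallout"),
    SaveSpec("rle_raid_defense_v1", day=15, incident="RaidEnemy"),
    SaveSpec("rle_plague_response_v1", day=10, incident="Plague"),
    SaveSpec("rle_ship_launch_v1", day=60, min_colonists=5),
]


def pending(saves: List[SaveSpec]) -> List[str]:
    """Return the names of saves that still need to be created."""
    return [s.name for s in saves if not s.done]


if __name__ == "__main__":
    for name in pending(SAVES):
        print(name)
```

Run as a script, this prints the five save names that are still missing.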
How to Create Each Save
- Load rle_crashlanded_v1 (base save)
- Enable dev mode: Options > check "Development mode"
- Use dev tools to advance time / trigger events per scenario above
- Save as the exact name listed (e.g. rle_first_winter_v1)
- Verify: curl http://localhost:8765/api/v1/game/state shows expected tick/colonist count
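For the verify step, it may help to turn the reported tick into an in-game day. The sketch below assumes the tick counts from the start of the game (tick 0 at the beginning of day 1); if the API reports an absolute tick with a start-date offset, the conversion would need adjusting. The response schema is not documented here, so the script only pretty-prints the JSON instead of guessing field names. `tick_to_day`, `tick_matches_day`, and `fetch_state` are hypothetical helpers.

```python
import json
from urllib.request import urlopen

TICKS_PER_DAY = 60_000  # one RimWorld day is 60,000 game ticks
STATE_URL = "http://localhost:8765/api/v1/game/state"  # endpoint from the verify step


def tick_to_day(tick: int) -> int:
    """Convert a game tick to a 1-based day (assumes tick 0 is the start of day 1)."""
    if tick < 0:
        raise ValueError("tick must be non-negative")
    return tick // TICKS_PER_DAY + 1


def tick_matches_day(tick: int, expected_day: int, tolerance_days: int = 1) -> bool:
    """True if the tick falls within tolerance_days of the scenario's target day."""
    return abs(tick_to_day(tick) - expected_day) <= tolerance_days


def fetch_state(url: str = STATE_URL) -> dict:
    """Fetch the game state JSON; the response layout is not assumed here."""
    with urlopen(url, timeout=5) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Print the raw state so the tick and colonist count can be read off by eye.
    print(json.dumps(fetch_state(), indent=2))
```

With the tick read from that output, `tick_matches_day(tick, 30)` gives a quick check for rle_first_winter_v1.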
How to Verify Saves Work with Benchmark
# Test save/load roundtrip
python -c "
import asyncio
from rle.rimapi.client import RimAPIClient

async def test():
    async with RimAPIClient('http://localhost:8765') as c:
        await c.load_game('rle_first_winter_v1')
        # Give the game a few seconds to finish loading the save
        await asyncio.sleep(3)
        state = await c.get_game_state()
        print(f'Day: {state.colony.day}, Pop: {state.colony.population}')

asyncio.run(test())
"
Priority
Do rle_first_winter_v1 and rle_raid_defense_v1 first — these are the scenarios most likely to show agent value (agents managing food/shelter before winter, agents drafting defenders during raids). The others can wait.
@CalebisGross — if you have RimWorld installed you can create some of these too. Just name the saves exactly as listed.