Create benchmark save files for all 6 scenarios #7

@jkbennitt

Context

The paired benchmark (agent vs baseline) requires reproducible initial conditions via named save files. Each scenario needs a RimWorld save at a specific colony state. rle_crashlanded_v1 exists — 5 more needed.

Save Files to Create

All saves start from a Crashlanded scenario, Cassandra Classic, Adventure Story difficulty. Use dev mode to advance time and trigger events.

1. rle_crashlanded_v1 — DONE

  • Day 1, 3 colonists, default Crashlanded start
  • Already created and tested (paired benchmark delta: -0.029)

2. rle_first_winter_v1

  • Advance to day 30 (approaching fall/winter)
  • Colony should have basic shelter, some food stored, a few research projects done
  • Dev mode: Development > Date > Set day or just fast-forward
  • Save when the season is about to change

3. rle_toxic_fallout_v1

  • Advance to day 10, stable colony
  • Trigger toxic fallout: Dev mode > Debug actions > Incidents > Execute incident > ToxicFallout
  • Save immediately after the fallout starts (green overlay visible)
  • Tests: can agents keep colonists indoors, manage food, survive the event?

4. rle_raid_defense_v1

  • Advance to day 15, build some walls/sandbags
  • Trigger a raid: Dev mode > Debug actions > Incidents > Execute incident > RaidEnemy
  • Save right before or as the raid spawns
  • Tests: can DefenseCommander draft colonists and position them?

5. rle_plague_response_v1

  • Advance to day 10, have some medicine stockpiled
  • Trigger plague: Dev mode > Debug actions > Incidents > Execute incident > Plague
  • Save immediately after plague hits (colonists should show "plague" hediff)
  • Tests: can MedicalOfficer triage, assign bed rest, administer medicine?

6. rle_ship_launch_v1

  • Advance to day 60 with significant research progress
  • Complete several research projects via dev mode: Debug actions > Research > Finish project
  • Have 5+ colonists (use Debug actions > Spawn pawn > Colonist)
  • Save with a mid-game colony that has resources + tech to attempt ship building
  • Tests: long-horizon planning, research prioritization, resource management at scale
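
The six scenarios above can also be written down as data, which makes it easier to track which saves are still missing. This is only a sketch: the `Scenario` class, its field names, and the `remaining` helper are hypothetical and not part of the project; the save names, target days, incident names, and colonist counts are copied from the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Scenario:
    save_name: str                       # exact save file name from the list above
    target_day: int                      # colony day to advance to before saving
    incident: Optional[str]              # incident to execute in dev mode, if any
    min_colonists: Optional[int] = None  # set only where the issue states a count
    done: bool = False                   # True once the save exists and is tested


SCENARIOS = [
    Scenario("rle_crashlanded_v1", 1, None, min_colonists=3, done=True),
    Scenario("rle_first_winter_v1", 30, None),
    Scenario("rle_toxic_fallout_v1", 10, "ToxicFallout"),
    Scenario("rle_raid_defense_v1", 15, "RaidEnemy"),
    Scenario("rle_plague_response_v1", 10, "Plague"),
    Scenario("rle_ship_launch_v1", 60, None, min_colonists=5),
]


def remaining(scenarios):
    """Return the scenarios whose save file has not been created yet."""
    return [s for s in scenarios if not s.done]


for s in remaining(SCENARIOS):
    step = f"execute {s.incident}" if s.incident else "no incident"
    print(f"{s.save_name}: day {s.target_day}, {step}")
```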

How to Create Each Save

  1. Load rle_crashlanded_v1 (base save)
  2. Enable dev mode: Options > check "Development mode"
  3. Use dev tools to advance time / trigger events per scenario above
  4. Save as the exact name listed (e.g. rle_first_winter_v1)
  5. Verify: curl http://localhost:8765/api/v1/game/state shows expected tick/colonist count
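
For anyone who prefers Python over curl for step 5, a rough standard-library equivalent is sketched below. It assumes the endpoint returns JSON, which the issue does not state; `state_url` and `fetch_state` are hypothetical helper names, and the response fields are left uninterpreted because their schema is not documented here.

```python
import json
from urllib.request import urlopen

BASE_URL = "http://localhost:8765"  # same host and port as the curl command


def state_url(base_url: str = BASE_URL) -> str:
    """Build the game-state URL used in step 5."""
    return base_url.rstrip("/") + "/api/v1/game/state"


def fetch_state(base_url: str = BASE_URL, timeout: float = 5.0):
    """GET the game state and parse it, assuming a JSON response body."""
    with urlopen(state_url(base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the game running, `print(json.dumps(fetch_state(), indent=2))` dumps the whole response so the tick and colonist count can be checked by eye.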

How to Verify Saves Work with Benchmark

# Test save/load roundtrip (needs the game running with the API on port 8765)
python -c "
import asyncio
from rle.rimapi.client import RimAPIClient

async def test():
    async with RimAPIClient('http://localhost:8765') as c:
        await c.load_game('rle_first_winter_v1')
        # Give the game a few seconds to finish loading before querying state
        await asyncio.sleep(3)
        state = await c.get_game_state()
        print(f'Day: {state.colony.day}, Pop: {state.colony.population}')

asyncio.run(test())
"
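
Once several saves exist, the same roundtrip can be run over all of them in one pass. The sketch below relies only on the client calls already used above (`load_game`, `get_game_state`, `state.colony.day`, `state.colony.population`); `summarize_saves` and `settle_seconds` are hypothetical names, and the client is passed in as a parameter so the function can be exercised without a running game.

```python
import asyncio

SAVE_NAMES = [
    "rle_crashlanded_v1",
    "rle_first_winter_v1",
    "rle_toxic_fallout_v1",
    "rle_raid_defense_v1",
    "rle_plague_response_v1",
    "rle_ship_launch_v1",
]


async def summarize_saves(client, save_names=SAVE_NAMES, settle_seconds=3.0):
    """Load each save in turn and record (day, population) for it."""
    results = {}
    for name in save_names:
        await client.load_game(name)
        # Wait for the load to settle, mirroring the 3 second sleep above
        await asyncio.sleep(settle_seconds)
        state = await client.get_game_state()
        results[name] = (state.colony.day, state.colony.population)
    return results


# Usage against a running game (requires the project's client):
#
# from rle.rimapi.client import RimAPIClient
#
# async def main():
#     async with RimAPIClient("http://localhost:8765") as c:
#         for name, (day, pop) in (await summarize_saves(c)).items():
#             print(f"{name}: day {day}, pop {pop}")
#
# asyncio.run(main())
```

Comparing the printed day and population against the scenario descriptions is left to the reader, since the issue does not say whether `day` is counted from 0 or 1.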

Priority

Do rle_first_winter_v1 and rle_raid_defense_v1 first — these are the scenarios most likely to show agent value (agents managing food/shelter before winter, agents drafting defenders during raids). The others can wait.

@CalebisGross — if you have RimWorld installed you can create some of these too. Just name the saves exactly as listed.
