ROME (and related commands) is driven via the Hydra-based CLI in src/cli.py.
Single intervention:
python -m src.cli +command=rome model=gpt2-medium

Batch evaluation:
python -m src.cli +command=batch-rome model=gpt2-medium

Compute second-moment statistics (required before running ROME on a new model):
python -m src.cli +command=second-moment model=gpt2-medium

The default config is at src/config/config.yaml. Override any value on the command line using Hydra syntax (e.g. model=gpt2-large).
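
The Hydra commands above share one shape (python -m src.cli +command=NAME model=MODEL), and second-moment statistics have to exist before ROME runs on a new model. The following is a minimal sketch of scripting that order in Python. The helper names build_cli_command and rome_pipeline are hypothetical and not part of the repository, and the sketch only builds argument lists; it does not execute anything.

```python
from typing import List


def build_cli_command(command: str, model: str) -> List[str]:
    """Build the argv list for one Hydra CLI call (hypothetical helper)."""
    return ["python", "-m", "src.cli", f"+command={command}", f"model={model}"]


def rome_pipeline(model: str, stats_ready: bool = False) -> List[List[str]]:
    """Return the CLI calls for a ROME run, in execution order.

    Second-moment statistics are required before ROME on a new model,
    so that step is prepended unless the caller says they already exist.
    """
    steps: List[List[str]] = []
    if not stats_ready:
        steps.append(build_cli_command("second-moment", model))
    steps.append(build_cli_command("rome", model))
    return steps


if __name__ == "__main__":
    # Print the commands instead of running them.
    for argv in rome_pipeline("gpt2-large"):
        print(" ".join(argv))
```

To actually run the steps, each list could be passed to subprocess.run(argv, check=True) from the repository root, so that a failing statistics step stops the pipeline before the edit is attempted.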
Alternatively, use the console fallback (no Hydra overhead):
python -m src.cli --console rome --config src/config/config.yaml

Causal trace:
python -m src.cli +command=causal-trace model=gpt2-medium

To inspect the computed noise multiplier without running a full trace:
python -m src.cli +command=compute-multiplier model=gpt2-medium

structural_benchmark.py applies ROME edits across a dataset and evaluates all structural detectors (MSD, blind MSD, spectral, IPR) on the modified weights. Results are written as JSON to analysis_out/.
python structural_benchmark.py \
--model gpt2-large \
--n-tests 30 \
--start-idx 0 \
--output-dir ./analysis_out \
--spectral-top-k 50 \
--trim-first-layers 2 \
--trim-last-layers 2 \
--spectral-neighbor-layers 1

Key arguments:
| Argument | Default | Description |
|---|---|---|
| --model | gpt2-large | Model name (must match a config in src/config/model/) |
| --n-tests | 30 | Number of ROME edits to benchmark |
| --start-idx | 0 | Starting index in the facts dataset |
| --output-dir | ./analysis_out | Directory for JSON result files |
| --spectral-top-k | 50 | Top-K singular values used by the spectral detector |
| --trim-first-layers | 2 | Layers to exclude from the head of the model |
| --trim-last-layers | 2 | Layers to exclude from the tail of the model |
| --n-prompts | auto | Number of ROME prefix prompts (scales with model size if omitted) |

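
For readers who want to drive the benchmark from their own scripts, the table above can be mirrored as an argparse parser. This is an illustrative sketch, not the parser used by structural_benchmark.py: the function names are hypothetical, the value types are inferred from the defaults, and "auto" is modeled as None on the assumption that the script derives the value itself. --spectral-neighbor-layers is left out because the table does not list its default.

```python
import argparse
from typing import Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented arguments (illustrative only)."""
    parser = argparse.ArgumentParser(prog="structural_benchmark.py")
    parser.add_argument("--model", default="gpt2-large",
                        help="Model name (must match a config in src/config/model/)")
    parser.add_argument("--n-tests", type=int, default=30,
                        help="Number of ROME edits to benchmark")
    parser.add_argument("--start-idx", type=int, default=0,
                        help="Starting index in the facts dataset")
    parser.add_argument("--output-dir", default="./analysis_out",
                        help="Directory for JSON result files")
    parser.add_argument("--spectral-top-k", type=int, default=50,
                        help="Top-K singular values used by the spectral detector")
    parser.add_argument("--trim-first-layers", type=int, default=2,
                        help="Layers to exclude from the head of the model")
    parser.add_argument("--trim-last-layers", type=int, default=2,
                        help="Layers to exclude from the tail of the model")
    # "auto" in the table is modeled as None (assumption): the value is
    # left for the script to derive from the model size.
    parser.add_argument("--n-prompts", type=int, default=None,
                        help="Number of ROME prefix prompts")
    return parser


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """Parse argv (or sys.argv when argv is None) with the sketch parser."""
    return build_parser().parse_args(argv)


if __name__ == "__main__":
    # Override one value and keep the documented defaults for the rest.
    print(vars(parse_args(["--n-tests", "5"])))
```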
Detailed documentation for the detection methods is in the docs/ directory:
- docs/structural-docs.md - structural detector metrics (L2 discrepancy, relative discrepancy, directional coherence, MSD, IPR, etc.)
- docs/spectral-docs.md - spectral detector signals and the mathematics behind singular-value z-scores and ratio scores

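
As a small illustration of one metric named in the list, the sketch below computes an inverse participation ratio, assuming that is what IPR denotes here and using the textbook definition IPR(v) = sum(v_i^4) / (sum(v_i^2))^2. The function name and the example inputs are hypothetical. docs/structural-docs.md remains the reference for how the detector defines and applies the metric.

```python
from typing import Sequence


def inverse_participation_ratio(values: Sequence[float]) -> float:
    """Textbook IPR: sum(v^4) / (sum(v^2))^2.

    Ranges from 1/n (mass spread evenly over n entries) to 1.0
    (all mass concentrated in a single entry).
    """
    squares = [v * v for v in values]
    total = sum(squares)
    if total == 0.0:
        raise ValueError("IPR is undefined for an all-zero vector")
    return sum(s * s for s in squares) / (total * total)


if __name__ == "__main__":
    print(inverse_participation_ratio([1.0, 0.0, 0.0, 0.0]))  # concentrated
    print(inverse_participation_ratio([0.5, 0.5, 0.5, 0.5]))  # evenly spread
```

Under this definition, a higher value means the vector is dominated by a few entries.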
| Supported Models | Causal Trace | Weight intervention | Notes |
|---|---|---|---|
| gpt2-medium | ✔️ | ✔️ | |
| gpt2-large | ✔️ | ✔️ | |
| gpt2-xl | ✔️ | ✔️ | |
| gpt-j-6b | ✔️ | ✔️ | |
| qwen3-0.6b | ✔️ | ✔️ | |
| qwen3-1.7b | ✔️ | ✔️ | |
| qwen3-4b | ✔️ | ✔️ | |
| qwen3-8b | ✔️ | ✔️ | |
| granite4-micro | ✔️ | | |

| Error code | Name of the error | Description |
|---|---|---|
| 1 | Help | Help invoked. Typically caused by incorrect script usage. |
| 2 | Resource already exists | Trying to create a resource that already exists. |
| -1 | Unknown | An unknown error. Create a GitHub issue with the reproduction steps. |
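
When calling the tools from a wrapper script, the codes above can be turned back into readable names. The sketch below assumes the table lists process exit codes and that 0 means success (the usual convention, not stated in the table). The names ERROR_CODES and describe_exit_code are hypothetical.

```python
from typing import Dict

# Codes and names copied from the error table.
ERROR_CODES: Dict[int, str] = {
    1: "Help",
    2: "Resource already exists",
    -1: "Unknown",
}


def describe_exit_code(returncode: int) -> str:
    """Map a process return code to the documented error name."""
    if returncode == 0:
        return "Success"  # assumption: 0 is the conventional success code
    # An exit status of -1 is reported as 255 on POSIX systems, because
    # only the low 8 bits of the status reach the parent process.
    if returncode == 255:
        returncode = -1
    return ERROR_CODES.get(returncode, f"Undocumented code {returncode}")


if __name__ == "__main__":
    for code in (0, 1, 2, 255, 7):
        print(code, describe_exit_code(code))
```

A wrapper would typically pass subprocess.run(...).returncode to describe_exit_code and log the result.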