feat: add verifiable fine-tuning step on deterministic training baseline #86

ryoari wants to merge 7 commits into AOSSIE-Org:main
Conversation
Walkthrough

Added a verifiable fine-tuning PoC under `experiments/verifiable_finetuning/`.

Sequence Diagram(s)

sequenceDiagram
participant User as User
participant Base as BaseTraining (train_base.py)
participant Fine as FineTuner (finetune.py)
participant Manifest as Verifier (manifest.py)
participant Utils as Utils (utils.py)
participant Files as Filesystem
User->>Base: run train_base.py
Base->>Utils: set_seed(...) / build model
Base->>Utils: save_deterministic(state_dict)
Utils->>Files: write `base_checkpoint.pt`
Utils-->>Base: return base_checkpoint_hash
Base->>Utils: update_manifest("base", {...})
Utils->>Files: write/update `manifest.json`
Base-->>User: print base checkpoint hash
User->>Fine: run finetune.py
Fine->>Files: read `base_checkpoint.pt`
Fine->>Utils: hash base checkpoint
Fine->>Utils: mutate model deterministically
Fine->>Utils: save_deterministic(finetuned state)
Utils->>Files: write `finetuned_checkpoint.pt`
Utils-->>Fine: return finetuned_checkpoint_hash
Fine->>Utils: update_manifest("finetune", {...})
Fine-->>User: print finetune checkpoint hash and match result
User->>Manifest: run manifest.py
Manifest->>Files: read `manifest.json`
Manifest->>Files: hash `base_checkpoint.pt`
Manifest->>Manifest: compare to manifest["base"]["checkpoint_hash"]
Manifest-->>User: print base match result
Manifest->>Files: hash `finetuned_checkpoint.pt`
Manifest->>Manifest: compare to manifest["finetune"]["checkpoint_hash"]
Manifest-->>User: print finetuned match result
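The final verification pass in the diagram can be sketched with the standard library only. Names such as `hash_file` and the checkpoint/manifest filenames mirror the PoC, but the stand-in checkpoint bytes here are fabricated so the sketch runs anywhere:

```python
import hashlib
import json
import os
import tempfile

def hash_file(path):
    # SHA-256 digest of a file's bytes, as in the PoC's utils.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        h.update(f.read())
    return h.hexdigest()

# Stand-in checkpoint and manifest (the real scripts write these).
workdir = tempfile.mkdtemp()
ckpt = os.path.join(workdir, "base_checkpoint.pt")
with open(ckpt, "wb") as f:
    f.write(b"fake checkpoint bytes")

manifest_path = os.path.join(workdir, "manifest.json")
with open(manifest_path, "w") as f:
    json.dump({"base": {"checkpoint_hash": hash_file(ckpt)}}, f)

# Verification pass: re-hash the checkpoint and compare to the manifest.
with open(manifest_path) as f:
    manifest = json.load(f)
base_match = hash_file(ckpt) == manifest["base"]["checkpoint_hash"]
print("base match:", base_match)  # → base match: True
```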
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks: ✅ 2 passed
Actionable comments posted: 11
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@experiments/verifiable_finetuning/finetune.py`:
- Around line 41-42: The second print unconditionally prints "FINE MATCH" which
can mislabel mismatches; change the logic in finetune.py to compute
hash_file(ft_ckpt) and compare it to ft_hash (use the ft_hash variable and the
hash_file and ft_ckpt calls) and only print the success message when they are
equal, otherwise print a clear failure message (including both values) so
mismatches are not silently labeled as matches.
- Around line 24-27: The code is mutating param.data directly; instead, inside
the existing torch.no_grad() block update parameters with an in-place tensor op
that preserves autograd invariants (e.g., replace "param.data += 0.001" with
"param.add_(0.001)" or another in-place method) while iterating
model.parameters() so you do not bypass autograd via .data.
- Line 21: Update the checkpoint load call used by model.load_state_dict to use
safe unpickling and explicit device mapping: when loading base_ckpt with
torch.load (the argument passed into model.load_state_dict), pass
weights_only=True and map_location="cpu" (or map to the target device) so only
tensor/primitive types are unpickled and tensors are pinned to a known device;
update the call site where torch.load(base_ckpt) is used (the value fed to
model.load_state_dict) accordingly and ensure compatibility with PyTorch
>=2.8.0.
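The conditional-match logic the first comment asks for can be illustrated with stdlib pieces only; `ft_ckpt` and `ft_hash` below are hypothetical stand-ins for the variables in finetune.py:

```python
import hashlib
import os
import tempfile

def hash_file(path):
    # Mirrors the PoC's SHA-256 helper.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        h.update(f.read())
    return h.hexdigest()

# Stand-ins for finetune.py's checkpoint path and recorded hash.
fd, ft_ckpt = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"finetuned weights")
ft_hash = hash_file(ft_ckpt)

# Only claim a match after actually comparing the two hashes.
actual = hash_file(ft_ckpt)
if actual == ft_hash:
    print(f"Finetune hash: {actual} FINE MATCH")
else:
    print(f"Finetune hash MISMATCH: expected {ft_hash}, got {actual}")
os.remove(ft_ckpt)
```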
In `@experiments/verifiable_finetuning/manifest.py`:
- Line 15: The printed verification title has a typo: change the string in the
print statement that currently reads " End to End Plipeline Verification \n" to
" End to End Pipeline Verification \n" so the user-facing message correctly
spells "Pipeline" (locate and update the print call that outputs the
verification title).
- Around line 17-29: The verification currently prints results but doesn't fail
CI or handle missing files/keys; update the block that uses hash_file and
manifest (references: hash_file, manifest, "base_checkpoint.pt",
"finetuned_checkpoint.pt") to be fail-fast: validate presence of
manifest["base"]["checkpoint_hash"] and manifest["finetune"]["checkpoint_hash"],
catch file-not-found/key errors and log a clear error, compute both hashes, and
if either actual != expected call sys.exit(1) (or raise SystemExit) after
printing the mismatch so the process returns non-zero; ensure any unexpected
exceptions are surfaced (or logged) rather than swallowed so CI fails loudly.
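A self-contained sketch of that fail-fast shape (stage names and filenames mirror the PoC; the checkpoints and manifest are fabricated so the example runs anywhere):

```python
import hashlib
import json
import os
import tempfile

def hash_file(path):
    h = hashlib.sha256()
    with open(path, "rb") as f:
        h.update(f.read())
    return h.hexdigest()

def verify(workdir, manifest):
    # Fail fast on missing manifest entries or checkpoint files,
    # and return non-zero on any hash mismatch so CI fails loudly.
    for stage, ckpt in [("base", "base_checkpoint.pt"),
                        ("finetune", "finetuned_checkpoint.pt")]:
        if "checkpoint_hash" not in manifest.get(stage, {}):
            print(f"Missing manifest entry: {stage}.checkpoint_hash")
            return 1
        path = os.path.join(workdir, ckpt)
        if not os.path.exists(path):
            print(f"Missing checkpoint file: {ckpt}")
            return 1
        if hash_file(path) != manifest[stage]["checkpoint_hash"]:
            print(f"{stage}: hash mismatch")
            return 1
    return 0

# Fabricated checkpoints plus a matching manifest for demonstration.
workdir = tempfile.mkdtemp()
manifest = {}
for stage, name in [("base", "base_checkpoint.pt"),
                    ("finetune", "finetuned_checkpoint.pt")]:
    p = os.path.join(workdir, name)
    with open(p, "wb") as f:
        f.write(stage.encode())
    manifest[stage] = {"checkpoint_hash": hash_file(p)}

rc = verify(workdir, manifest)
print("exit code:", rc)  # → exit code: 0
```

In the real script, `sys.exit(verify())` in the `__main__` block is what turns this return value into a process exit code CI can see.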
In `@experiments/verifiable_finetuning/README.md`:
- Around line 3-10: The README headings and code fence need Markdown lint fixes:
update the "What this proves" and "How to run" headings to standard Markdown
(remove stray quotes around phrases and ensure a blank line above each heading),
add a blank line before and after the fenced code block, and specify the
code-fence language (bash) so the block around the three commands (python
train_base.py, python finetune.py, python manifest.py) is formatted correctly;
check the heading text for extra trailing/leading spaces and remove them to
satisfy MD001/MD022/MD031.
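A lint-clean shape for that README section might look like the fragment below; the heading wording and the one-line summary are assumptions, while the three commands come from the comment above:

````markdown
## What this proves

Re-running the pipeline with the same seed reproduces identical checkpoint hashes.

## How to run

```bash
python train_base.py
python finetune.py
python manifest.py
```
````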
In `@experiments/verifiable_finetuning/train_base.py`:
- Around line 29-30: The second print unconditionally appends "CORRECT MATCH"
even though no comparison is made; update the verification to compute
hash_file(ckpt_path), compare it to ckpt_hash, and print a clear message
reflecting the result (e.g., "CORRECT MATCH" only if hash_file(ckpt_path) ==
ckpt_hash, otherwise "MISMATCH" with both values). Locate the prints around the
variables ckpt_hash and ckpt_path and the hash_file(…) call in train_base.py and
replace the unconditional message with this conditional comparison and concise
outcome.
- Around line 25-27: The manifest currently stores a static label under
"dataset_hash" instead of a cryptographic fingerprint; compute a real SHA-256
(or similar) digest of the generated dataset tensors/bytes (the object created
earlier in the script that holds the synthetic data) before calling
update_manifest("base", ...), convert to a stable hex string, and pass that hex
digest as the dataset_hash value to update_manifest so the manifest
cryptographically binds the exact dataset used (use the same deterministic
ordering/serialization of the tensors when hashing to ensure repeatability given
seed 99).
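A stdlib-only illustration of binding a dataset to a stable digest; the nested list stands in for the synthetic tensors (the real script would hash the tensor bytes instead), and `struct.pack("<d", ...)` gives a fixed little-endian float64 encoding so the digest is reproducible across runs:

```python
import hashlib
import struct

# Assumed stand-in data for the seed-99 synthetic 16x10 dataset.
dataset = [[0.1 * i + 0.01 * j for j in range(10)] for i in range(16)]

hasher = hashlib.sha256()
for row in dataset:
    for value in row:
        # Stable byte encoding -> stable digest given identical values/order.
        hasher.update(struct.pack("<d", value))
dataset_hash = hasher.hexdigest()
print(dataset_hash)
```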
In `@experiments/verifiable_finetuning/utils.py`:
- Around line 1-6: The import block in utils.py mixes stdlib and third-party
imports; reorder and group them so stdlib imports (hashlib, json, os, random)
appear first, followed by a blank line, then third-party imports (numpy as np,
torch), keeping names as in the diff so Ruff's import grouping rule passes.
- Around line 8-9: The module currently calls os.chdir(SCRIPT_DIR) at import
time which mutates global process state; remove that call and instead use
SCRIPT_DIR for explicit path construction where needed (e.g., join SCRIPT_DIR
with filenames in callers). Add a small helper like get_script_dir() or expose
SCRIPT_DIR constant and update call sites to use os.path.join(SCRIPT_DIR, ...)
rather than relying on changing the working directory; ensure no other code in
this module or tests depends on cwd mutation before removing
os.chdir(SCRIPT_DIR).
- Around line 11-16: In set_seed, enforce strict deterministic behavior by
removing the warn_only=True argument from the torch.use_deterministic_algorithms
call (i.e., call torch.use_deterministic_algorithms(True) so nondeterministic
ops raise errors); update the set_seed function to call
torch.use_deterministic_algorithms(True) to ensure strict determinism for
reproducible, verifiable training.
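The os.chdir fix above amounts to resolving paths against the module's own directory instead of mutating the process working directory; a minimal sketch (the helper name `script_path` is an assumption, and `__file__` is assumed available because the module is loaded from a file):

```python
import os

# Directory containing this file; computed once, no cwd mutation.
SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))

def script_path(filename):
    """Resolve a sibling file of this script regardless of the caller's cwd."""
    return os.path.join(SCRIPT_DIR, filename)

print(script_path("manifest.json"))
```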
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 0ff47ffb-3963-426b-9814-e4c2ab1acd0e
⛔ Files ignored due to path filters (1)

`uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (8)

- experiments/verifiable_finetuning/.gitignore
- experiments/verifiable_finetuning/README.md
- experiments/verifiable_finetuning/finetune.py
- experiments/verifiable_finetuning/manifest.py
- experiments/verifiable_finetuning/train_base.py
- experiments/verifiable_finetuning/utils.py
- openverifiablellm/verify.py
- pyproject.toml
```python
with torch.no_grad():
    for param in model.parameters():
        param.data += 0.001
```
🧹 Nitpick | 🔵 Trivial
Avoid param.data mutation; use in-place ops under no_grad.
param.data bypasses autograd internals in a non-idiomatic way.
✅ Idiomatic update

```diff
 with torch.no_grad():
     for param in model.parameters():
-        param.data += 0.001
+        param.add_(0.001)
```
```python
with open("manifest.json", "r") as f:
    manifest = json.load(f)

print(" End to End Plipeline Verification \n")
```
There was a problem hiding this comment.
Fix user-facing typo in the verification title.
“Plipeline” should be “Pipeline”.
```python
base_actual = hash_file("base_checkpoint.pt")
base_expected = manifest["base"]["checkpoint_hash"]
base_match = "BINGO" if base_actual == base_expected else "NUH-UH"
print(f"Base expected: {base_expected}")
print(f"Base actual : {base_actual} {base_match}\n")

# Verify Finetune
ft_actual = hash_file("finetuned_checkpoint.pt")
ft_expected = manifest["finetune"]["checkpoint_hash"]
ft_match = "BINGO" if ft_actual == ft_expected else "NUH-UH"
print(f"Finetune expected: {ft_expected}")
print(f"Finetune actual : {ft_actual} {ft_match}")
```
Make verification fail-fast and CI-friendly.
The current flow can raise unhandled errors on missing files/keys and still exits successfully on hash mismatches. Verification should return non-zero when checks fail.
✅ Robust verification pattern

```diff
+import sys
 @@
 def verify():
 @@
-    base_actual = hash_file("base_checkpoint.pt")
-    base_expected = manifest["base"]["checkpoint_hash"]
+    required = [("base", "base_checkpoint.pt"), ("finetune", "finetuned_checkpoint.pt")]
+    for stage, ckpt in required:
+        if stage not in manifest or "checkpoint_hash" not in manifest[stage]:
+            print(f"Missing manifest entry: {stage}.checkpoint_hash")
+            return 1
+        if not os.path.exists(ckpt):
+            print(f"Missing checkpoint file: {ckpt}")
+            return 1
+
+    base_actual = hash_file("base_checkpoint.pt")
+    base_expected = manifest["base"]["checkpoint_hash"]
 @@
-    print(f"Finetune actual : {ft_actual} {ft_match}")
+    print(f"Finetune actual : {ft_actual} {ft_match}")
+    return 0 if (base_actual == base_expected and ft_actual == ft_expected) else 1
 @@
 if __name__ == "__main__":
-    verify()
+    sys.exit(verify())
```
we don't need such robust verification, for such a simple experiment
@ryoari, understood! Given this is a minimal PoC to demonstrate reproducibility rather than a production-grade pipeline, keeping the verification simple and straightforward makes sense. I'll defer to your judgment here.
(╯°▽°)╯
```python
update_manifest(
    "base", {"seed": 99, "dataset_hash": "synthetic_16x10_seed99", "checkpoint_hash": ckpt_hash}
)
```
dataset_hash is not actually a hash.
Line 26 stores a static label, so dataset integrity is not cryptographically bound in the manifest.
✅ Example fix (hash the generated tensors)

```diff
+dataset_bytes = X.detach().cpu().numpy().tobytes() + y.detach().cpu().numpy().tobytes()
+dataset_hash = __import__("hashlib").sha256(dataset_bytes).hexdigest()
 update_manifest(
-    "base", {"seed": 99, "dataset_hash": "synthetic_16x10_seed99", "checkpoint_hash": ckpt_hash}
+    "base", {"seed": 99, "dataset_hash": dataset_hash, "checkpoint_hash": ckpt_hash}
 )
```
```python
print(f"Base run hash: {ckpt_hash}")
print(f"Base run hash (again): {hash_file(ckpt_path)} CORRECT MATCH")
```
Avoid unconditional “match” output.
Line 30 always prints CORRECT MATCH without comparing values, which can hide real mismatches.
✅ Verification output fix

```diff
 print(f"Base run hash: {ckpt_hash}")
-print(f"Base run hash (again): {hash_file(ckpt_path)} CORRECT MATCH")
+actual_hash = hash_file(ckpt_path)
+status = "CORRECT MATCH" if actual_hash == ckpt_hash else "MISMATCH"
+print(f"Base run hash (again): {actual_hash} {status}")
```
```python
def set_seed(seed=99):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.use_deterministic_algorithms(True, warn_only=True)
    torch.set_default_device("cpu")
```
🧩 Analysis chain
🌐 Web query: In PyTorch, does torch.use_deterministic_algorithms(True, warn_only=True) allow nondeterministic operations to run with warnings instead of raising an error?

💡 Result: Yes, torch.use_deterministic_algorithms(True, warn_only=True) allows nondeterministic operations to run while issuing warnings instead of raising a RuntimeError.

Citations:

- 1: Set warning or error with use_deterministic_algorithms (pytorch/pytorch#64883)
- 2: https://docs.pytorch.org/docs/2.0/generated/torch.use_deterministic_algorithms.html
🏁 Scripts executed (repository: AOSSIE-Org/OpenVerifiableLLM):

```shell
# Find where set_seed is called
rg "set_seed" experiments/verifiable_finetuning/ -n
# Look for any nondeterministic operations or related comments
rg -i "nondeterministic|deterministic|warn" experiments/verifiable_finetuning/ -n
# Check usage of warn_only / use_deterministic
rg -i "warn_only|use_deterministic" -A 3 -B 3 experiments/verifiable_finetuning/
# Understand the purpose and context of the verification code
fd "\.py$" experiments/verifiable_finetuning/ | head -20 | xargs wc -l | tail -1
# Check imports and overall file structure
head -30 experiments/verifiable_finetuning/utils.py
# Check git history around this parameter
git log -p --all -S "warn_only" -- experiments/verifiable_finetuning/utils.py
# Check for any exception handling around set_seed calls
rg "try:|except:|set_seed" experiments/verifiable_finetuning/ -B 2 -A 2
# Look for any TODOs or comments in the file
grep -n "TODO\|FIXME\|NOTE\|XXX\|#" experiments/verifiable_finetuning/utils.py
```
Remove warn_only=True to enforce strict determinism.
The warn_only=True parameter allows nondeterministic operations to proceed with warnings instead of raising errors. Given that this project's core objective is verifiable and reproducible training with cryptographic verification of model states, nondeterministic operations silently degrade the verification guarantee.
Suggested fix

```diff
-    torch.use_deterministic_algorithms(True, warn_only=True)
+    torch.use_deterministic_algorithms(True)
```
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@experiments/verifiable_finetuning/utils.py`:
- Around line 21-25: The hash_file function currently reads entire file into
memory via f.read(), which will OOM on large checkpoints; modify hash_file to
read the file in fixed-size chunks (e.g., 64KB or 1MB) in a loop and call
hasher.update(chunk) for each chunk until EOF, keeping the rest of the function
(hasher = hashlib.sha256(), open(file_path, "rb"), return hasher.hexdigest())
the same so it streams and hashes arbitrarily large files without loading them
entirely.
- Around line 33-42: The update_manifest function currently assumes
manifest_path contains valid JSON; if json.load raises json.JSONDecodeError the
function will crash with a confusing error. Wrap the open+json.load block in a
try/except catching json.JSONDecodeError (and optionally IOError), and on decode
error either reinitialize manifest = {} or raise a clearer exception mentioning
manifest_path and the original error; ensure you still write the new
manifest[stage] = data and that the file is overwritten when recovering.
Reference: function update_manifest, variable manifest_path, exception
json.JSONDecodeError.
- Around line 28-30: The save_deterministic function currently uses the internal
parameter _use_new_zipfile_serialization which is undocumented and may be
removed; update save_deterministic to guard or document this: check
torch.__version__ (or use packaging.version.parse) and only pass
_use_new_zipfile_serialization when running a PyTorch version known to support
it (e.g., 2.11), otherwise fall back to calling torch.save without that kwarg or
use an alternate deterministic serialization approach, and add a short
comment/docstring near save_deterministic describing the version requirement and
why the branch exists; ensure hash_file is still called on the saved path in
both branches.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 696fb768-980c-43f7-97ad-a207582bcc3c
📒 Files selected for processing (1)
experiments/verifiable_finetuning/utils.py
```python
def hash_file(file_path):
    hasher = hashlib.sha256()
    with open(file_path, "rb") as f:
        hasher.update(f.read())
    return hasher.hexdigest()
```
🧹 Nitpick | 🔵 Trivial
Use chunked reading for large checkpoint files.
f.read() loads the entire file into memory. This works for the tiny model POC but will cause memory issues with real-world checkpoints (often GBs).
♻️ Chunked hashing for large files

```diff
 def hash_file(file_path):
     hasher = hashlib.sha256()
     with open(file_path, "rb") as f:
-        hasher.update(f.read())
+        for chunk in iter(lambda: f.read(8192), b""):
+            hasher.update(chunk)
     return hasher.hexdigest()
```
```python
def save_deterministic(state_dict, path):
    torch.save(state_dict, path, _use_new_zipfile_serialization=False)
    return hash_file(path)
```
🧩 Analysis chain
🌐 Web query:
PyTorch torch.save _use_new_zipfile_serialization parameter deprecated removed
💡 Result:
The _use_new_zipfile_serialization parameter in torch.save is an internal (underscore-prefixed) option introduced around PyTorch 1.6 to control whether to use the new zipfile-based serialization format (True, default since 1.6) or the legacy pickle-based format (False). It is not officially documented in the torch.save API reference pages across versions including 2.11. No evidence found in official documentation, source code snippets, release notes (up to 2.11.0 as of 2026-03-28), or GitHub issues/PRs indicating it has been formally deprecated or removed. The parameter still appears in serialization source code comments referencing PyTorch 1.6 behavior and is mentioned in legacy docs (e.g., 1.6.0 notes). Current serialization docs (stable/main) describe the zip format as default since 1.6 without mentioning the parameter, suggesting it remains supported for backward compatibility to produce legacy files but is unneeded for default behavior. As an undocumented internal parameter, its presence is not guaranteed across versions, but it has not been removed up to the latest PyTorch 2.11.
Citations:
- 1: https://pytorch.org/docs/stable/_modules/torch/serialization.html
- 2: https://docs.pytorch.org/docs/stable/generated/torch.save.html
- 3: https://docs.pytorch.org/docs/1.6.0/notes/serialization.html
- 4: https://docs.pytorch.org/docs/stable/notes/serialization.html
- 5: https://docs.pytorch.org/docs/2.8/_sources/notes/serialization.rst.txt
- 6: https://pytorch.org/docs/stable/notes/serialization.html
- 7: https://docs.pytorch.org/docs/2.0/notes/serialization.html
- 8: https://docs.pytorch.org/docs/2.5/_modules/torch/serialization.html
- 9: https://pytorch.org/docs/stable/generated/torch.save.html
- 10: http://docs.pytorch.org/docs/main/notes/serialization.html
Address undocumented internal parameter for deterministic serialization.
The _use_new_zipfile_serialization parameter is undocumented and internal (underscore-prefixed), with no official stability guarantees. While currently supported in PyTorch 2.11, relying on undocumented parameters creates fragility—future versions may remove it or change its behavior without notice, potentially breaking hash reproducibility across environments. Consider documenting the PyTorch version constraints or adding a version check to ensure consistent serialization behavior, or evaluate alternative approaches to deterministic saving if the parameter is not available.
```python
def update_manifest(stage, data, manifest_path="manifest.json"):
    manifest = {}
    if os.path.exists(manifest_path):
        with open(manifest_path, "r") as f:
            manifest = json.load(f)

    manifest[stage] = data

    with open(manifest_path, "w") as f:
        json.dump(manifest, f, indent=4)
```
🧹 Nitpick | 🔵 Trivial
Consider adding error handling for malformed JSON.
If manifest.json exists but contains invalid JSON, json.load() will raise JSONDecodeError with a potentially confusing error. For robustness, consider catching this and either re-initializing or providing a clearer error message.
♻️ Optional: Add error handling

```diff
 def update_manifest(stage, data, manifest_path="manifest.json"):
     manifest = {}
     if os.path.exists(manifest_path):
-        with open(manifest_path, "r") as f:
-            manifest = json.load(f)
+        try:
+            with open(manifest_path, "r") as f:
+                manifest = json.load(f)
+        except json.JSONDecodeError:
+            print(f"Warning: {manifest_path} is malformed, reinitializing.")
+            manifest = {}
     manifest[stage] = data
     with open(manifest_path, "w") as f:
         json.dump(manifest, f, indent=4)
```
…e project metadata to use contributor-based authorship
♻️ Duplicate comments (7)
experiments/verifiable_finetuning/utils.py (3)
15-19:⚠️ Potential issue | 🟠 MajorEnforce strict determinism in
set_seed.Line 18 uses
warn_only=True, which allows nondeterministic ops to proceed and weakens reproducibility guarantees for this verification flow.Suggested fix
def set_seed(seed=42): """Locks in deterministic behavior.""" torch.manual_seed(seed) - torch.use_deterministic_algorithms(True, warn_only=True) + torch.use_deterministic_algorithms(True)In current PyTorch docs, does torch.use_deterministic_algorithms(True, warn_only=True) allow nondeterministic operations to continue with warnings instead of raising?🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@experiments/verifiable_finetuning/utils.py` around lines 15 - 19, The set_seed function currently calls torch.use_deterministic_algorithms(True, warn_only=True) which only warns on nondeterministic ops; change this to enforce strict determinism by calling torch.use_deterministic_algorithms(True, warn_only=False) so PyTorch raises on nondeterministic ops, and also make the function fully deterministic by seeding CUDA (torch.cuda.manual_seed_all) and setting backend flags (torch.backends.cudnn.deterministic = True and torch.backends.cudnn.benchmark = False) in the set_seed implementation.
29-32:⚠️ Potential issue | 🟡 MinorGuard usage of internal
torch.saveparameter.Line 31 depends on
_use_new_zipfile_serialization, an internal/underscore API with no compatibility guarantee across versions.Suggested fix
def save_deterministic(state_dict, path): """Saves state dict without zip metadata to ensure identical hashes.""" - torch.save(state_dict, path, _use_new_zipfile_serialization=False) + try: + torch.save(state_dict, path, _use_new_zipfile_serialization=False) + except TypeError as e: + raise RuntimeError( + "This reproducibility flow requires torch.save support for " + "_use_new_zipfile_serialization=False." + ) from e return hash_file(path)Is _use_new_zipfile_serialization an officially documented/stable torch.save parameter in latest PyTorch, and is its behavior guaranteed across releases?🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@experiments/verifiable_finetuning/utils.py` around lines 29 - 32, The use of the internal kwarg _use_new_zipfile_serialization in save_deterministic is unsafe across PyTorch versions; update save_deterministic to detect support for that parameter (e.g., inspect.signature(torch.save) or checking torch.__version__ / attribute presence) and call torch.save with _use_new_zipfile_serialization only when supported, otherwise fall back to calling torch.save without that kwarg; ensure the function still returns hash_file(path) and keep the function name save_deterministic unchanged.
21-26: 🧹 Nitpick | 🔵 Trivial

Hash files in chunks to avoid high memory usage.

Line 25 reads the entire checkpoint into memory; this will not scale once checkpoints grow.

Suggested fix:

```diff
 def hash_file(filepath):
     """Generates a SHA-256 hash of a file."""
     hasher = hashlib.sha256()
     with open(filepath, "rb") as f:
-        hasher.update(f.read())
+        for chunk in iter(lambda: f.read(1024 * 1024), b""):
+            hasher.update(chunk)
     return hasher.hexdigest()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@experiments/verifiable_finetuning/utils.py` around lines 21 - 26, The hash_file function currently reads the entire file into memory which will OOM for large checkpoints; change hash_file to stream the file in fixed-size chunks (e.g., 64KB) and call hashlib.sha256().update on each chunk in a loop while reading until EOF, then return hexdigest; keep the file opened in binary mode and preserve the existing function name hash_file to locate and replace the current implementation.

experiments/verifiable_finetuning/train_base.py (2)
35-36: ⚠️ Potential issue | 🟠 Major

Do not print an unconditional “CORRECT MATCH”.

Line 36 can report success even when the hashes differ.

Suggested fix:

```diff
     print(f"Base run hash: {ckpt_hash}")
-    print(f"Base run hash (again): {hash_file(ckpt_path)} CORRECT MATCH")
+    actual_hash = hash_file(ckpt_path)
+    status = "CORRECT MATCH" if actual_hash == ckpt_hash else "MISMATCH"
+    print(f"Base run hash (again): {actual_hash} {status}")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@experiments/verifiable_finetuning/train_base.py` around lines 35 - 36, Remove the hardcoded "CORRECT MATCH" message and instead compare ckpt_hash to hash_file(ckpt_path): call hash_file(ckpt_path), compare the result to ckpt_hash, and print a single informative message that includes both hashes and a conditional "MATCH" or "MISMATCH" comment; update the two print statements around ckpt_hash, hash_file, and ckpt_path (the variables referenced) so the success text is only shown when the equality check passes.
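The conditional re-hash check suggested above combines naturally with the chunked `hash_file` from the utils.py comment. A self-contained stdlib sketch of the combined pattern (the temp-file setup is illustrative, not the PR's actual flow):

```python
import hashlib
import os
import tempfile

def hash_file(filepath, chunk_size=1024 * 1024):
    """Stream the file in chunks so large checkpoints never load fully into memory."""
    hasher = hashlib.sha256()
    with open(filepath, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()

# Simulate writing a checkpoint, then verify the re-hash actually matches
# instead of printing an unconditional success message.
with tempfile.TemporaryDirectory() as tmp:
    ckpt_path = os.path.join(tmp, "base_checkpoint.pt")
    with open(ckpt_path, "wb") as f:
        f.write(os.urandom(3 * 1024 * 1024))  # ~3 MiB of fake weights
    ckpt_hash = hash_file(ckpt_path)
    actual_hash = hash_file(ckpt_path)
    status = "CORRECT MATCH" if actual_hash == ckpt_hash else "MISMATCH"
    print(f"Base run hash (again): {actual_hash} {status}")
```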
31-33: ⚠️ Potential issue | 🟠 Major

`dataset_hash` should be an actual digest, not a label.

Line 32 stores a static string, so the manifest does not cryptographically bind the exact dataset used in this run.

Suggested fix:

```diff
+import hashlib
 import torch
 import torch.nn as nn
 from utils import get_path, hash_file, save_deterministic, set_seed, update_manifest
@@
-    update_manifest(
-        "base", {"seed": 99, "dataset_hash": "synthetic_16x10_seed99", "checkpoint_hash": ckpt_hash}
-    )
+    dataset_bytes = X.detach().cpu().numpy().tobytes() + y.detach().cpu().numpy().tobytes()
+    dataset_hash = hashlib.sha256(dataset_bytes).hexdigest()
+    update_manifest("base", {"seed": 99, "dataset_hash": dataset_hash, "checkpoint_hash": ckpt_hash})
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@experiments/verifiable_finetuning/train_base.py` around lines 31 - 33, The manifest is being written with a static label instead of a cryptographic digest; replace the hardcoded "synthetic_16x10_seed99" in the update_manifest call with a real digest computed from the dataset used in this run (e.g., compute a SHA-256 of the dataset file(s) or a canonical representation) and pass that digest as the dataset_hash argument to update_manifest; locate the call to update_manifest in train_base.py and use the computed_digest variable (or a new helper like compute_dataset_hash(dataset_path)) so update_manifest("base", {"seed": 99, "dataset_hash": computed_digest, "checkpoint_hash": ckpt_hash}) records the actual dataset digest.

experiments/verifiable_finetuning/finetune.py (1)
24-27: 🧹 Nitpick | 🔵 Trivial

Avoid `.data` mutation in parameter updates.

Line 26 should use an in-place op directly on the tensor under `no_grad()` instead of mutating `.data`.

Suggested fix:

```diff
 with torch.no_grad():
     for param in model.parameters():
-        param.data += 0.001
+        param.add_(0.001)
```

In current PyTorch guidance, is direct `.data` mutation discouraged compared with in-place ops (e.g., `param.add_`) inside `torch.no_grad()`?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@experiments/verifiable_finetuning/finetune.py` around lines 24 - 27, The code mutates parameter tensors via param.data inside the torch.no_grad() block; replace the .data mutation with an in-place tensor operation (e.g., call param.add_ or param.mul_ as appropriate) so updates occur safely without using .data — locate the loop iterating model.parameters() in the torch.no_grad() context and change the param.data += 0.001 line to an in-place tensor op like param.add_(0.001).

experiments/verifiable_finetuning/manifest.py (1)
15-15: ⚠️ Potential issue | 🟡 Minor

Fix typo in verification banner.

Line 15 prints “Plipeline”; this should be “Pipeline”.

Suggested fix:

```diff
-    print(" End to End Plipeline Verification \n")
+    print(" End to End Pipeline Verification \n")
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@experiments/verifiable_finetuning/manifest.py` at line 15, The printed banner string contains a typo: replace the text " End to End Plipeline Verification \n" in the print statement with " End to End Pipeline Verification \n" (i.e., fix "Plipeline" → "Pipeline") so the verification banner reads correctly; locate the print call that emits the banner and update the literal accordingly.
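Taken together, the manifest flow these comments review reduces to: record a digest at write time, recompute it at verify time, compare. A minimal stdlib sketch of that round trip (file paths and manifest keys are illustrative, not the PR's exact schema):

```python
import hashlib
import json
import os
import tempfile

def sha256_of(path):
    """Chunked SHA-256 of a file, as the review recommends for hash_file."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

with tempfile.TemporaryDirectory() as tmp:
    ckpt = os.path.join(tmp, "base_checkpoint.pt")
    manifest_path = os.path.join(tmp, "manifest.json")
    with open(ckpt, "wb") as f:
        f.write(b"fake checkpoint bytes")

    # Write phase: bind the artifact to the manifest by digest.
    manifest = {"base": {"seed": 99, "checkpoint_hash": sha256_of(ckpt)}}
    with open(manifest_path, "w") as f:
        json.dump(manifest, f)

    # Verify phase: recompute the digest and compare against the record.
    with open(manifest_path) as f:
        recorded = json.load(f)["base"]["checkpoint_hash"]
    verified = sha256_of(ckpt) == recorded
    print("base:", "MATCH" if verified else "MISMATCH")
```

Any post-hoc change to the checkpoint bytes would flip the comparison to MISMATCH, which is the verification property the PoC is after.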
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@experiments/verifiable_finetuning/finetune.py`:
- Around line 24-27: The code mutates parameter tensors via param.data inside
the torch.no_grad() block; replace the .data mutation with an in-place tensor
operation (e.g., call param.add_ or param.mul_ as appropriate) so updates occur
safely without using .data — locate the loop iterating model.parameters() in the
torch.no_grad() context and change the param.data += 0.001 line to an in-place
tensor op like param.add_(0.001).
In `@experiments/verifiable_finetuning/manifest.py`:
- Line 15: The printed banner string contains a typo: replace the text " End to
End Plipeline Verification \n" in the print statement with " End to End Pipeline
Verification \n" (i.e., fix "Plipeline" → "Pipeline") so the verification banner
reads correctly; locate the print call that emits the banner and update the
literal accordingly.
In `@experiments/verifiable_finetuning/train_base.py`:
- Around line 35-36: Remove the hardcoded "CORRECT MATCH" message and instead
compare ckpt_hash to hash_file(ckpt_path): call hash_file(ckpt_path), compare
the result to ckpt_hash, and print a single informative message that includes
both hashes and a conditional "MATCH" or "MISMATCH" comment; update the two
print statements around ckpt_hash, hash_file, and ckpt_path (the variables
referenced) so the success text is only shown when the equality check passes.
- Around line 31-33: The manifest is being written with a static label instead
of a cryptographic digest; replace the hardcoded "synthetic_16x10_seed99" in the
update_manifest call with a real digest computed from the dataset used in this
run (e.g., compute a SHA-256 of the dataset file(s) or a canonical
representation) and pass that digest as the dataset_hash argument to
update_manifest; locate the call to update_manifest in train_base.py and use the
computed_digest variable (or a new helper like
compute_dataset_hash(dataset_path)) so update_manifest("base", {"seed": 99,
"dataset_hash": computed_digest, "checkpoint_hash": ckpt_hash}) records the
actual dataset digest.
In `@experiments/verifiable_finetuning/utils.py`:
- Around line 15-19: The set_seed function currently calls
torch.use_deterministic_algorithms(True, warn_only=True) which only warns on
nondeterministic ops; change this to enforce strict determinism by calling
torch.use_deterministic_algorithms(True, warn_only=False) so PyTorch raises on
nondeterministic ops, and also make the function fully deterministic by seeding
CUDA (torch.cuda.manual_seed_all) and setting backend flags
(torch.backends.cudnn.deterministic = True and torch.backends.cudnn.benchmark =
False) in the set_seed implementation.
- Around line 29-32: The use of the internal kwarg
_use_new_zipfile_serialization in save_deterministic is unsafe across PyTorch
versions; update save_deterministic to detect support for that parameter (e.g.,
inspect.signature(torch.save) or checking torch.__version__ / attribute
presence) and call torch.save with _use_new_zipfile_serialization only when
supported, otherwise fall back to calling torch.save without that kwarg; ensure
the function still returns hash_file(path) and keep the function name
save_deterministic unchanged.
- Around line 21-26: The hash_file function currently reads the entire file into
memory which will OOM for large checkpoints; change hash_file to stream the file
in fixed-size chunks (e.g., 64KB) and call hashlib.sha256().update on each chunk
in a loop while reading until EOF, then return hexdigest; keep the file opened
in binary mode and preserve the existing function name hash_file to locate and
replace the current implementation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 429788a8-597b-4905-92ba-576b011cbba3
📒 Files selected for processing (4)
experiments/verifiable_finetuning/finetune.py
experiments/verifiable_finetuning/manifest.py
experiments/verifiable_finetuning/train_base.py
experiments/verifiable_finetuning/utils.py
Proof of Concept: Verifiable Fine-tuning
Summary
Adds a minimal verifiable fine-tuning step on top of a deterministic training baseline.
What this PR does
Why this matters
This builds on earlier work validating determinism and checkpoint reproducibility, and extends it into a minimal verifiable training + fine-tuning pipeline.
It shows that:
Screenshots/Recordings:
Additional Notes:
This is a minimal implementation:
Future work:
Checklist
Summary by CodeRabbit
New Features
Documentation
Chores
Dependencies
Bug Fixes