
feat: Sindri proof executor#82

Closed
katiemckeon wants to merge 14 commits into agglayer:main from Sindri-Labs:klm-sindri-service

Conversation

@katiemckeon

Description

This adds Sindri as a serverless, scalable GPU proving option for agglayer.

New feature (non-breaking change which adds functionality)

To evaluate this PR and use the SindriExecutor, you will need a Sindri account and an API key.

Deploying a build

The arguments for a proof on Sindri's platform differ from those of both the local and SP1 Network provers. The main difference is that you register (deploy) an ELF in a separate preparation phase. When you later request a proof, you refer to that ELF by an identifier rather than passing the executable. Makefile.elf.toml has been edited to show how you can use the Sindri Rust CLI to upload the most recent ELF build. The following command uploads the most recent build of crates/aggchain-proof-program to Sindri:

SINDRI_API_KEY=your-key-here cargo make ap-elf-sindri-upload

This associates the ELF with your own personal account. If you are interested, we could discuss "public projects", which would make the same ELF available to any user of Sindri's API + Agglayer.

Prover Config

A Sindri prover config has two familiar fields: proving-request-timeout and proving-timeout. The new fields map proof requests to a deployed ELF:

project-name = "pessimistic-proof"
project-tag = "latest"

The project-name field is the name you choose for your SP1 program. In crates/aggchain-proof-program, the new file sindri.json names it pessimistic-proof. Make sure that the name field in your Sindri manifest matches project-name in your prover config.

The project-tag field provides version control. This PR does not supply any tags in Makefile.elf.toml, but we demonstrate how you could add that information. When no tag is supplied, every ELF deployment uses the tag latest, which overwrites the previous deployment.
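As a rough illustration of how these four fields fit together, here is a minimal Rust sketch. The struct name `SindriProverConfig`, the constructor, and the `circuit_identifier` helper are hypothetical and are not taken from this PR. The `name:tag` identifier format is an assumption about how a deployed ELF is referenced, and the 5m and 10m defaults simply mirror the example config in this PR's diff.

```rust
use std::time::Duration;

/// Hypothetical mirror of the Sindri prover config table.
/// Field names follow the kebab-case keys described in the PR.
#[derive(Debug, Clone, PartialEq)]
pub struct SindriProverConfig {
    pub proving_request_timeout: Duration,
    pub proving_timeout: Duration,
    pub project_name: String,
    pub project_tag: String,
}

impl SindriProverConfig {
    /// Builds a config, falling back to the `latest` tag when none is given.
    pub fn new(project_name: &str, project_tag: Option<&str>) -> Self {
        Self {
            // Values copied from the example config in the diff ("5m" and "10m").
            proving_request_timeout: Duration::from_secs(5 * 60),
            proving_timeout: Duration::from_secs(10 * 60),
            project_name: project_name.to_string(),
            project_tag: project_tag.unwrap_or("latest").to_string(),
        }
    }

    /// ASSUMPTION: a deployed ELF is addressed as `name:tag`.
    pub fn circuit_identifier(&self) -> String {
        format!("{}:{}", self.project_name, self.project_tag)
    }
}
```

Under these assumptions, `SindriProverConfig::new("pessimistic-proof", None)` would resolve to the identifier `pessimistic-proof:latest`, which is why an untagged redeploy replaces whatever the prover was previously pointing at.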

PR Checklist:

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (unclear if this applies to externally contributed PRs)
  • I have added or updated tests that comprehensively prove my change is effective or that my feature works (end-to-end tests were performed, but some tests of the SindriExecutor more naturally fit in sindri-rust)

PR Creator Note

This PR shows how to deploy the aggchain-proof-program ELF to Sindri, but I was not able to run an end-to-end test for that particular program. I believe aggkit-prover is still under construction, and supplying SindriExecutor as the primary prover seemed to require more tweaks than I felt comfortable making.

However, I was able to run a satisfying hodge-podge test by:

  1. Revising agglayer to pull my forked prover branch
  2. Deploying the ELF from agglayer pessimistic-proof-program to Sindri
  3. Running kurtosis-cdk with the local agglayer image (and appropriate template changes for a Sindri prover config)
  4. Running lxly from the e2e repo

Happy to share more details about how to spin that up.

@katiemckeon requested a review from a team as a code owner on March 3, 2025 23:25

[primary-prover.sindri-prover]
proving-request-timeout = "5m"
proving-timeout = "10m"
Author

While running tests, you may notice that the proving-timeout in your config does not match the exact time at which the Sindri client times out, especially if you set an aggressive value of 1 minute or less. This is expected. For efficiency, the Sindri client holds a connection open for some time while waiting for a proof job to complete. (The extra delay will not exceed a few minutes.)
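To make the timeout overshoot concrete, here is a toy model in Rust. It is not the Sindri client's actual logic. It assumes, purely for illustration, that the client can only check its deadline each time a held connection returns, so the observed timeout is the configured value rounded up to a whole number of hold intervals. The function name `observed_timeout` and the hold interval are hypothetical.

```rust
use std::time::Duration;

/// Toy model: the deadline is only checked when a held connection returns,
/// so the observed timeout is `configured` rounded up to a multiple of `hold`.
/// ASSUMPTION: the real client's hold duration and polling logic may differ.
pub fn observed_timeout(configured: Duration, hold: Duration) -> Duration {
    assert!(!hold.is_zero(), "hold interval must be non-zero");
    let hold_ms = hold.as_millis();
    // Ceiling division: number of held requests needed to reach the deadline.
    let polls = (configured.as_millis() + hold_ms - 1) / hold_ms;
    // At least one request is always made, even for a zero timeout.
    let polls = polls.max(1);
    Duration::from_millis((polls * hold_ms) as u64)
}
```

With a hypothetical two-minute hold, a configured timeout of 30 seconds would be observed as two minutes in this model, while a 10-minute timeout would be observed as exactly 10 minutes. That matches the comment's point that the mismatch is most visible with very short timeouts.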

Comment on lines +4 to +6
"circuitType": "sp1",
"provingScheme": "plonk",
"sp1Version": "4.0.0",
Author

These three fields should stay fixed while you pin to SP1 v4.0.0, but the other fields (name and elfPath) are flexible. As mentioned in the PR description, name is entirely up to your team. The elfPath field gives the relative path from this Sindri manifest to the compiled ELF.

Comment on lines +366 to +373
if proof_response.status == JobStatus::Failed {
    return Err(Error::ProverFailed(
        proof_response.error.flatten().unwrap_or(
            "Sindri job was marked as failed. No error message was provided."
                .to_string(),
        ),
    ));
}
Author

Errors related to authorization or other configuration issues are caught by the map_err above. However, there may be cases where the Sindri proof request is made and a job runs, but the result is not a valid SP1 proof (for instance, if you map an input to the wrong ELF).
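The check in the diff can be exercised in isolation with stand-in types. The sketch below is self-contained and does not use the real sindri-rust types: `JobStatus`, `ProofResponse`, and `Error` are simplified, hypothetical stand-ins, and the helper name `check_job` is invented for illustration. The nested `Option<Option<String>>` is inferred from the `.flatten()` call in the snippet.

```rust
/// Simplified stand-in for the job status reported by the proving service.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
pub enum JobStatus {
    Queued,
    InProgress,
    Ready,
    Failed,
}

/// Simplified stand-in for a proof response. The nested Option models a
/// field that may be absent or present but null, hence `.flatten()` below.
#[derive(Debug, PartialEq)]
pub struct ProofResponse {
    pub status: JobStatus,
    pub error: Option<Option<String>>,
}

/// Simplified stand-in for the executor's error type.
#[derive(Debug, PartialEq)]
pub enum Error {
    ProverFailed(String),
}

/// Returns the response unchanged unless the job was marked as failed.
pub fn check_job(resp: ProofResponse) -> Result<ProofResponse, Error> {
    if resp.status == JobStatus::Failed {
        return Err(Error::ProverFailed(
            // Collapse the nested Option and fall back to a generic message.
            resp.error.flatten().unwrap_or_else(|| {
                "Sindri job was marked as failed. No error message was provided."
                    .to_string()
            }),
        ));
    }
    Ok(resp)
}
```

In this sketch, a failed job with `error: Some(Some("bad input".to_string()))` surfaces `"bad input"`, while `Some(None)` or `None` falls back to the generic message. The sketch uses `unwrap_or_else` so the fallback string is only allocated on the failure path where no message exists; the behavior otherwise matches the snippet in the diff.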

@katiemckeon changed the title from "Sindri proof executor" to "feat: Sindri proof executor" on Mar 17, 2025
@katiemckeon
Author

Closing in favor of #274
