
fcmp++: Consolidate paths in RPC to get path by output ID#89

Open
j-berman wants to merge 7 commits into seraphis-migration:fcmp++-stage from j-berman:consolidated-paths

Conversation

@j-berman
Collaborator

Serialization gurus: the ConsolidatedPaths struct uses serializable helper types that are strictly for serializing/de-serializing, and I wonder if there may be a simpler approach there.

I also hardened the scan_tx tests to cover spending immediately after scanning.

@j-berman mentioned this pull request Aug 21, 2025
75 tasks
@j-berman force-pushed the consolidated-paths branch from 1fe32c1 to eac9644 Compare August 21, 2025 17:38
@jeffro256
Collaborator

Any metrics on how much this actually helps data size?

@j-berman
Collaborator Author

Let's say you're requesting paths for the first 38 outputs in the tree, the tree has 6 layers, and the tree is 100% full.

Individual path = 32 bytes*(38*3+18*2+1) = 4832 bytes

Without consolidating, the response is roughly 4832 bytes*38 = 183,616 bytes
With consolidating, it's just 4832 bytes (since the first 38 outputs in the tree all share the same path)

So in the worst (best?) case, the response should be 38x smaller with this PR

@j-berman
Collaborator Author

Presumably people will end up requesting paths for outputs that are somewhat close in the tree (people are probably more likely to spend closer together than further apart time-wise), in which case there's decent benefit from this PR. Also this endpoint may be used for decoy selection algos at some point, which would bias toward recent (and with "binning", would select outputs right next to each other too). I figure makes sense for the RPC to be optimized now since it's pretty simple to do, and would be a little annoying to do later (would probably just implement a totally new endpoint and then we're maintaining both).

virtual uint64_t get_path_by_global_output_id(const std::vector<uint64_t> &global_output_ids,
    const uint64_t as_of_n_blocks,
-   std::vector<uint64_t> &leaf_idxs_out,
+   std::vector<fcmp_pp::curve_trees::AssignedLeafIdx> &leaf_idxs_out,
Collaborator


Sorry I'm slightly confused why this was changed? Aren't all of them going to be assigned or else the whole thing fails?

Collaborator Author


Aren't all of them going to be assigned

If a user requests, e.g., a coinbase output ID 15 blocks back from the tip, it won't be assigned yet

or else the whole thing fails

If the client in scan_tx requests an output not yet in the tree, it fails client-side. I figure it makes sense for the daemon to still be able to return a success response in this case though, since the endpoint is expected to have wider usage beyond just scan_tx.

END_KV_SERIALIZE_MAP()
};

std::vector<LeafChunk> leaves_by_chunk_idx_vec;
Collaborator


Why don't these go inside BEGIN_KV_SERIALIZE_MAP?

Collaborator Author

j-berman commented Oct 23, 2025


Because the KV_SERIALIZE macros require them to be members of the parent struct as written :/

EDIT: moved in the latest, just called the function the macros are calling directly (I saw @vtnerd do this in p2p ssl)

std::vector<std::unordered_map<uint64_t, ChunkBytes>> layer_chunks_by_chunk_idx;

// Serializable helper types
struct LeafChunk
Collaborator


These types can be private

std::vector<OutputContext> chunk;

BEGIN_KV_SERIALIZE_MAP()
KV_SERIALIZE(chunk_idx)
Collaborator

jeffro256 commented Sep 26, 2025


These field names will be duplicated in the payload a lot, is that acceptable for a consolidated format?

Collaborator Author


This was a good spot. I de-duplicated in the latest, also de-duplicating the assigned leaf idxs.

Note: I can do the same/similar for init_tree_sync_data_t which may be worth it before v1 release

@jeffro256
Collaborator

Needs rebase

@jeffro256
Collaborator

I think that since this RPC endpoint is being touched, it would be a good time to bring up changing the request from taking the DB-style global indices to the amount-style global indices, since amount-style indices are already widely used in the ecosystem for wallet<->daemon RPC calls, most notably in get_blocks.bin, whereas the DB-style global indices aren't, AFAIK. As far as I can tell, the only place the DB-style ones are used is for this RPC call, and they are fetched by wallet2 using an RPC endpoint which also serves amount-style indices. @vtnerd mentioned an issue with requesting paths in the LWS because of the indexing system. I'm trying to implement FCMP++ spend proofs, which requires fetching the paths as of a certain block for certain outputs, for which wallet2 has the amount-style indices on the sender side, but not the DB-style indices.

@j-berman
Collaborator Author

j-berman commented Dec 16, 2025

I disagree for the same reason as here. It makes more sense to move away from amount-style indices long term than to carry the debt forwards and harden it in a new RPC endpoint, since the amount-style indices are not actually optimal for anything but legacy ring signatures. That was also my reasoning for using global indices in this endpoint.

On wallet side for spend proofs, the tree cache stores global indices for received outputs, so it is retrievable there.

@j-berman
Collaborator Author

j-berman commented Dec 16, 2025

My bad, the tree cache doesn't actually save global indices long term either (only leaf idxs). Regardless, I think the better solution is either for the wallet to keep the global index saved too at some point, or to make the extra trip to the daemon, or to implement a new endpoint that can take amount output IDs and then use this endpoint's logic internally (EDIT: or an endpoint to get global output IDs from amount output IDs). I think using global indices (not tethered to amount) is the more future-proof endpoint that has a stronger claim at v1 / long term maintenance.

@jeffro256
Collaborator

On wallet side for spend proofs, the tree cache stores global indices for received outputs, so it is retrievable there.

Does the wallet not trigger the tree cache to drop outputs that have been spent?

I think using global indices (not tethered to amount) is the more future-proof endpoint that has a stronger claim at v1 / long term maintenance.

or an endpoint to get global output ID's from amount output ID's

I think that if we're going to really commit to this indexing scheme, then yes, there should be an RPC endpoint to convert index types.

@jeffro256
Collaborator

The RPC endpoint to convert amount-style indices to DB-style indices should be very easy to implement with the current DB scheme, with a simple lookup in the output_amounts table and reading the outkey::output_id field.

@j-berman
Collaborator Author

Does the wallet not trigger the tree cache to drop outputs that have been spent?

It doesn't. I don't think we need that for v1 but it would be nice to have

I think that if we're going to really commit to this indexing scheme, then yes, there should be an RPC endpoint to convert index types

I don't think it's fully necessary; it holds on to an index that should be deprecated. But if you feel it's desired, then it seems worth a separate PR imo. I think my argument is sound for aiming to move forward with global output IDs

@jeffro256
Collaborator

It doesn't. I don't think we need that for v1 but it would be nice to have

Well actually, maybe my exact use case (spend proofs) is a reason against dropping outputs after they're spent: it erases the privacy impact of requesting output paths from the daemon to create them. And the storage cost per output is only $O(\log N)$, where $N$ is the number of leaf tuples in the tree. There might be one problem though: a spend proof needs to get the historical path of an output given its reference block number, so the tree cache would need to track the first reference block at which each node becomes available in the tree. Not only that, it would have to do that for each child of the nodes we want to keep in the tree, which would otherwise have been pruned, so that it can recalculate nodes on the "edge" of the path at the desired reference block number. Gross...

It's easy for the daemon to calculate the path at any given reference block number because it knows A) all leaf tuples and B) the number of leaf tuples at each block. So maybe for v1 we don't even try to keep the historical path fetching local, and always rely on the daemon. To do that now, I can make /get_transactions requests and use the returned global_output_ids in the response, then call /get_path_by_global_output_id.bin with those global IDs. I should take care to include all the global IDs in the tx, not just the ones we need for the proof, for a sliver of privacy. This use case would definitely benefit from consolidated paths in the RPC.

@j-berman force-pushed the consolidated-paths branch from c7dfbc3 to 46fe81b Compare April 3, 2026 18:35
@vtnerd

vtnerd commented Apr 3, 2026

#278 has a ZMQ endpoint for converting legacy output ids to unified ids. It also switches ZMQ get_blocks to unified indices after the hard-fork. The "catch" is whether wallets should immediately switchover after the hard-fork, or whether they should do it lazily when needed.

I'm in the process of rebasing + updating that PR btw.

@vtnerd

vtnerd commented Apr 3, 2026

Also, if you use that ZMQ conversion endpoint, it effectively leaks your outputs to the daemon. I'm not sure how to do the conversion to unified in a privacy preserving way - I guess the wallet could ask for all values or something.
