fcmp++: Consolidate paths in RPC to get path by output ID#89
fcmp++: Consolidate paths in RPC to get path by output ID#89j-berman wants to merge 7 commits intoseraphis-migration:fcmp++-stagefrom
Conversation
1fe32c1 to
eac9644
Compare
|
Any metrics on how much this actually helps data size? |
|
Let's say you're requesting paths for the first 38 outputs in the tree, the tree has 6 layers, and the tree is 100% full. Individual path = 32 bytes*(38*3+18*2+1) = 4832 bytes Without consolidating, then the response is roughly 4832 bytes*38 = 183,616 bytes So in the worst (best?) case, the response should be 38x smaller with this PR |
|
Presumably people will end up requesting paths for outputs that are somewhat close in the tree (people are probably more likely to spend closer together than further apart time-wise), in which case there's decent benefit from this PR. Also this endpoint may be used for decoy selection algos at some point, which would bias toward recent (and with "binning", would select outputs right next to each other too). I figure makes sense for the RPC to be optimized now since it's pretty simple to do, and would be a little annoying to do later (would probably just implement a totally new endpoint and then we're maintaining both). |
src/blockchain_db/blockchain_db.h
Outdated
| virtual uint64_t get_path_by_global_output_id(const std::vector<uint64_t> &global_output_ids, | ||
| const uint64_t as_of_n_blocks, | ||
| std::vector<uint64_t> &leaf_idxs_out, | ||
| std::vector<fcmp_pp::curve_trees::AssignedLeafIdx> &leaf_idxs_out, |
There was a problem hiding this comment.
Sorry I'm slightly confused why this was changed? Aren't all of them going to be assigned or else the whole thing fails?
There was a problem hiding this comment.
Aren't all of them going to be assigned
If a user requests a coinbase output ID 15 blocks back from tip e.g., it won't be assigned
or else the whole thing fails
If the client in scan_tx requests an output not yet in the tree, it fails client-side. I figure it makes sense for the daemon to still be able to return a success response in this case though, since the endpoint is expected to have wider usage beyond just scan_tx.
c20ec30 to
a579b72
Compare
a579b72 to
64190fe
Compare
eac9644 to
4277705
Compare
src/fcmp_pp/curve_trees.h
Outdated
| END_KV_SERIALIZE_MAP() | ||
| }; | ||
|
|
||
| std::vector<LeafChunk> leaves_by_chunk_idx_vec; |
There was a problem hiding this comment.
Why don't these go inside BEGIN_KV_SERIALIZE_MAP?
There was a problem hiding this comment.
Because the KV_SERIALIZE macros require them to be members of the parent struct as written :/
EDIT: moved in the latest, just called the function the macros are calling directly (I saw @vtnerd do this in p2p ssl)
src/fcmp_pp/curve_trees.h
Outdated
| std::vector<std::unordered_map<uint64_t, ChunkBytes>> layer_chunks_by_chunk_idx; | ||
|
|
||
| // Serializable helper types | ||
| struct LeafChunk |
There was a problem hiding this comment.
These types can be private
src/fcmp_pp/curve_trees.h
Outdated
| std::vector<OutputContext> chunk; | ||
|
|
||
| BEGIN_KV_SERIALIZE_MAP() | ||
| KV_SERIALIZE(chunk_idx) |
There was a problem hiding this comment.
These field names will be duplicated in the payload a lot, is that acceptable for a consolidated format?
There was a problem hiding this comment.
This was a good spot. I de-duplicated in the latest, also de-duplicating the assigned leaf idxs.
Note: I can do the same/similar for init_tree_sync_data_t which may be worth it before v1 release
29f12ff to
7bfffa9
Compare
|
Needs rebase |
612c85c to
c7dfbc3
Compare
|
I think that since this RPC endpoint is being touched, it would be a good time to bring up changing the request from taking the DB-style global indices to the amount-style global indices, since amount-style indices are already widely used in the ecosystem for wallet<->daemon RPC calls, most notably in |
|
I disagree for the same reason as here. It makes more sense to move away from amount-style indices long term than to carry the debt forwards and harden it in a new RPC endpoint, since the amount-style indices are not actually On wallet side for spend proofs, the tree cache stores global indices for received outputs, so it is retrievable there. |
|
My bad, the tree cache doesn't actually save global indices long term either (only leaf idxs). Regardless, I think the better solution is either for the wallet to keep the global index saved too at some point, or to make the extra trip to the daemon, or implement a new endpoint that can take amount output ID's and then use this endpoint's logic internally (EDIT: or an endpoint to get global output ID's from amount output ID's). I think using global indices (not tethered to amount) is the more future-proof endpoint that has a stronger claim at v1 / long term maintenance. |
Does the wallet not trigger the tree cache to drop outputs that have been spent?
I think that if we're going to really commit to this indexing scheme, then yes, there should be an RPC endpoint to convert index types. |
|
The RPC endpoint to convert amount-style indices to DB-style indices should be very easy to implement with the current DB scheme, with a simple lookup in the |
It doesn't. I don't think we need that for v1 but it would be nice to have
I don't think it's fully necessary, it holds on to an index that should be deprecated. But if you feel it is/it's desired, then it seems worth a separate PR imo. I think my argument is sound for aiming to move forward with global output ID's |
Well actually, maybe my exact use case (spend proofs) is a reason against dropping outputs after they're spent: it erases the privacy impact of requesting output paths from the daemon to create them. And the storage cost per output is only It's easy for the daemon to calculate the path at any given reference block number because it knows A) all leaf tuples and B) the number of leaf tuples at each block. So maybe for v1 we don't even try to keep the historical path fetching local, and always rely on the daemon. To do that now, I can make the |
c7dfbc3 to
46fe81b
Compare
|
#278 has a ZMQ endpoint for converting legacy output ids to unified ids. It also switches ZMQ get_blocks to unified indices after the hard-fork. The "catch" is whether wallets should immediately switchover after the hard-fork, or whether they should do it lazily when needed. I'm in the process of rebasing + updating that PR btw. |
|
Also, if you use that ZMQ conversion endpoint, it effectively leaks your outputs to the daemon. I'm not sure how to do the conversion to unified in a privacy preserving way - I guess the wallet could ask for all values or something. |
Serialization guru, the
ConsolidatedPathsstruct is using serializable type helpers that are strictly used for serializing/de-serializing, and I wonder if there may be a reasonably simpler approach thereI also hardened the
scan_txtests to also test for spending immediately after scanning.