
Show sync worker info #4823

Open
jackzhhuang wants to merge 5 commits into dual-verse-dag from sync-tool

Conversation

@jackzhhuang
Contributor

@jackzhhuang jackzhhuang commented Feb 26, 2026

Pull request type

Please check the type of change your PR introduces:

  • Bugfix
  • Feature
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes, no api changes)
  • Build related changes
  • Documentation content changes
  • Other (please describe):

What is the current behavior?

Issue Number: N/A

What is the new behavior?

Other information

Summary by CodeRabbit

New Features

  • Parallel block synchronization now includes dedicated worker tracking and statistics collection.
  • Sync progress reports have been enhanced with detailed parallel metrics, including per-worker block counts and aggregate synchronization statistics.
  • Improved monitoring of parallel sync operations with comprehensive worker lifecycle management and state reporting.

@coderabbitai

coderabbitai bot commented Feb 26, 2026

📝 Walkthrough

Walkthrough

This PR introduces a ParallelInfoService actor for tracking parallel sync worker activity and statistics. The service is registered at node startup, integrated into the sync pipeline, and exposes aggregated per-worker and total block synchronization counts via the sync API and progress reporting.

Changes

Cohort / File(s) / Summary

  • Node Initialization (node/src/node.rs): Registered ParallelInfoService in the node startup sequence before SyncService and dependent components.
  • API Data Structures (sync/api/src/lib.rs): Added ParallelWorkerSyncStat and ParallelSyncStat structs and extended SyncProgressReport with an optional parallel field to expose aggregated sync statistics.
  • Parallel Service Module (sync/src/parallel/mod.rs, sync/src/parallel/parallel_info_service.rs): Introduced the ParallelInfoService actor with snapshot generation, worker registration/unregistration, block sync reporting, and statistics retrieval; implemented the async trait ParallelInfoAsyncService.
  • Worker Lifecycle Management (sync/src/parallel/sender.rs): Enhanced DagBlockWorker and DagBlockSender with worker ID allocation, registration tracking, worker state reporting to ParallelInfoService, and ID recycling pool management.
  • Sync Task Integration (sync/src/sync.rs, sync/src/tasks/mod.rs, sync/src/tasks/block_sync_task.rs, sync/src/tasks/inner_sync_task.rs): Threaded ParallelInfoService through the full sync task pipeline, added the full_sync_task_with_parallel_info function, updated progress reporting to include parallel statistics, and integrated the service into worker lifecycle events.
  • Test Updates (sync/src/parallel/tests.rs): Updated the constructor call site to include the new parallel_info_service parameter in DagBlockSender::new.
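Based on the struct names and fields mentioned in this summary and in the review comments (worker_count, total_synced_block_count, per-worker counts), the new API types and the service's snapshot aggregation might look roughly like the sketch below. All field names and the snapshot helper are assumptions inferred from the review, not the actual sync/api code:

```rust
use std::collections::HashMap;

// Hypothetical sketch of the stats exposed by ParallelInfoService;
// field names follow the review discussion, not the real definitions.
#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct ParallelWorkerSyncStat {
    pub worker_id: u64,
    pub synced_block_count: u64,
}

#[derive(Debug, Clone, Default, PartialEq, Eq)]
pub struct ParallelSyncStat {
    pub worker_count: u64,
    pub total_synced_block_count: u64,
    pub workers: Vec<ParallelWorkerSyncStat>,
}

// Aggregate per-worker counts into a snapshot, as the walkthrough
// describes for the service's snapshot generation.
pub fn snapshot(worker_synced_blocks: &HashMap<u64, u64>) -> ParallelSyncStat {
    let mut workers: Vec<ParallelWorkerSyncStat> = worker_synced_blocks
        .iter()
        .map(|(&worker_id, &synced_block_count)| ParallelWorkerSyncStat {
            worker_id,
            synced_block_count,
        })
        .collect();
    // Sort by worker ID so repeated progress reports are stable.
    workers.sort_by_key(|w| w.worker_id);
    ParallelSyncStat {
        worker_count: workers.len() as u64,
        total_synced_block_count: workers.iter().map(|w| w.synced_block_count).sum(),
        workers,
    }
}
```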

Sequence Diagram

sequenceDiagram
    participant Node
    participant PIS as ParallelInfoService
    participant DBS as DagBlockSender
    participant API as SyncAPI
    
    Node->>PIS: Initialize & Register Service
    activate PIS
    
    DBS->>PIS: register_worker(worker_id)
    note over PIS: Track: worker_id → synced_blocks=0
    
    DBS->>DBS: Execute block via worker
    
    DBS->>PIS: report_worker_synced_block(worker_id)
    note over PIS: Increment synced_blocks for worker<br/>Generate snapshot with totals
    
    API->>PIS: get_parallel_sync_stat()
    activate PIS
    PIS-->>API: ParallelSyncStat {workers, total_count}
    deactivate PIS
    
    API-->>Node: SyncProgressReport {parallel: Some(...)}
    
    DBS->>PIS: unregister_worker(worker_id)
    note over PIS: Cleanup worker entry
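The "ID recycling pool management" listed for sync/src/parallel/sender.rs, together with the BinaryHeap<Reverse<ParallelWorkerId>> type visible in the refactor suggestion further down, suggests a min-heap free list: released worker IDs are reused smallest-first before a fresh ID is allocated. A hedged, self-contained sketch of that scheme, with illustrative names:

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

// Hypothetical sketch of DagBlockSender's worker-ID allocation;
// names and types are illustrative, not the actual sender code.
pub struct WorkerIdPool {
    next_worker_id: u32,
    // BinaryHeap is a max-heap; wrapping IDs in Reverse makes pop()
    // return the smallest free ID first.
    free_worker_ids: BinaryHeap<Reverse<u32>>,
}

impl WorkerIdPool {
    pub fn new() -> Self {
        Self {
            next_worker_id: 0,
            free_worker_ids: BinaryHeap::new(),
        }
    }

    // Reuse the smallest recycled ID if one exists, otherwise mint a new one.
    pub fn allocate(&mut self) -> u32 {
        if let Some(Reverse(id)) = self.free_worker_ids.pop() {
            id
        } else {
            let id = self.next_worker_id;
            self.next_worker_id += 1;
            id
        }
    }

    // Return an ID to the pool when its worker is closed.
    pub fn release(&mut self, id: u32) {
        self.free_worker_ids.push(Reverse(id));
    }
}
```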

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

Possibly related PRs

Suggested labels

ci-runner: jack

Suggested reviewers

  • sanlee42
  • jolestar
  • nkysg

Poem

🐰 In sync we dance, with workers fleet,
Each block reported, the flow's so neat,
Stats gathered wide, from here to there,
Parallel progress, tracked with care!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: docstring coverage is 10.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions that are missing them.

✅ Passed checks (2 passed)

  • Description Check ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title check ✅ Passed: the title 'Show sync worker info' is clear and directly related to the main changes, which add a ParallelInfoService and infrastructure to track and expose per-worker synchronization statistics.




@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f4d3c73f3e


Comment on lines +94 to +95
let synced_block_count = self.worker_synced_blocks.entry(msg.worker_id).or_insert(0);
*synced_block_count = synced_block_count.saturating_add(1);


P1 Badge Reject sync reports from unknown worker IDs

ReportWorkerSyncedBlockRequest currently creates a new entry when the worker ID is missing, which makes the stats vulnerable to out-of-order actor messages: worker IDs are reused from 0 in each DagBlockSender instance (sync/src/parallel/sender.rs, next_worker_id: 0), and senders are recreated repeatedly during sync (sync/src/tasks/block_sync_task.rs, DagBlockSender::new(...)), so a delayed report from an old sender can resurrect or inflate a worker in a new cycle and corrupt parallel progress output.
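A minimal sketch of the hardened handler this finding suggests, assuming a HashMap-backed counter inside the service: use get_mut so that stale reports for unregistered worker IDs are dropped instead of resurrecting an entry via entry().or_insert(0). The type and method names here are illustrative stand-ins for the actual actor messages:

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the service's internal state.
pub struct ParallelInfo {
    worker_synced_blocks: HashMap<u32, u64>,
}

impl ParallelInfo {
    pub fn new() -> Self {
        Self {
            worker_synced_blocks: HashMap::new(),
        }
    }

    pub fn register_worker(&mut self, worker_id: u32) {
        self.worker_synced_blocks.entry(worker_id).or_insert(0);
    }

    // Returns false for reports from unknown worker IDs instead of
    // creating a new entry, so a delayed message from an old
    // DagBlockSender cycle cannot inflate a new cycle's stats.
    pub fn report_worker_synced_block(&mut self, worker_id: u32) -> bool {
        match self.worker_synced_blocks.get_mut(&worker_id) {
            Some(count) => {
                *count = count.saturating_add(1);
                true
            }
            None => false, // stale or out-of-order report; ignore it
        }
    }
}
```

Returning a bool (or logging the rejection) keeps the drop observable without corrupting the counters.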


Comment on lines +305 to 309
Self::report_worker_synced_block(
    &self.parallel_info_service,
    worker.worker_id,
);
self.notifier.notify(*executed_block)?;


P2 Badge Count blocks only after notify succeeds

In wait_for_finish, the worker stat is incremented before self.notifier.notify(...) is executed; if notify returns an error (for example storage/connect failure), sync aborts but the block is already counted as synced, so the reported worker totals become inaccurate for failure paths. This differs from flush_executor_state, which reports only after successful notify.
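The suggested ordering can be sketched generically: propagate the notify error before touching the counter, so failure paths never count a block as synced. The function below is an illustrative stand-in under that assumption, not the actual wait_for_finish code:

```rust
// Hypothetical sketch: report the synced block only after notify
// succeeds, mirroring the flush_executor_state ordering described
// above. The closures stand in for the real notifier and the
// ParallelInfoService report call.
fn handle_executed_block<E>(
    notify: impl FnOnce() -> Result<(), E>,
    report_synced: impl FnOnce(),
) -> Result<(), E> {
    // A storage/connect failure aborts here, before the stat update,
    // so the worker totals stay accurate on error paths.
    notify()?;
    report_synced();
    Ok(())
}
```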



@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (2)
sync/api/src/lib.rs (1)

137-141: Change worker_count to a fixed-width integer type for stable wire-level behavior.

worker_count: usize is platform-width dependent in this serialized public API type. Use u64 for consistent schema and serialization across 32-bit and 64-bit builds.

Update the assignment in sync/src/parallel/parallel_info_service.rs:33 to cast workers.len() appropriately (e.g., worker_count: workers.len() as u64).

Suggested change
 #[derive(Debug, Clone, Default, Serialize, Deserialize, JsonSchema, PartialEq, Eq)]
 pub struct ParallelSyncStat {
-    pub worker_count: usize,
+    pub worker_count: u64,
     pub total_synced_block_count: u64,
     pub workers: Vec<ParallelWorkerSyncStat>,
 }
sync/src/parallel/sender.rs (1)

355-379: Consider extracting shared cleanup logic.

The abort_workers method and Drop implementation have identical cleanup logic. While the duplication is safe (due to the registered guard in mark_worker_closed), consider extracting to a shared helper method for maintainability.

♻️ Suggested refactor to reduce duplication
+    fn cleanup_worker(
+        parallel_info_service: &Option<ServiceRef<ParallelInfoService>>,
+        free_worker_ids: &mut BinaryHeap<Reverse<ParallelWorkerId>>,
+        worker: &mut DagBlockWorker,
+    ) {
+        let _ = worker.sender_to_executor.try_send(None);
+        worker.handle.abort();
+        Self::mark_worker_closed(parallel_info_service, free_worker_ids, worker);
+    }
+
     fn abort_workers(&mut self) {
         for worker in &mut self.executors {
-            let _ = worker.sender_to_executor.try_send(None);
-            worker.handle.abort();
-            Self::mark_worker_closed(
+            Self::cleanup_worker(
                 &self.parallel_info_service,
                 &mut self.free_worker_ids,
                 worker,
             );
         }
     }
 }
 
 impl Drop for DagBlockSender<'_> {
     fn drop(&mut self) {
         for worker in &mut self.executors {
-            let _ = worker.sender_to_executor.try_send(None);
-            worker.handle.abort();
-            Self::mark_worker_closed(
+            Self::cleanup_worker(
                 &self.parallel_info_service,
                 &mut self.free_worker_ids,
                 worker,
             );
         }
     }
 }

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b5fcd84 and f4d3c73.

📒 Files selected for processing (10)
  • node/src/node.rs
  • sync/api/src/lib.rs
  • sync/src/parallel/mod.rs
  • sync/src/parallel/parallel_info_service.rs
  • sync/src/parallel/sender.rs
  • sync/src/parallel/tests.rs
  • sync/src/sync.rs
  • sync/src/tasks/block_sync_task.rs
  • sync/src/tasks/inner_sync_task.rs
  • sync/src/tasks/mod.rs
