refactor: reduce needs on cs_main in network processing#6990
refactor: reduce needs on cs_main in network processing#6990PastaPastaPasta wants to merge 2 commits intodashpay:developfrom
Conversation
|
Important Review skippedReview was skipped as selected files did not have any reviewable changes. 💤 Files selected but had no reviewable changes (1)
You can disable this status message by setting the ✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
✅ No Merge Conflicts DetectedThis PR currently has no conflicts with other open PRs. |
7030342 to
3a8b312
Compare
|
This pull request has conflicts, please rebase. |
12f5253 to
96ff7be
Compare
| //! Length of current-streak of unconnecting headers announcements | ||
| int nUnconnectingHeaders{0}; | ||
| int nUnconnectingHeaders GUARDED_BY(m_mutex){0}; |
There was a problem hiding this comment.
can it be atomic?
also fSyncStarted; nBlocksInFlight, fPreferredDownload, fPreferHeaders', fPreferHeadersCmpressed?
There was a problem hiding this comment.
Maybe, but need to be more careful that full operation is atomic. I figured just using a mutex currently would be simplier
|
This pull request has conflicts, please rebase. |
|
✅ Review complete (commit 96ff7be) |
thepastaclaw
left a comment
There was a problem hiding this comment.
Code Review
I did not validate a concrete correctness regression in src/net_processing.cpp on this SHA, but this is still a large concurrency-sensitive refactor of peer-state and object-request locking with no added regression coverage. The main reviewable concern is merge risk from behavioral drift in lifecycle/accounting paths that now depend on the new per-node and object-request mutexes.
Reviewed commit: 96ff7be
🟡 1 suggestion(s)
🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.
In `src/net_processing.cpp`:
- [SUGGESTION] lines 439-589: This lock-splitting refactor still lacks targeted regression coverage
This patch moves `CNodeState` off `cs_main`, introduces `shared_ptr`-managed node-state lifetime plus a per-node mutex, and splits object-request coordination onto `g_object_request_mutex`, but it adds no unit or functional tests. Because the touched code governs header sync state, in-flight block/object accounting, eviction, and disconnect cleanup, the main merge risk here is silent behavioral drift rather than an obvious compile failure. Adding at least one focused regression test around disconnect cleanup / request bookkeeping (or a deterministic unit test for the moved counters and request/erase flow) would make this refactor materially safer to merge.
| @@ -568,17 +572,22 @@ struct CNodeState { | |||
| std::chrono::microseconds m_check_expiry_timer{0}; | |||
| }; | |||
|
|
|||
| ObjectDownloadState m_object_download; | |||
| ObjectDownloadState m_object_download GUARDED_BY(m_mutex); | |||
|
|
|||
| //! Whether this peer is an inbound connection | |||
| const bool m_is_inbound; | |||
|
|
|||
| //! A rolling bloom filter of all announced tx CInvs to this peer. | |||
| CRollingBloomFilter m_recently_announced_invs = CRollingBloomFilter{INVENTORY_MAX_RECENT_RELAY, 0.000001}; | |||
| CRollingBloomFilter m_recently_announced_invs GUARDED_BY(m_mutex){INVENTORY_MAX_RECENT_RELAY, 0.000001}; | |||
|
|
|||
| CNodeState(bool is_inbound) : m_is_inbound(is_inbound) {} | |||
| }; | |||
|
|
|||
| // Keeps track of the time (in microseconds) when transactions were requested last time | |||
| unordered_limitedmap<uint256, std::chrono::microseconds, StaticSaltedHasher> g_already_asked_for(MAX_INV_SZ, MAX_INV_SZ * 2); | |||
| unordered_limitedmap<uint256, std::chrono::microseconds, StaticSaltedHasher> g_erased_object_requests(MAX_INV_SZ, MAX_INV_SZ * 2); | |||
| Mutex g_object_request_mutex; | |||
There was a problem hiding this comment.
🟡 Suggestion: This lock-splitting refactor still lacks targeted regression coverage
This patch moves CNodeState off cs_main, introduces shared_ptr-managed node-state lifetime plus a per-node mutex, and splits object-request coordination onto g_object_request_mutex, but it adds no unit or functional tests. Because the touched code governs header sync state, in-flight block/object accounting, eviction, and disconnect cleanup, the main merge risk here is silent behavioral drift rather than an obvious compile failure. Adding at least one focused regression test around disconnect cleanup / request bookkeeping (or a deterministic unit test for the moved counters and request/erase flow) would make this refactor materially safer to merge.
source: ['codex-general', 'coordinator-manual']
🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.
In `src/net_processing.cpp`:
- [SUGGESTION] lines 439-589: This lock-splitting refactor still lacks targeted regression coverage
This patch moves `CNodeState` off `cs_main`, introduces `shared_ptr`-managed node-state lifetime plus a per-node mutex, and splits object-request coordination onto `g_object_request_mutex`, but it adds no unit or functional tests. Because the touched code governs header sync state, in-flight block/object accounting, eviction, and disconnect cleanup, the main merge risk here is silent behavioral drift rather than an obvious compile failure. Adding at least one focused regression test around disconnect cleanup / request bookkeeping (or a deterministic unit test for the moved counters and request/erase flow) would make this refactor materially safer to merge.
Issue being fixed or feature implemented
see: #6953 (comment) -- based on that collected data; the cs_main contention in network processing is ~17% of cs_main contention. If we exclude the islock cs_main (resolved (or mostly resolved don't recall) in that PR) the percentage grows to 30%; if we can minimize contention here, it will benefit our overall cs_main contention data.
What was done?
convert CConman to a structure that is more in line with peerman; namely, objects protect their own internal data, and the vector stores shared pointers, so that we take a shared_ptr from the vector, and then can use it as we desire without lifetime concerns.
Be aware, I would like to upstream these changes, so that we don't have to maintain them long term, however there's no guarantee that will be viable.
How Has This Been Tested?
Deployed testing was minimal; after 6953 is merged; I'd like to deploy this version to some testnet nodes and analyze contention.
Breaking Changes
None
Checklist:
Go over all the following points, and put an
xin all the boxes that apply.