fix(network,sync): eliminate OMP data races on shared working variables#152
Merged
adequatelimited merged 1 commit intoaudit-fixesfrom Apr 13, 2026
Merged
Conversation
scan_quorum() declared a single shared `int result` used for two
semantically different purposes in the same OMP parallel loop:
1. Weight comparison inside OMP_CRITICAL_() (writes/reads under lock)
2. Peer-contribution counter OUTSIDE any critical section (writes
and reads across threads with no synchronization)
The conflation produced a real data race between threads: Thread A in
the critical section writing result = cmp256(...) while Thread B
outside any critical section executing result = 0 / result++. TSan
flagged the access pattern empirically on a live mainnet scan.
Split the variable into two thread-private variables declared at
function top and added to the OMP private() clause:
- weight_cmp: used only in the critical section for cmp256 result
- contrib: used only in the peer-counting loop outside the critical
section
Style: matches the existing codebase convention for OpenMP thread-local
state (see `node`, `peer`, `len`, `ipstr` in the same function's
private() clause). Block-scope declarations inside the OMP region work
equally well semantically but the function-top + private() form is
what the rest of this file uses.
Also applied the same style to `peer` and `ecode` in catchup() (sync.c)
while in the neighborhood — both were declared inside the OMP parallel
block body in C99 style. Moved to function top + private() clause for
consistency. No behavior change in catchup(); purely convention cleanup.
Closes #110
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
int resultinscan_quorum()into two thread-private variables (weight_cmp,contrib) — eliminates a real data racepeerandecodeincatchup()to the same function-top +private()clause style, matching the codebase conventionProblem
scan_quorum()used a single sharedresultvariable for two semantically different purposes — weight comparison insideOMP_CRITICAL_()and peer-contribution counting outside any critical section. Thread A writingresult = cmp256(...)inside the critical section while Thread B wroteresult = 0/result++outside produced a genuine C11 data race.Empirical evidence
Ran a TSan-instrumented build against a live mainnet scan:
resultresult/weight_cmp/contribRemaining reports are separate issues (scanidx/netplistidx/Rplistidx shared-state, peer.c
shuffle32/addpeer,resync()at sync.c:212) — out of scope for this PR, could be new audit findings.Changes
src/network.c—scan_quorum()int result;at function topint weight_cmp; int contrib;private()clauseresultwith the appropriate per-purpose variablesrc/sync.c—catchup()word32 peer;andint ecode;from inside-the-OMP-region declarations to function topprivate()clausepeer = plist[OMP_THREADNUM],ecode = VEOK) now happen at the top of the parallel region body since OMPprivate()doesn't initializeTest plan
make(mochimo binary) passesweight_cmp,contrib,peer(catchup), orecode(catchup)Closes #110