Skip to content

Promote audit fixes F-23, F-31, F-32 to master#161

Merged
adequatelimited merged 6 commits intomasterfrom
audit-fixes
Apr 13, 2026
Merged

Promote audit fixes F-23, F-31, F-32 to master#161
adequatelimited merged 6 commits intomasterfrom
audit-fixes

Conversation

@adequatelimited
Copy link
Copy Markdown
Collaborator

Promotes three audit remediations from `audit-fixes` to `master`.

Contents

Test plan

  • Per-PR test plans executed (build clean, unit tests pass, live sync validation where applicable).
  • Full build clean under `-Wall -Werror -Wextra -Wpedantic -fopenmp`.
  • No consensus-path behavioral changes; all three are correctness fixes on error/race paths.

…1-record read

read_tfile() returns size_t but previously returned VERROR (==1) on
fopen() failure, which is indistinguishable from a successful read of
one trailer. Callers that check `!= 1` or `<= 0` silently accepted the
error as success and continued with uninitialized buffer data on the
consensus-critical paths (ng_val, contention, send_found).

Return 0 on error instead. All existing callers already check `!= 1`,
`<= 0`, or `!= NTFTX` which correctly detect 0.

Also wire up the pre-existing TODO in send_found() to check count !=
NTFTX and exit the forked child on incomplete read.

Unit test in src/test/tfile-read-error.c covers the three cases:
missing file, partial read, and seek-past-EOF.
…eturn

fix(tfile): read_tfile() returns 0 on error instead of 1 (F-23, #99)
get_hash() leaked np->sd on send_op() or recv_tx() failure: both early
returns skipped sock_close(). Under repeated peer probing failures
(common on unreachable or misbehaving peers) this exhausts the fd
limit and prevents new outbound connections.

Restructure with a single CLEANUP label so every exit path -- success,
I/O failure, and malformed response -- closes the socket before
returning. Matches the DROP_CLEANUP/ERROR_CLEANUP idiom used elsewhere
in the codebase.
fix(network): close socket on every exit path in get_hash() (F-31, #117)
The while(count && ecode == VEOK) condition at sync.c:138 reads the
shared count outside any OMP_CRITICAL_ section, while the two
decrements are inside critical sections. Per the OpenMP memory model,
the flush on critical-exit is not paired with an unsynchronized
reader, so the read can observe stale cached values (or be a data
race by C11; TSan will flag it).

Shadow the parameter with a local `volatile word32 count = count_in`.
volatile forces a memory read each iteration; combined with the
writers' implicit flush on critical-exit, this establishes the
needed happens-before. Same treatment applied to the analogous
ecode read in tfile.c under F-10.

Analyzed the underflow claim from the original issue and confirmed it
is not reachable -- each thread's loop body decrements at most once
before breaking. See github.com//issues/112 comment
for the trace.
fix(sync): close data race on catchup() count read (F-32, #112)
@adequatelimited adequatelimited merged commit c6a3f9b into master Apr 13, 2026
4 of 5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant