Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Greptile SummaryThis PR removes Confidence Score: 4/5Safe to merge after addressing the nil-safety regression in HasRunningQBFTInstance One P2 finding: HasRunningQBFTInstance now directly dereferences RunningInstance.State.Decided without the nil-check that the old IsDecided() provided, introducing a potential panic in deserialization edge cases where Instance.State could be nil. All other changes are correct refactors or positive behavioural improvements (e.g., removing the silent nil-CurrentDuty marshal path in runner_state.go, correctly returning nil instead of an error when a pruned committee runner's timeout arrives). protocol/v2/ssv/runner/runner.go (HasRunningQBFTInstance nil-safety) Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[Slot Timer Fires] --> B[onTimeout callback]
B --> C{Queue initialised?}
C -- No --> D[Log Error/Debug and return]
C -- Yes --> E[Push TimeoutData msg to queue]
E --> F[Sequential message-processing loop]
F --> G{Message type?}
G -- SSVEventMsgType/Timeout --> H[OnTimeoutQBFT]
H --> I{hasDutyRunning?}
I -- No --> J[return nil - duty already ended]
I -- Yes --> K{height == currentDutySlot?}
K -- No --> L[return nil - stale timeout for old slot]
K -- Yes --> M[QBFTController.OnTimeout]
G -- Other --> N[ProcessPreConsensus / ProcessConsensus / ProcessPostConsensus]
Reviews (4): Last reviewed commit: "revert the removal of queue filter" | Re-trigger Greptile |
|
@greptile pls re-review |
|
@greptile could you re-review ? |
|
@greptile pls re-review |
Regarding that concern, the behavior is intentional - the func (b *BaseRunner) HasRunningQBFTInstance() bool {
// Note: RunningInstance.State cannot be nil for existing RunningInstance by construction.
return b.hasDutyRunning() && b.HasStartedQBFTInstance() && !b.State.RunningInstance.State.Decided
}I think
|
ljuba-ssv
left a comment
There was a problem hiding this comment.
Clean PR — the mutex removal is well-reasoned and the timeout-via-queue approach is solid. A few minor nits inline.
This PR removes the need for
BaseRunner.mtxthat complicated the code quite a bit,the only real reason to have it were
onTimeoutfunc(s) that would checkHasRunningDuty()and skip the issuance of corresponding timeout even if necessary (as per @momosh-ssv discovery) ... now we do this instead:HasRunningDuty()check in theonTimeoutfunc(s) themselves, we just issue the timeout event every time (it's pretty rare, so the overhead from the added events is negligible)StartQueueConsumerandConsumeQueuefuncs) with every other duty-related message/event, they are all processed sequentially (due to reading from same thread-safe queue, in the same for-loop)HasRunningDuty()check is performed during the actual processing of those timeout events (seecase types.Timeout:)Additional notes:
Related:
F-ssv-186F-ssv-054https://github.com/ssvlabs/ssv-node-board/issues/764