-
Notifications
You must be signed in to change notification settings - Fork 247
test: adding compatibility test to perform syncing #3023
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Add EVMCompatTestSuite to test syncing between different ev-node versions. The test: - Starts sequencer with version A (env: SEQUENCER_EVM_IMAGE_*) - Submits transactions - Starts full node with version B (env: FULLNODE_EVM_IMAGE_*) - Verifies full node syncs and can query transactions - Submits post-sync transactions to verify ongoing sync Add make test-docker-compat target for running the test.
|
Claude finished @chatton's task in 2m 44s —— View job Code Review CompleteI've conducted a comprehensive review of this PR, which adds a valuable compatibility test to ensure cross-version syncing works correctly. Here are my findings: Tasks
SummaryThis PR introduces a robust Docker E2E compatibility test that verifies a newer full node can sync from an older sequencer. The implementation is generally solid, with good test structure and helpful utility refactoring. However, there are several areas that could be improved for production readiness. Critical Issues1. Test Flakiness Risk - Fixed Sleep (Priority: High)Location: The test uses a fixed 5-second sleep to wait for DA propagation: // wait for blocks to be posted to DA before starting full node
time.Sleep(5 * time.Second)Issue: Fixed sleeps introduce flakiness as timing can vary across environments (CI, local, different machine specs). This violates the CLAUDE.md guideline: "Ensure tests are deterministic." Recommendation: Replace with a conditional wait based on observable state (similar to Gemini's suggestion). Wait for additional blocks to be produced, which provides implicit confirmation that DA propagation has occurred: s.Run("wait_for_da_propagation", func() {
s.T().Log("Waiting for DA propagation by observing block production...")
initialHeight, err := s.sequencerClient.BlockNumber(ctx)
s.Require().NoError(err)
const blockMargin = 3
targetHeight := initialHeight + blockMargin
s.Require().Eventually(func() bool {
height, err := s.sequencerClient.BlockNumber(ctx)
return err == nil && height >= targetHeight
}, 30*time.Second, time.Second,
"sequencer did not produce %d blocks for DA propagation", blockMargin)
s.T().Logf("DA propagation confirmed via block production to height %d", targetHeight)
})2. Goroutine Leak Risk - No Context Cancellation (Priority: High)Location: Issue: The test uses Recommendation: Use a context with timeout: func (s *EVMCompatTestSuite) TestCrossVersionSync() {
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Minute)
defer cancel()
s.setupDockerEnvironment()
// ... rest of test
}3. Incomplete Error Context (Priority: Medium)Location: Issue: Several error assertions lack sufficient context for debugging failures: s.Require().NoError(err) // Line 120, 123, 158, 161Recommendation: Add descriptive error messages following CLAUDE.md: "Wrap errors with context using s.Require().NoError(err, "failed to build sequencer chain with image %s:%s",
image.Repository, image.Version)
s.Require().NoError(s.sequencerNode.Start(ctx),
"failed to start sequencer node")
s.Require().NoError(s.fullNode.WriteFile(ctx, "config/genesis.json", genesis),
"failed to write genesis.json to full node")Moderate Issues4. Race Condition in Transaction Nonce Management (Priority: Medium)Location: Issue: The Current code: tx := evm.GetRandomTransaction(s.T(), evmTestPrivateKey, evmTestToAddress,
evmTestChainID, evmTestGasLimit, &s.txNonce)Observation: Currently safe since subtests use Recommendation: Document this assumption or use atomic operations if parallel execution is planned: // submitTransactions submits count transactions sequentially.
// Note: Not safe for concurrent use due to shared nonce state.
func (s *EVMCompatTestSuite) submitTransactions(ctx context.Context, count int) []common.Hash {5. Missing Test Isolation (Priority: Medium)Location: Issue: The test suite doesn't clean up resources (Docker containers, volumes) in TearDown methods. If the test fails mid-execution, containers and volumes will persist, potentially interfering with subsequent test runs. Recommendation: Add cleanup: func (s *EVMCompatTestSuite) TearDownSuite() {
ctx := context.Background()
if s.sequencerNode != nil {
_ = s.sequencerNode.Remove(ctx)
}
if s.fullNode != nil {
_ = s.fullNode.Remove(ctx)
}
if s.sequencerClient != nil {
s.sequencerClient.Close()
}
if s.fullNodeClient != nil {
s.fullNodeClient.Close()
}
}6. TODO in Production Workflow (Priority: Medium)Location: SEQUENCER_EVM_IMAGE_TAG: main # TODO: set this to the last released tagIssue: Using Recommendation: Update to use the most recent release tag (appears to be SEQUENCER_EVM_IMAGE_TAG: v1.0.0-rc.1Code Quality & Architecture7. Excellent Refactoring ✅Location: Positive: The refactoring to extract shared EVM constants (
The rename from 8. Good Test Structure ✅Location: Positive: The test is well-organized using subtests with descriptive names: s.Run("setup_celestia_and_da", ...)
s.Run("setup_sequencer", ...)
s.Run("submit_transactions", ...)
s.Run("setup_fullnode_and_sync", ...)
s.Run("verify_sync", ...)This makes test output clear and enables targeted re-runs. Follows best practices from CLAUDE.md: "Use table-driven tests where appropriate." 9. P2P Address Construction is Fragile (Priority: Low)Location: Issue: The Observation: This is acceptable for tests, but consider whether ev-node has an API endpoint that returns the P2P multiaddr directly, which would be more maintainable. Recommendation: If a direct API exists, use it: // Check if ev-node exposes /net/info or similar endpoint
addr, err := s.sequencerNode.GetP2PAddress(ctx) // hypotheticalSecurity10. Hardcoded Test Credentials ✅ (Priority: Informational)Location: evmTestPrivateKey = "cece4f25ac74deb1468965160c7185e07dff413f23fcadb611b05ca37ab0a52e"Status: Acceptable for test-only code since it's clearly marked as test credentials and not used in production. This follows CLAUDE.md's guidance to avoid exposing private keys in production, but test fixtures are exempt. Performance & Resource Efficiency11. Efficient Test Parallelization Opportunity (Priority: Low)Location: Observation: The three Docker test jobs ( Recommendation: Consider adding a jobs:
docker-tests:
# ... existing config
docker-upgrade-tests:
needs: [] # Run in parallel with docker-tests
# ... existing config
test-docker-compat:
needs: [] # Run in parallel with other jobs
# ... existing configThis would reduce CI runtime but increase resource usage. Testing & Reliability12. Good Coverage of Sync Scenarios ✅Location: Positive: The test validates both historical sync (transactions before full node starts) AND ongoing sync (transactions after full node joins): s.Run("verify_sync", func() {
// Verify pre-sync transactions
s.verifyTransactionsOnFullNode(ctx, preSyncTxHashes)
// Submit more and verify ongoing sync
postSyncTxHashes := s.submitTransactions(ctx, 5)
latestHeight, err := s.sequencerClient.BlockNumber(ctx)
s.Require().NoError(err)
s.waitForSync(ctx, latestHeight)
s.verifyTransactionsOnFullNode(ctx, postSyncTxHashes)
})This comprehensive approach catches both historical state sync and live block processing issues. 13. Timeout Values Should Be Configurable (Priority: Low)Location: Multiple locations Observation: Timeouts are hardcoded throughout:
Recommendation: Consider extracting to constants or environment variables for CI environments with different performance characteristics: const (
defaultNodeStartTimeout = 60 * time.Second
defaultSyncTimeout = 120 * time.Second
defaultHealthCheckTimeout = 120 * time.Second
)Documentation & Developer Experience14. Missing Function Documentation (Priority: Low)Location: Multiple helper functions Issue: Several exported test helpers lack godoc comments:
Recommendation: Add documentation following CLAUDE.md: "Document exported types and functions": // SetupRethNode creates, starts, and waits for a Reth execution client.
// Returns RethSetupConfig containing URLs, JWT secret, and genesis hash.
func (s *DockerTestSuite) SetupRethNode(ctx context.Context, name string) RethSetupConfig {Dependency Updates Review15. Go Version Bump ✅Multiple go.mod files: Status: Good. Patch version updates typically include bug fixes and security patches. 16. Dependency Version Updates ✅Key updates in
Analysis:
Recommendation: Verify Docker SDK API compatibility if any test failures occur. Additional Observations17. Build Tags Are Correct ✅Location: //go:build docker_e2e && evmProperly restricts test to Docker E2E runs with EVM support, preventing accidental execution in unit test runs. 18. Image Configuration Pattern ✅Location: The Summary Recommendations by PriorityMust Fix (High Priority)
Should Fix (Medium Priority)
Nice to Have (Low Priority)
Overall AssessmentCode Quality: ⭐⭐⭐⭐ (4/5) - Well-structured, good refactoring, solid test coverage Production Readiness: ⭐⭐⭐ (3/5) - Needs fixes for test flakiness and resource cleanup Adherence to Guidelines: ⭐⭐⭐⭐ (4/5) - Follows most CLAUDE.md patterns, minor gaps in error handling Risk Level: Low-Medium - The main risks are test flakiness (fixed sleep) and resource leaks (no cleanup), but these don't affect production code. This is a solid PR that adds valuable compatibility testing infrastructure. The refactoring work is excellent and will benefit future test development. Addressing the high-priority items will make the tests more reliable and maintainable. |
|
The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).
|
Summary of ChangesHello @chatton, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the testing infrastructure by introducing a robust Docker E2E compatibility test. This new test ensures that different versions of the Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Ignored Files
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a valuable compatibility test suite to ensure syncing works across different node versions. The new EVMCompatTestSuite effectively simulates a real-world scenario where a new full node syncs from a sequencer running a different version. The changes also include beneficial refactoring of the existing test suite, such as centralizing test constants and creating reusable helper functions for setting up test nodes. My review includes one suggestion to improve the robustness of the new test by replacing a fixed sleep with a conditional wait.
| time.Sleep(5 * time.Second) | ||
|
|
||
| sequencerHeight, err := s.sequencerClient.BlockNumber(ctx) | ||
| s.Require().NoError(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a fixed time.Sleep can introduce flakiness into tests, as the time required for an operation can vary between test runs. A more robust approach is to wait for a specific condition. Here, you're waiting for blocks to be posted to the DA. While directly querying the DA might be complex, you can achieve a more reliable wait by ensuring a few more blocks are produced by the sequencer. This provides a time window for DA propagation that is tied to block production rather than a fixed duration.
s.T().Log("Waiting for a few blocks to allow for DA propagation...")
lastTxHeight, err := s.sequencerClient.BlockNumber(ctx)
s.Require().NoError(err)
const daPropagationBlocks = 3
targetHeight := lastTxHeight + daPropagationBlocks
s.Require().Eventually(func() bool {
h, err := s.sequencerClient.BlockNumber(ctx)
return err == nil && h >= targetHeight
}, 15*time.Second, time.Second, "sequencer did not produce new blocks for DA propagation")
sequencerHeight, err := s.sequencerClient.BlockNumber(ctx)
s.Require().NoError(err)
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3023 +/- ##
==========================================
- Coverage 55.54% 55.51% -0.04%
==========================================
Files 116 116
Lines 11507 11507
==========================================
- Hits 6392 6388 -4
- Misses 4413 4417 +4
Partials 702 702
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Overview
This PR adds a compatibility test that will deploy a specified image, execute some transactions, then deploy the PR image and verify that syncing still works correctly.
closes #3019