test: adding compatibility test to perform syncing #3023

chatton · 2026-01-28T10:20:19Z

Overview

This PR adds a compatibility test that will deploy a specified image, execute some transactions, then deploy the PR image and verify that syncing still works correctly.

closes #3019

Add EVMCompatTestSuite to test syncing between different ev-node versions. The test: - Starts sequencer with version A (env: SEQUENCER_EVM_IMAGE_*) - Submits transactions - Starts full node with version B (env: FULLNODE_EVM_IMAGE_*) - Verifies full node syncs and can query transactions - Submits post-sync transactions to verify ongoing sync Add make test-docker-compat target for running the test.

…ther

claude · 2026-01-28T10:20:41Z

Claude finished @chatton's task in 2m 44s —— View job

Code Review Complete

I've conducted a comprehensive review of this PR, which adds a valuable compatibility test to ensure cross-version syncing works correctly. Here are my findings:

Tasks

Review repository guidelines (CLAUDE.md)
Review new compatibility test suite
Review Docker E2E utility refactoring
Check dependency updates
Assess error handling and resilience patterns
Review test quality and coverage
Provide comprehensive feedback

Summary

This PR introduces a robust Docker E2E compatibility test that verifies a newer full node can sync from an older sequencer. The implementation is generally solid, with good test structure and helpful utility refactoring. However, there are several areas that could be improved for production readiness.

Critical Issues

1. Test Flakiness Risk - Fixed Sleep (Priority: High)

Location: test/docker-e2e/compat_test.go:68

The test uses a fixed 5-second sleep to wait for DA propagation:

// wait for blocks to be posted to DA before starting full node
time.Sleep(5 * time.Second)

Issue: Fixed sleeps introduce flakiness as timing can vary across environments (CI, local, different machine specs). This violates the CLAUDE.md guideline: "Ensure tests are deterministic."

Recommendation: Replace with a conditional wait based on observable state (similar to Gemini's suggestion). Wait for additional blocks to be produced, which provides implicit confirmation that DA propagation has occurred:

s.Run("wait_for_da_propagation", func() {
    s.T().Log("Waiting for DA propagation by observing block production...")
    initialHeight, err := s.sequencerClient.BlockNumber(ctx)
    s.Require().NoError(err)
    
    const blockMargin = 3
    targetHeight := initialHeight + blockMargin
    
    s.Require().Eventually(func() bool {
        height, err := s.sequencerClient.BlockNumber(ctx)
        return err == nil && height >= targetHeight
    }, 30*time.Second, time.Second, 
        "sequencer did not produce %d blocks for DA propagation", blockMargin)
    
    s.T().Logf("DA propagation confirmed via block production to height %d", targetHeight)
})

Fix this →

2. Goroutine Leak Risk - No Context Cancellation (Priority: High)

Location: test/docker-e2e/compat_test.go:45-92

Issue: The test uses context.Background() without any cancellation or timeout mechanism. If the test hangs (network issues, DA unavailability, container problems), goroutines and resources will leak. This violates CLAUDE.md: "Be mindful of goroutine leaks."

Recommendation: Use a context with timeout:

func (s *EVMCompatTestSuite) TestCrossVersionSync() {
    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Minute)
    defer cancel()
    
    s.setupDockerEnvironment()
    // ... rest of test
}

Fix this →

3. Incomplete Error Context (Priority: Medium)

Location: test/docker-e2e/compat_test.go:94-165

Issue: Several error assertions lack sufficient context for debugging failures:

s.Require().NoError(err)  // Line 120, 123, 158, 161

Recommendation: Add descriptive error messages following CLAUDE.md: "Wrap errors with context using fmt.Errorf":

s.Require().NoError(err, "failed to build sequencer chain with image %s:%s", 
    image.Repository, image.Version)
s.Require().NoError(s.sequencerNode.Start(ctx), 
    "failed to start sequencer node")
s.Require().NoError(s.fullNode.WriteFile(ctx, "config/genesis.json", genesis),
    "failed to write genesis.json to full node")

Fix this →

Moderate Issues

4. Race Condition in Transaction Nonce Management (Priority: Medium)

Location: test/docker-e2e/compat_test.go:191-202

Issue: The submitTransactions function mutates s.txNonce which is shared state. If tests are run concurrently (via t.Parallel()), this could cause nonce collisions.

Current code:

tx := evm.GetRandomTransaction(s.T(), evmTestPrivateKey, evmTestToAddress, 
    evmTestChainID, evmTestGasLimit, &s.txNonce)

Observation: Currently safe since subtests use s.Run() (sequential) not t.Run() (parallel). However, CLAUDE.md emphasizes: "Be careful with concurrent access to shared state."

Recommendation: Document this assumption or use atomic operations if parallel execution is planned:

// submitTransactions submits count transactions sequentially.
// Note: Not safe for concurrent use due to shared nonce state.
func (s *EVMCompatTestSuite) submitTransactions(ctx context.Context, count int) []common.Hash {

5. Missing Test Isolation (Priority: Medium)

Location: test/docker-e2e/compat_test.go:38-43

Issue: The test suite doesn't clean up resources (Docker containers, volumes) in TearDown methods. If the test fails mid-execution, containers and volumes will persist, potentially interfering with subsequent test runs.

Recommendation: Add cleanup:

func (s *EVMCompatTestSuite) TearDownSuite() {
    ctx := context.Background()
    if s.sequencerNode != nil {
        _ = s.sequencerNode.Remove(ctx)
    }
    if s.fullNode != nil {
        _ = s.fullNode.Remove(ctx)
    }
    if s.sequencerClient != nil {
        s.sequencerClient.Close()
    }
    if s.fullNodeClient != nil {
        s.fullNodeClient.Close()
    }
}

6. TODO in Production Workflow (Priority: Medium)

Location: .github/workflows/docker-tests.yml:68

SEQUENCER_EVM_IMAGE_TAG: main # TODO: set this to the last released tag

Issue: Using main instead of the last stable release means the test doesn't verify actual backward compatibility with released versions.

Recommendation: Update to use the most recent release tag (appears to be v1.0.0-rc.1 based on changelog):

SEQUENCER_EVM_IMAGE_TAG: v1.0.0-rc.1

Fix this →

Code Quality & Architecture

7. Excellent Refactoring ✅

Location: test/docker-e2e/docker_test.go

Positive: The refactoring to extract shared EVM constants (evmTestChainID, evmTestPrivateKey, etc.) and helper functions (SetupEVMSingle, WaitForEVMHealthy) is exemplary:

✅ Reduces duplication
✅ Improves maintainability
✅ Follows DRY principle
✅ Makes tests more readable

The rename from RethSetup to RethSetupConfig is more idiomatic Go (config structs should have "Config" suffix).

8. Good Test Structure ✅

Location: test/docker-e2e/compat_test.go:45-92

Positive: The test is well-organized using subtests with descriptive names:

s.Run("setup_celestia_and_da", ...)
s.Run("setup_sequencer", ...)
s.Run("submit_transactions", ...)
s.Run("setup_fullnode_and_sync", ...)
s.Run("verify_sync", ...)

This makes test output clear and enables targeted re-runs. Follows best practices from CLAUDE.md: "Use table-driven tests where appropriate."

9. P2P Address Construction is Fragile (Priority: Low)

Location: test/docker-e2e/compat_test.go:172-189

Issue: The getSequencerP2PAddress function manually constructs a libp2p multiaddr by reading node_key.json, unmarshaling, and reconstructing the peer ID. This is tightly coupled to the node key format.

Observation: This is acceptable for tests, but consider whether ev-node has an API endpoint that returns the P2P multiaddr directly, which would be more maintainable.

Recommendation: If a direct API exists, use it:

// Check if ev-node exposes /net/info or similar endpoint
addr, err := s.sequencerNode.GetP2PAddress(ctx) // hypothetical

Security

10. Hardcoded Test Credentials ✅ (Priority: Informational)

Location: test/docker-e2e/docker_test.go:44

evmTestPrivateKey = "cece4f25ac74deb1468965160c7185e07dff413f23fcadb611b05ca37ab0a52e"

Status: Acceptable for test-only code since it's clearly marked as test credentials and not used in production. This follows CLAUDE.md's guidance to avoid exposing private keys in production, but test fixtures are exempt.

Performance & Resource Efficiency

11. Efficient Test Parallelization Opportunity (Priority: Low)

Location: .github/workflows/docker-tests.yml

Observation: The three Docker test jobs (docker-tests, docker-upgrade-tests, test-docker-compat) run sequentially but could run in parallel since they use different test filters and images.

Recommendation: Consider adding a needs: [] to allow parallel execution if CI resources permit:

jobs:
  docker-tests:
    # ... existing config
  
  docker-upgrade-tests:
    needs: []  # Run in parallel with docker-tests
    # ... existing config
  
  test-docker-compat:
    needs: []  # Run in parallel with other jobs
    # ... existing config

This would reduce CI runtime but increase resource usage.

Testing & Reliability

12. Good Coverage of Sync Scenarios ✅

Location: test/docker-e2e/compat_test.go:82-91

Positive: The test validates both historical sync (transactions before full node starts) AND ongoing sync (transactions after full node joins):

s.Run("verify_sync", func() {
    // Verify pre-sync transactions
    s.verifyTransactionsOnFullNode(ctx, preSyncTxHashes)
    
    // Submit more and verify ongoing sync
    postSyncTxHashes := s.submitTransactions(ctx, 5)
    latestHeight, err := s.sequencerClient.BlockNumber(ctx)
    s.Require().NoError(err)
    s.waitForSync(ctx, latestHeight)
    s.verifyTransactionsOnFullNode(ctx, postSyncTxHashes)
})

This comprehensive approach catches both historical state sync and live block processing issues.

13. Timeout Values Should Be Configurable (Priority: Low)

Location: Multiple locations

Observation: Timeouts are hardcoded throughout:

test/docker-e2e/compat_test.go:208: 120*time.Second
test/docker-e2e/docker_test.go:296: 60*time.Second
test/docker-e2e/docker_test.go:337: 120*time.Second

Recommendation: Consider extracting to constants or environment variables for CI environments with different performance characteristics:

const (
    defaultNodeStartTimeout  = 60 * time.Second
    defaultSyncTimeout       = 120 * time.Second
    defaultHealthCheckTimeout = 120 * time.Second
)

Documentation & Developer Experience

14. Missing Function Documentation (Priority: Low)

Location: Multiple helper functions

Issue: Several exported test helpers lack godoc comments:

SetupRethNode (line 271)
SetupCelestiaAndDABridge (line 308)
SetupEthClient (line 349)
WaitForTxIncluded (line 361)

Recommendation: Add documentation following CLAUDE.md: "Document exported types and functions":

// SetupRethNode creates, starts, and waits for a Reth execution client.
// Returns RethSetupConfig containing URLs, JWT secret, and genesis hash.
func (s *DockerTestSuite) SetupRethNode(ctx context.Context, name string) RethSetupConfig {

Dependency Updates Review

15. Go Version Bump ✅

Multiple go.mod files: go 1.25.0 → go 1.25.6

Status: Good. Patch version updates typically include bug fixes and security patches.

16. Dependency Version Updates ✅

Key updates in test/docker-e2e/go.mod:

github.com/celestiaorg/tastora → v0.10.0
github.com/libp2p/go-libp2p/core → v0.43.0-rc2 (new dependency)
github.com/docker/docker → v28.5.2+incompatible

Analysis:

✅ Tastora update aligns with the new test framework functionality
✅ libp2p/core import is now explicit (good for dependency clarity)
⚠️ Docker SDK major version jump (v27 → v28) - appears to be working based on passing CI

Recommendation: Verify Docker SDK API compatibility if any test failures occur.

Additional Observations

17. Build Tags Are Correct ✅

Location: test/docker-e2e/compat_test.go:1

//go:build docker_e2e && evm

Properly restricts test to Docker E2E runs with EVM support, preventing accidental execution in unit test runs.

18. Image Configuration Pattern ✅

Location: test/docker-e2e/compat_test.go:228-250

The getSequencerImage() and getFullNodeImage() functions follow the existing pattern from getEvNodeImage() in docker_test.go, using environment variables with sensible defaults. This is good for CI flexibility.

Summary Recommendations by Priority

Must Fix (High Priority)

✅ Replace fixed sleep with conditional wait for DA propagation (line 68)
✅ Add context timeout to prevent goroutine leaks (line 46)
✅ Update TODO in workflow to use last released tag (workflow line 68)

Should Fix (Medium Priority)

Add descriptive error messages to all Require().NoError() calls
Implement TearDownSuite() for proper test cleanup
Document nonce management threading assumptions

Nice to Have (Low Priority)

Extract timeout constants for configurability
Add godoc comments to exported test helpers
Consider P2P address API instead of manual construction
Enable parallel CI job execution if resources permit

Overall Assessment

Code Quality: ⭐⭐⭐⭐ (4/5) - Well-structured, good refactoring, solid test coverage

Production Readiness: ⭐⭐⭐ (3/5) - Needs fixes for test flakiness and resource cleanup

Adherence to Guidelines: ⭐⭐⭐⭐ (4/5) - Follows most CLAUDE.md patterns, minor gaps in error handling

Risk Level: Low-Medium - The main risks are test flakiness (fixed sleep) and resource leaks (no cleanup), but these don't affect production code.

This is a solid PR that adds valuable compatibility testing infrastructure. The refactoring work is excellent and will benefit future test development. Addressing the high-priority items will make the tests more reliable and maintainable.

github-actions · 2026-01-28T10:20:47Z

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

Build	Format	Lint	Breaking	Updated (UTC)
`✅ passed`	`⏩ skipped`	`✅ passed`	`✅ passed`	Jan 29, 2026, 2:20 PM

gemini-code-assist · 2026-01-28T10:20:49Z

Summary of Changes

Hello @chatton, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the testing infrastructure by introducing a robust Docker E2E compatibility test. This new test ensures that different versions of the ev-node-evm can seamlessly synchronize, specifically verifying that a newer full node can correctly process and validate data from an older sequencer. The changes also include refactoring of test utilities to promote cleaner, more maintainable code and updating core dependencies to their latest versions, contributing to overall system stability and future compatibility.

Highlights

New Compatibility Test Suite: A new Docker E2E test suite, EVMCompatTestSuite, has been introduced to verify cross-version syncing capabilities of the ev-node-evm. This test deploys an older sequencer image, executes transactions, and then deploys the current PR image as a full node to ensure it can correctly sync and validate the historical transactions.
Docker E2E Test Utilities Refactoring: Existing Docker E2E test utilities have been refactored for improved reusability and clarity. This includes defining shared EVM test constants, renaming RethSetup to RethSetupConfig, and introducing EVMSingleSetupConfig and SetupEVMSingle helper functions to streamline node setup.
Dependency and Go Version Updates: The Go version has been updated from 1.25.0 to 1.25.6, and several Go module dependencies, including github.com/fatih/color, github.com/google/flatbuffers, github.com/hashicorp/go-hclog, and github.com/celestiaorg/tastora, have been updated to newer versions. New dependencies like github.com/libp2p/go-libp2p/core were also added.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Ignored Files

Ignored by pattern: .github/workflows/** (1)
- .github/workflows/docker-tests.yml

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request introduces a valuable compatibility test suite to ensure syncing works across different node versions. The new EVMCompatTestSuite effectively simulates a real-world scenario where a new full node syncs from a sequencer running a different version. The changes also include beneficial refactoring of the existing test suite, such as centralizing test constants and creating reusable helper functions for setting up test nodes. My review includes one suggestion to improve the robustness of the new test by replacing a fixed sleep with a conditional wait.

gemini-code-assist · 2026-01-28T10:23:36Z

test/docker-e2e/compat_test.go

+	time.Sleep(5 * time.Second)
+
+	sequencerHeight, err := s.sequencerClient.BlockNumber(ctx)
+	s.Require().NoError(err)


Using a fixed time.Sleep can introduce flakiness into tests, as the time required for an operation can vary between test runs. A more robust approach is to wait for a specific condition. Here, you're waiting for blocks to be posted to the DA. While directly querying the DA might be complex, you can achieve a more reliable wait by ensuring a few more blocks are produced by the sequencer. This provides a time window for DA propagation that is tied to block production rather than a fixed duration.

s.T().Log("Waiting for a few blocks to allow for DA propagation...") lastTxHeight, err := s.sequencerClient.BlockNumber(ctx) s.Require().NoError(err) const daPropagationBlocks = 3 targetHeight := lastTxHeight + daPropagationBlocks s.Require().Eventually(func() bool { h, err := s.sequencerClient.BlockNumber(ctx) return err == nil && h >= targetHeight }, 15*time.Second, time.Second, "sequencer did not produce new blocks for DA propagation") sequencerHeight, err := s.sequencerClient.BlockNumber(ctx) s.Require().NoError(err)

codecov · 2026-01-28T10:39:49Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 55.51%. Comparing base (bcd3375) to head (f0fdb39).

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #3023      +/-   ##
==========================================
- Coverage   55.54%   55.51%   -0.04%     
==========================================
  Files         116      116              
  Lines       11507    11507              
==========================================
- Hits         6392     6388       -4     
- Misses       4413     4417       +4     
  Partials      702      702

Flag	Coverage Δ
combined	`55.51% <ø> (-0.04%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

chatton added 6 commits January 28, 2026 09:12

chore: refactor to use evm config struct

70e1667

test: adding compatibility test to ensure one version can sync to ano…

8674cb5

…ther

chore: adding comparison to sequnecer block number

e8e98dc

wip: bumping tastora version

368d0a3

chore: adding workflow to run test

fda381c

github-actions bot assigned chatton Jan 28, 2026

gemini-code-assist bot reviewed Jan 28, 2026

View reviewed changes

chore: go mod tidy all

2f65f85

chatton added 5 commits January 28, 2026 10:50

Merge branch 'main' into cian/compat-test

22a4b0b

chore: go mod tidy fix

f001abf

Merge branch 'main' into cian/compat-test

37cd5b3

chore: run correct test

18e81c6

Merge branch 'main' into cian/compat-test

f0fdb39

chatton marked this pull request as ready for review January 29, 2026 14:54

julienrbrt approved these changes Jan 29, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test: adding compatibility test to perform syncing #3023

test: adding compatibility test to perform syncing #3023

Uh oh!

chatton commented Jan 28, 2026

Uh oh!

claude bot commented Jan 28, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Jan 28, 2026 •

edited

Loading

Uh oh!

gemini-code-assist bot commented Jan 28, 2026

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

gemini-code-assist bot Jan 28, 2026

Uh oh!

codecov bot commented Jan 28, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

test: adding compatibility test to perform syncing #3023

Are you sure you want to change the base?

test: adding compatibility test to perform syncing #3023

Uh oh!

Conversation

chatton commented Jan 28, 2026

Overview

Uh oh!

claude bot commented Jan 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review Complete

Tasks

Summary

Critical Issues

1. Test Flakiness Risk - Fixed Sleep (Priority: High)

2. Goroutine Leak Risk - No Context Cancellation (Priority: High)

3. Incomplete Error Context (Priority: Medium)

Moderate Issues

4. Race Condition in Transaction Nonce Management (Priority: Medium)

5. Missing Test Isolation (Priority: Medium)

6. TODO in Production Workflow (Priority: Medium)

Code Quality & Architecture

7. Excellent Refactoring ✅

8. Good Test Structure ✅

9. P2P Address Construction is Fragile (Priority: Low)

Security

10. Hardcoded Test Credentials ✅ (Priority: Informational)

Performance & Resource Efficiency

11. Efficient Test Parallelization Opportunity (Priority: Low)

Testing & Reliability

12. Good Coverage of Sync Scenarios ✅

13. Timeout Values Should Be Configurable (Priority: Low)

Documentation & Developer Experience

14. Missing Function Documentation (Priority: Low)

Dependency Updates Review

15. Go Version Bump ✅

16. Dependency Version Updates ✅

Additional Observations

17. Build Tags Are Correct ✅

18. Image Configuration Pattern ✅

Summary Recommendations by Priority

Must Fix (High Priority)

Should Fix (Medium Priority)

Nice to Have (Low Priority)

Overall Assessment

Uh oh!

github-actions bot commented Jan 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gemini-code-assist bot commented Jan 28, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist bot Jan 28, 2026

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Jan 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

claude bot commented Jan 28, 2026 •

edited

Loading

github-actions bot commented Jan 28, 2026 •

edited

Loading

codecov bot commented Jan 28, 2026 •

edited

Loading