Skip to content

Conversation

@chatton
Copy link
Contributor

@chatton chatton commented Jan 28, 2026

Overview

This PR adds a compatibility test that will deploy a specified image, execute some transactions, then deploy the PR image and verify that syncing still works correctly.

closes #3019

Add EVMCompatTestSuite to test syncing between different ev-node
versions. The test:
- Starts sequencer with version A (env: SEQUENCER_EVM_IMAGE_*)
- Submits transactions
- Starts full node with version B (env: FULLNODE_EVM_IMAGE_*)
- Verifies full node syncs and can query transactions
- Submits post-sync transactions to verify ongoing sync

Add make test-docker-compat target for running the test.
@claude
Copy link
Contributor

claude bot commented Jan 28, 2026

Claude finished @chatton's task in 2m 44s —— View job


Code Review Complete

I've conducted a comprehensive review of this PR, which adds a valuable compatibility test to ensure cross-version syncing works correctly. Here are my findings:

Tasks

  • Review repository guidelines (CLAUDE.md)
  • Review new compatibility test suite
  • Review Docker E2E utility refactoring
  • Check dependency updates
  • Assess error handling and resilience patterns
  • Review test quality and coverage
  • Provide comprehensive feedback

Summary

This PR introduces a robust Docker E2E compatibility test that verifies a newer full node can sync from an older sequencer. The implementation is generally solid, with good test structure and helpful utility refactoring. However, there are several areas that could be improved for production readiness.

Critical Issues

1. Test Flakiness Risk - Fixed Sleep (Priority: High)

Location: test/docker-e2e/compat_test.go:68

The test uses a fixed 5-second sleep to wait for DA propagation:

// wait for blocks to be posted to DA before starting full node
time.Sleep(5 * time.Second)

Issue: Fixed sleeps introduce flakiness as timing can vary across environments (CI, local, different machine specs). This violates the CLAUDE.md guideline: "Ensure tests are deterministic."

Recommendation: Replace with a conditional wait based on observable state (similar to Gemini's suggestion). Wait for additional blocks to be produced, which provides implicit confirmation that DA propagation has occurred:

s.Run("wait_for_da_propagation", func() {
    s.T().Log("Waiting for DA propagation by observing block production...")
    initialHeight, err := s.sequencerClient.BlockNumber(ctx)
    s.Require().NoError(err)
    
    const blockMargin = 3
    targetHeight := initialHeight + blockMargin
    
    s.Require().Eventually(func() bool {
        height, err := s.sequencerClient.BlockNumber(ctx)
        return err == nil && height >= targetHeight
    }, 30*time.Second, time.Second, 
        "sequencer did not produce %d blocks for DA propagation", blockMargin)
    
    s.T().Logf("DA propagation confirmed via block production to height %d", targetHeight)
})

Fix this →


2. Goroutine Leak Risk - No Context Cancellation (Priority: High)

Location: test/docker-e2e/compat_test.go:45-92

Issue: The test uses context.Background() without any cancellation or timeout mechanism. If the test hangs (network issues, DA unavailability, container problems), goroutines and resources will leak. This violates CLAUDE.md: "Be mindful of goroutine leaks."

Recommendation: Use a context with timeout:

func (s *EVMCompatTestSuite) TestCrossVersionSync() {
    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Minute)
    defer cancel()
    
    s.setupDockerEnvironment()
    // ... rest of test
}

Fix this →


3. Incomplete Error Context (Priority: Medium)

Location: test/docker-e2e/compat_test.go:94-165

Issue: Several error assertions lack sufficient context for debugging failures:

s.Require().NoError(err)  // Line 120, 123, 158, 161

Recommendation: Add descriptive error messages following CLAUDE.md: "Wrap errors with context using fmt.Errorf":

s.Require().NoError(err, "failed to build sequencer chain with image %s:%s", 
    image.Repository, image.Version)
s.Require().NoError(s.sequencerNode.Start(ctx), 
    "failed to start sequencer node")
s.Require().NoError(s.fullNode.WriteFile(ctx, "config/genesis.json", genesis),
    "failed to write genesis.json to full node")

Fix this →


Moderate Issues

4. Race Condition in Transaction Nonce Management (Priority: Medium)

Location: test/docker-e2e/compat_test.go:191-202

Issue: The submitTransactions function mutates s.txNonce which is shared state. If tests are run concurrently (via t.Parallel()), this could cause nonce collisions.

Current code:

tx := evm.GetRandomTransaction(s.T(), evmTestPrivateKey, evmTestToAddress, 
    evmTestChainID, evmTestGasLimit, &s.txNonce)

Observation: Currently safe since subtests use s.Run() (sequential) not t.Run() (parallel). However, CLAUDE.md emphasizes: "Be careful with concurrent access to shared state."

Recommendation: Document this assumption or use atomic operations if parallel execution is planned:

// submitTransactions submits count transactions sequentially.
// Note: Not safe for concurrent use due to shared nonce state.
func (s *EVMCompatTestSuite) submitTransactions(ctx context.Context, count int) []common.Hash {

5. Missing Test Isolation (Priority: Medium)

Location: test/docker-e2e/compat_test.go:38-43

Issue: The test suite doesn't clean up resources (Docker containers, volumes) in TearDown methods. If the test fails mid-execution, containers and volumes will persist, potentially interfering with subsequent test runs.

Recommendation: Add cleanup:

func (s *EVMCompatTestSuite) TearDownSuite() {
    ctx := context.Background()
    if s.sequencerNode != nil {
        _ = s.sequencerNode.Remove(ctx)
    }
    if s.fullNode != nil {
        _ = s.fullNode.Remove(ctx)
    }
    if s.sequencerClient != nil {
        s.sequencerClient.Close()
    }
    if s.fullNodeClient != nil {
        s.fullNodeClient.Close()
    }
}

6. TODO in Production Workflow (Priority: Medium)

Location: .github/workflows/docker-tests.yml:68

SEQUENCER_EVM_IMAGE_TAG: main # TODO: set this to the last released tag

Issue: Using main instead of the last stable release means the test doesn't verify actual backward compatibility with released versions.

Recommendation: Update to use the most recent release tag (appears to be v1.0.0-rc.1 based on changelog):

SEQUENCER_EVM_IMAGE_TAG: v1.0.0-rc.1

Fix this →


Code Quality & Architecture

7. Excellent Refactoring

Location: test/docker-e2e/docker_test.go

Positive: The refactoring to extract shared EVM constants (evmTestChainID, evmTestPrivateKey, etc.) and helper functions (SetupEVMSingle, WaitForEVMHealthy) is exemplary:

  • ✅ Reduces duplication
  • ✅ Improves maintainability
  • ✅ Follows DRY principle
  • ✅ Makes tests more readable

The rename from RethSetup to RethSetupConfig is more idiomatic Go (config structs should have "Config" suffix).


8. Good Test Structure

Location: test/docker-e2e/compat_test.go:45-92

Positive: The test is well-organized using subtests with descriptive names:

s.Run("setup_celestia_and_da", ...)
s.Run("setup_sequencer", ...)
s.Run("submit_transactions", ...)
s.Run("setup_fullnode_and_sync", ...)
s.Run("verify_sync", ...)

This makes test output clear and enables targeted re-runs. Follows best practices from CLAUDE.md: "Use table-driven tests where appropriate."


9. P2P Address Construction is Fragile (Priority: Low)

Location: test/docker-e2e/compat_test.go:172-189

Issue: The getSequencerP2PAddress function manually constructs a libp2p multiaddr by reading node_key.json, unmarshaling, and reconstructing the peer ID. This is tightly coupled to the node key format.

Observation: This is acceptable for tests, but consider whether ev-node has an API endpoint that returns the P2P multiaddr directly, which would be more maintainable.

Recommendation: If a direct API exists, use it:

// Check if ev-node exposes /net/info or similar endpoint
addr, err := s.sequencerNode.GetP2PAddress(ctx) // hypothetical

Security

10. Hardcoded Test Credentials ✅ (Priority: Informational)

Location: test/docker-e2e/docker_test.go:44

evmTestPrivateKey = "cece4f25ac74deb1468965160c7185e07dff413f23fcadb611b05ca37ab0a52e"

Status: Acceptable for test-only code since it's clearly marked as test credentials and not used in production. This follows CLAUDE.md's guidance to avoid exposing private keys in production, but test fixtures are exempt.


Performance & Resource Efficiency

11. Efficient Test Parallelization Opportunity (Priority: Low)

Location: .github/workflows/docker-tests.yml

Observation: The three Docker test jobs (docker-tests, docker-upgrade-tests, test-docker-compat) run sequentially but could run in parallel since they use different test filters and images.

Recommendation: Consider adding a needs: [] to allow parallel execution if CI resources permit:

jobs:
  docker-tests:
    # ... existing config
  
  docker-upgrade-tests:
    needs: []  # Run in parallel with docker-tests
    # ... existing config
  
  test-docker-compat:
    needs: []  # Run in parallel with other jobs
    # ... existing config

This would reduce CI runtime but increase resource usage.


Testing & Reliability

12. Good Coverage of Sync Scenarios

Location: test/docker-e2e/compat_test.go:82-91

Positive: The test validates both historical sync (transactions before full node starts) AND ongoing sync (transactions after full node joins):

s.Run("verify_sync", func() {
    // Verify pre-sync transactions
    s.verifyTransactionsOnFullNode(ctx, preSyncTxHashes)
    
    // Submit more and verify ongoing sync
    postSyncTxHashes := s.submitTransactions(ctx, 5)
    latestHeight, err := s.sequencerClient.BlockNumber(ctx)
    s.Require().NoError(err)
    s.waitForSync(ctx, latestHeight)
    s.verifyTransactionsOnFullNode(ctx, postSyncTxHashes)
})

This comprehensive approach catches both historical state sync and live block processing issues.


13. Timeout Values Should Be Configurable (Priority: Low)

Location: Multiple locations

Observation: Timeouts are hardcoded throughout:

  • test/docker-e2e/compat_test.go:208: 120*time.Second
  • test/docker-e2e/docker_test.go:296: 60*time.Second
  • test/docker-e2e/docker_test.go:337: 120*time.Second

Recommendation: Consider extracting to constants or environment variables for CI environments with different performance characteristics:

const (
    defaultNodeStartTimeout  = 60 * time.Second
    defaultSyncTimeout       = 120 * time.Second
    defaultHealthCheckTimeout = 120 * time.Second
)

Documentation & Developer Experience

14. Missing Function Documentation (Priority: Low)

Location: Multiple helper functions

Issue: Several exported test helpers lack godoc comments:

  • SetupRethNode (line 271)
  • SetupCelestiaAndDABridge (line 308)
  • SetupEthClient (line 349)
  • WaitForTxIncluded (line 361)

Recommendation: Add documentation following CLAUDE.md: "Document exported types and functions":

// SetupRethNode creates, starts, and waits for a Reth execution client.
// Returns RethSetupConfig containing URLs, JWT secret, and genesis hash.
func (s *DockerTestSuite) SetupRethNode(ctx context.Context, name string) RethSetupConfig {

Dependency Updates Review

15. Go Version Bump

Multiple go.mod files: go 1.25.0go 1.25.6

Status: Good. Patch version updates typically include bug fixes and security patches.


16. Dependency Version Updates

Key updates in test/docker-e2e/go.mod:

  • github.com/celestiaorg/tastorav0.10.0
  • github.com/libp2p/go-libp2p/corev0.43.0-rc2 (new dependency)
  • github.com/docker/dockerv28.5.2+incompatible

Analysis:

  • ✅ Tastora update aligns with the new test framework functionality
  • ✅ libp2p/core import is now explicit (good for dependency clarity)
  • ⚠️ Docker SDK major version jump (v27 → v28) - appears to be working based on passing CI

Recommendation: Verify Docker SDK API compatibility if any test failures occur.


Additional Observations

17. Build Tags Are Correct

Location: test/docker-e2e/compat_test.go:1

//go:build docker_e2e && evm

Properly restricts test to Docker E2E runs with EVM support, preventing accidental execution in unit test runs.


18. Image Configuration Pattern

Location: test/docker-e2e/compat_test.go:228-250

The getSequencerImage() and getFullNodeImage() functions follow the existing pattern from getEvNodeImage() in docker_test.go, using environment variables with sensible defaults. This is good for CI flexibility.


Summary Recommendations by Priority

Must Fix (High Priority)

  1. Replace fixed sleep with conditional wait for DA propagation (line 68)
  2. Add context timeout to prevent goroutine leaks (line 46)
  3. Update TODO in workflow to use last released tag (workflow line 68)

Should Fix (Medium Priority)

  1. Add descriptive error messages to all Require().NoError() calls
  2. Implement TearDownSuite() for proper test cleanup
  3. Document nonce management threading assumptions

Nice to Have (Low Priority)

  1. Extract timeout constants for configurability
  2. Add godoc comments to exported test helpers
  3. Consider P2P address API instead of manual construction
  4. Enable parallel CI job execution if resources permit

Overall Assessment

Code Quality: ⭐⭐⭐⭐ (4/5) - Well-structured, good refactoring, solid test coverage

Production Readiness: ⭐⭐⭐ (3/5) - Needs fixes for test flakiness and resource cleanup

Adherence to Guidelines: ⭐⭐⭐⭐ (4/5) - Follows most CLAUDE.md patterns, minor gaps in error handling

Risk Level: Low-Medium - The main risks are test flakiness (fixed sleep) and resource leaks (no cleanup), but these don't affect production code.


This is a solid PR that adds valuable compatibility testing infrastructure. The refactoring work is excellent and will benefit future test development. Addressing the high-priority items will make the tests more reliable and maintainable.

@github-actions
Copy link
Contributor

github-actions bot commented Jan 28, 2026

The latest Buf updates on your PR. Results from workflow CI / buf-check (pull_request).

BuildFormatLintBreakingUpdated (UTC)
✅ passed⏩ skipped✅ passed✅ passedJan 29, 2026, 2:20 PM

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @chatton, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the testing infrastructure by introducing a robust Docker E2E compatibility test. This new test ensures that different versions of the ev-node-evm can seamlessly synchronize, specifically verifying that a newer full node can correctly process and validate data from an older sequencer. The changes also include refactoring of test utilities to promote cleaner, more maintainable code and updating core dependencies to their latest versions, contributing to overall system stability and future compatibility.

Highlights

  • New Compatibility Test Suite: A new Docker E2E test suite, EVMCompatTestSuite, has been introduced to verify cross-version syncing capabilities of the ev-node-evm. This test deploys an older sequencer image, executes transactions, and then deploys the current PR image as a full node to ensure it can correctly sync and validate the historical transactions.
  • Docker E2E Test Utilities Refactoring: Existing Docker E2E test utilities have been refactored for improved reusability and clarity. This includes defining shared EVM test constants, renaming RethSetup to RethSetupConfig, and introducing EVMSingleSetupConfig and SetupEVMSingle helper functions to streamline node setup.
  • Dependency and Go Version Updates: The Go version has been updated from 1.25.0 to 1.25.6, and several Go module dependencies, including github.com/fatih/color, github.com/google/flatbuffers, github.com/hashicorp/go-hclog, and github.com/celestiaorg/tastora, have been updated to newer versions. New dependencies like github.com/libp2p/go-libp2p/core were also added.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/docker-tests.yml
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a valuable compatibility test suite to ensure syncing works across different node versions. The new EVMCompatTestSuite effectively simulates a real-world scenario where a new full node syncs from a sequencer running a different version. The changes also include beneficial refactoring of the existing test suite, such as centralizing test constants and creating reusable helper functions for setting up test nodes. My review includes one suggestion to improve the robustness of the new test by replacing a fixed sleep with a conditional wait.

Comment on lines +68 to +71
time.Sleep(5 * time.Second)

sequencerHeight, err := s.sequencerClient.BlockNumber(ctx)
s.Require().NoError(err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using a fixed time.Sleep can introduce flakiness into tests, as the time required for an operation can vary between test runs. A more robust approach is to wait for a specific condition. Here, you're waiting for blocks to be posted to the DA. While directly querying the DA might be complex, you can achieve a more reliable wait by ensuring a few more blocks are produced by the sequencer. This provides a time window for DA propagation that is tied to block production rather than a fixed duration.

s.T().Log("Waiting for a few blocks to allow for DA propagation...")
	lastTxHeight, err := s.sequencerClient.BlockNumber(ctx)
	s.Require().NoError(err)
	const daPropagationBlocks = 3
	targetHeight := lastTxHeight + daPropagationBlocks
	s.Require().Eventually(func() bool {
		h, err := s.sequencerClient.BlockNumber(ctx)
		return err == nil && h >= targetHeight
	}, 15*time.Second, time.Second, "sequencer did not produce new blocks for DA propagation")

	sequencerHeight, err := s.sequencerClient.BlockNumber(ctx)
	s.Require().NoError(err)

@codecov
Copy link

codecov bot commented Jan 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 55.51%. Comparing base (bcd3375) to head (f0fdb39).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3023      +/-   ##
==========================================
- Coverage   55.54%   55.51%   -0.04%     
==========================================
  Files         116      116              
  Lines       11507    11507              
==========================================
- Hits         6392     6388       -4     
- Misses       4413     4417       +4     
  Partials      702      702              
Flag Coverage Δ
combined 55.51% <ø> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@chatton chatton marked this pull request as ready for review January 29, 2026 14:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[E2E] Sync Compatibility Test

3 participants