
feat(gc): add sync protection mechanism for cross-cluster sync #23675

Open
LeftHandCold wants to merge 19 commits into matrixorigin:main from LeftHandCold:sync_protection

Conversation

@LeftHandCold LeftHandCold commented Feb 4, 2026

User description

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #23525

What this PR does / why we need it:

This commit implements a sync protection mechanism to prevent GC from
deleting files that are being synchronized across clusters.

Key changes:

  • Add SyncProtectionManager to manage sync protection entries with
    BloomFilter-based file matching
  • Use index.BloomFilter (xorfilter-based, deterministic) for consistent
    cross-process file matching
  • Integrate sync protection into GC pipeline via MakeBloomfilterCoarseFilter
    so protected files stay in filesNotGC
  • Add mo_ctl handlers: register_sync_protection, renew_sync_protection,
    unregister_sync_protection
  • Add dedicated error codes for sync protection (20642-20647)
  • Add BVT test case for mo_ctl sync protection commands
  • Add mo-tool gc sync-protection command for integration testing

Protection workflow:

  1. Sync job registers protection with BloomFilter before sync starts
  2. GC checks each file against BloomFilter, protected files skip deletion
  3. Sync job unregisters (soft delete) after sync completes
  4. Soft-deleted protections cleaned up when scanWaterMark > validTS
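The four steps above can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not the PR's implementation: the `manager` and `protection` types and their methods are hypothetical, a plain set of file names stands in for the BloomFilter, timestamps are plain `int64` values, and the sketch assumes that a soft-deleted entry keeps protecting its files until cleanup removes it.

```go
package main

import (
	"errors"
	"fmt"
)

// protection is a hypothetical stand-in for one sync protection entry.
type protection struct {
	files       map[string]struct{} // stand-in for the BloomFilter
	validTS     int64               // timestamp the protection is valid up to
	softDeleted bool                // set by unregister, cleared only by cleanup
}

// manager is a simplified, single-goroutine stand-in for SyncProtectionManager.
type manager struct {
	entries map[string]*protection
}

func newManager() *manager {
	return &manager{entries: make(map[string]*protection)}
}

// register stores a protection for jobID before the sync starts (step 1).
func (m *manager) register(jobID string, files []string, validTS int64) error {
	if _, ok := m.entries[jobID]; ok {
		return errors.New("sync protection already exists")
	}
	p := &protection{files: make(map[string]struct{}, len(files)), validTS: validTS}
	for _, f := range files {
		p.files[f] = struct{}{}
	}
	m.entries[jobID] = p
	return nil
}

// isProtected is what GC would ask for each candidate file (step 2).
// Assumption: soft-deleted entries still count until cleanup drops them.
func (m *manager) isProtected(file string) bool {
	for _, p := range m.entries {
		if _, ok := p.files[file]; ok {
			return true
		}
	}
	return false
}

// unregister only marks the entry as soft-deleted (step 3).
func (m *manager) unregister(jobID string) error {
	p, ok := m.entries[jobID]
	if !ok {
		return errors.New("sync protection not found")
	}
	p.softDeleted = true
	return nil
}

// cleanup drops soft-deleted entries once scanWaterMark > validTS (step 4)
// and returns how many entries were removed.
func (m *manager) cleanup(scanWaterMark int64) int {
	removed := 0
	for id, p := range m.entries {
		if p.softDeleted && scanWaterMark > p.validTS {
			delete(m.entries, id)
			removed++
		}
	}
	return removed
}

func main() {
	m := newManager()
	_ = m.register("job-1", []string{"obj-a"}, 100)
	fmt.Println(m.isProtected("obj-a")) // true
	_ = m.unregister("job-1")
	fmt.Println(m.isProtected("obj-a")) // true: soft-deleted, not yet cleaned up
	fmt.Println(m.cleanup(101))         // 1
	fmt.Println(m.isProtected("obj-a")) // false
}
```

Separating unregister from cleanup is what lets the checkpoint record the file state before the protection disappears.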

Safety features:

  • Block new registrations while GC is running
  • TTL-based cleanup for crashed sync jobs (default 20 minutes)
  • Max protection count limit (default 100)
  • Soft delete ensures checkpoint records file state before cleanup

PR Type

Enhancement, Tests


Description

  • Implements a comprehensive sync protection mechanism to prevent GC from deleting files being synchronized across clusters

  • Core SyncProtectionManager uses index.BloomFilter (xorfilter-based, deterministic) for cross-process file matching

  • Integrates sync protection into GC pipeline via MakeBloomfilterCoarseFilter to keep protected files in filesNotGC list

  • Adds three mo_ctl command handlers: RegisterSyncProtection, RenewSyncProtection, UnregisterSyncProtection with JSON request parsing

  • Defines six new error codes (20642-20647) for sync protection operations: ErrGCIsRunning, ErrSyncProtectionNotFound, ErrSyncProtectionExists, ErrSyncProtectionMaxCount, ErrSyncProtectionSoftDelete, ErrSyncProtectionInvalid

  • Implements protection workflow: register before sync → GC checks BloomFilter → unregister after sync → cleanup soft-deleted entries when checkpoint watermark exceeds validTS

  • Includes safety features: blocks new registrations during GC execution, TTL-based cleanup for crashed jobs (20 min default), max protection count limit (100 default)

  • Adds comprehensive unit tests (20+ test cases) covering registration, renewal, unregistration, cleanup, concurrent access, and edge cases

  • Includes BVT test suite validating mo_ctl command error handling and graceful failures

  • Provides CLI testing tool for end-to-end validation of sync protection mechanism


Diagram Walkthrough

flowchart LR
  A["Sync Job"] -->|register_sync_protection| B["SyncProtectionManager"]
  B -->|store BloomFilter| C["Protection Registry"]
  D["GC Pipeline"] -->|check file| B
  B -->|BloomFilter match| E["filesNotGC"]
  A -->|unregister_sync_protection| B
  B -->|soft delete| F["Cleanup Queue"]
  D -->|checkpoint watermark| F
  F -->|cleanup expired| G["Delete Protection"]

File Walkthrough

Relevant files
Code generation
1 file
operations.pb.go
Protobuf code generation for SyncProtection message type 

pkg/vm/engine/cmd_util/operations.pb.go

  • Added protobuf message methods for SyncProtection struct (Reset,
    String, Marshal, Unmarshal, etc.)
  • Implemented serialization/deserialization methods for sync protection
    protocol buffer messages
  • Updated file descriptor with new SyncProtection message type
    registration
  • Added getter methods for SyncProtection fields: Op, JobID, Objects,
    ValidTS
+383/-76
Tests
5 files
sync_protection_test.go
Unit tests for sync protection manager functionality         

pkg/vm/engine/tae/db/gc/v3/sync_protection_test.go

  • Comprehensive test suite for SyncProtectionManager with 20+ test cases
  • Tests cover registration, renewal, unregistration, cleanup, and file
    filtering operations
  • Validates BloomFilter-based protection mechanism using
    index.BloomFilter (xorfilter)
  • Tests edge cases: concurrent access, max count limits, TTL expiration,
    soft delete cleanup
+491/-0 
main.go
Create GC testing tool for sync protection                             

pkg/vm/engine/tae/db/gc/v3/tool/main.go

  • Created new GC testing tool with cobra CLI framework
  • Implemented main entry point for gc-tool command
  • Added PrepareSyncProtectionCommand() subcommand for sync protection
    testing
+35/-0   
mock_cleaner.go
Implement sync protection method in mock cleaner                 

pkg/vm/engine/tae/db/gc/v3/mock_cleaner.go

  • Implemented GetSyncProtectionManager() method in MockCleaner
  • Returns nil for mock implementation
+4/-0     
mo_ctl_sync_protection.test
Add BVT tests for sync protection commands                             

test/distributed/cases/function/mo_ctl/mo_ctl_sync_protection.test

  • Created BVT test file with seven test cases for sync protection mo_ctl
    commands
  • Tests cover invalid JSON, missing fields, invalid base64, non-existent
    protections, empty commands, and unknown operations
  • Validates error handling and graceful failure scenarios
+23/-0   
mo_ctl_sync_protection.result
Add expected results for sync protection tests                     

test/distributed/cases/function/mo_ctl/mo_ctl_sync_protection.result

  • Added expected test results for all seven sync protection test cases
  • Includes error messages for invalid arguments, missing fields,
    non-existent protections, and invalid operations
+14/-0   
Enhancement
11 files
sync_protection.go
CLI tool for sync protection testing and validation           

pkg/vm/engine/tae/db/gc/v3/tool/sync_protection.go

  • Command-line tool for testing sync protection mechanism end-to-end
  • Scans object files, builds BloomFilter, registers/renews/unregisters
    protections
  • Triggers GC and verifies protected files are not deleted
  • Provides verbose output and configurable parameters (sample count,
    wait time, DSN)
+501/-0 
sync_protection.go
Sync protection manager with BloomFilter-based file matching

pkg/vm/engine/tae/db/gc/v3/sync_protection.go

  • Core SyncProtectionManager implementation managing sync protection
    entries
  • Uses index.BloomFilter (xorfilter-based, deterministic) for
    cross-process file matching
  • Implements registration, renewal, unregistration (soft delete), and
    cleanup operations
  • Provides file filtering to prevent GC deletion of protected objects
+376/-0 
checkpoint.go
Integration of sync protection into GC checkpoint cleaner

pkg/vm/engine/tae/db/gc/v3/checkpoint.go

  • Integrated SyncProtectionManager into checkpointCleaner struct
  • Added sync protection filtering before file deletion in GC pipeline
  • Implemented cleanup of soft-deleted protections when checkpoint
    watermark exceeds validTS
  • Set GC running state to block new registrations during GC execution
+59/-0   
exec_v1.go
Sync protection integration into GC job execution               

pkg/vm/engine/tae/db/gc/v3/exec_v1.go

  • Added syncProtection field to CheckpointBasedGCJob struct
  • Passed SyncProtectionManager through GC job creation and execution
    pipeline
  • Modified MakeBloomfilterCoarseFilter to check protected files and skip
    marking them for GC
  • Protected files remain in filesNotGC list, preventing their deletion
+33/-19 
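The effect of the coarse filter change can be illustrated with a simplified partition function. This is a sketch only: `coarseFilter` and its predicate parameters are hypothetical and do not mirror the signature of MakeBloomfilterCoarseFilter; the point is that a protected file is routed to the keep list exactly like a file that is still referenced.

```go
package main

import "fmt"

// coarseFilter splits candidate files into those GC may delete and those it
// must keep. Both predicates are hypothetical stand-ins: `referenced` for the
// existing coarse filter check, `protected` for the sync protection check.
func coarseFilter(
	candidates []string,
	referenced func(string) bool,
	protected func(string) bool,
) (filesGC, filesNotGC []string) {
	for _, f := range candidates {
		if referenced(f) || protected(f) {
			// Kept files stay in filesNotGC, so they are still recorded
			// and can be collected by a later GC run.
			filesNotGC = append(filesNotGC, f)
			continue
		}
		filesGC = append(filesGC, f)
	}
	return filesGC, filesNotGC
}

func main() {
	referenced := func(f string) bool { return f == "obj-live" }
	protected := func(f string) bool { return f == "obj-syncing" }

	gc, keep := coarseFilter(
		[]string{"obj-live", "obj-syncing", "obj-dead"}, referenced, protected)
	fmt.Println(gc)   // [obj-dead]
	fmt.Println(keep) // [obj-live obj-syncing]
}
```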
handle_debug.go
Add sync protection mo_ctl command handlers                           

pkg/vm/engine/tae/rpc/handle_debug.go

  • Added encoding/json import for JSON unmarshaling
  • Implemented three new mo_ctl handlers: RegisterSyncProtection,
    RenewSyncProtection, and UnregisterSyncProtection
  • Each handler parses JSON request containing job_id, BloomFilter data,
    and valid timestamp
  • Handlers delegate to SyncProtectionManager methods and return JSON
    status responses
+66/-0   
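As an illustration of the request parsing described for these handlers, the sketch below decodes a JSON request and its base64-encoded filter bytes. Only the `job_id` key is named in this PR description; the `bf` and `valid_ts` keys, the `syncProtectionRequest` type, and the `parseRequest` helper are assumptions made for the example.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
)

// syncProtectionRequest is a hypothetical request shape.
type syncProtectionRequest struct {
	JobID   string `json:"job_id"`
	BF      string `json:"bf"`       // assumed key: base64-encoded BloomFilter
	ValidTS int64  `json:"valid_ts"` // assumed key: timestamp in nanoseconds
}

// parseRequest validates the JSON payload and decodes the filter bytes.
func parseRequest(raw string) (jobID string, filter []byte, validTS int64, err error) {
	var req syncProtectionRequest
	if err = json.Unmarshal([]byte(raw), &req); err != nil {
		return "", nil, 0, fmt.Errorf("invalid JSON: %w", err)
	}
	if req.JobID == "" {
		return "", nil, 0, errors.New("missing job_id")
	}
	filter, err = base64.StdEncoding.DecodeString(req.BF)
	if err != nil {
		return "", nil, 0, fmt.Errorf("invalid base64: %w", err)
	}
	return req.JobID, filter, req.ValidTS, nil
}

func main() {
	id, bf, ts, err := parseRequest(`{"job_id":"job-1","bf":"aGVsbG8=","valid_ts":42}`)
	fmt.Println(id, string(bf), ts, err) // job-1 hello 42 <nil>

	_, _, _, err = parseRequest(`{"bf":"aGVsbG8="}`)
	fmt.Println(err) // missing job_id
}
```

The failure paths here correspond to the invalid JSON, missing field, and invalid base64 cases that the BVT tests are described as covering.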
cmd_disk_cleaner.go
Add sync protection command parsing and validation             

pkg/sql/plan/function/ctl/cmd_disk_cleaner.go

  • Added logutil and zap imports for logging
  • Modified parameter validation to allow sync protection operations
    without strict length limits
  • Added handling for RegisterSyncProtection, RenewSyncProtection, and
    UnregisterSyncProtection operations
  • Implemented JSON value parsing by joining dot-separated parameters to
    handle JSON with embedded dots
  • Added debug logging for sync protection command parsing
+30/-2   
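The dot-joining idea can be shown in isolation. The sketch assumes a parameter string of the form `operation.value`, which is an illustrative format rather than the exact one used by cmd_disk_cleaner.go, and `splitOpAndValue` is a hypothetical helper name. Splitting on every dot would cut a JSON value that contains dots into pieces, so everything after the first part is joined back together.

```go
package main

import (
	"fmt"
	"strings"
)

// splitOpAndValue treats the first dot-separated part as the operation and
// re-joins the remaining parts so that dots inside the JSON value survive.
func splitOpAndValue(param string) (op, value string) {
	parts := strings.Split(param, ".")
	op = parts[0]
	value = strings.Join(parts[1:], ".")
	return op, value
}

func main() {
	op, value := splitOpAndValue(`register_sync_protection.{"job_id":"job-1.2","valid_ts":1.5}`)
	fmt.Println(op)    // register_sync_protection
	fmt.Println(value) // {"job_id":"job-1.2","valid_ts":1.5}
}
```

`strings.SplitN(param, ".", 2)` would give the same result in one step; the split-then-join form is shown because it matches the description above.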
window.go
Integrate sync protection into GC window execution             

pkg/vm/engine/tae/db/gc/v3/window.go

  • Added syncProtection parameter of type *SyncProtectionManager to
    ExecuteGlobalCheckpointBasedGC method
  • Passed syncProtection parameter through to downstream function calls
+2/-0     
operations.go
Define SyncProtection request structure                                   

pkg/vm/engine/cmd_util/operations.go

  • Added new SyncProtection struct for sync protection requests
  • Defined fields: JobID (sync job identifier), BF (base64-encoded
    BloomFilter), ValidTS (timestamp in nanoseconds), TestObject (optional
    debugging field)
+8/-0     
type.go
Add sync protection operation constants                                   

pkg/vm/engine/cmd_util/type.go

  • Added three new operation constants for sync protection:
    RegisterSyncProtection, RenewSyncProtection, UnregisterSyncProtection
+5/-0     
types.go
Add sync protection manager interface method                         

pkg/vm/engine/tae/db/gc/v3/types.go

  • Added GetSyncProtectionManager() method to Cleaner interface
  • Returns pointer to SyncProtectionManager for accessing sync protection
    functionality
+3/-0     
operations.proto
Define SyncProtection protobuf message                                     

pkg/vm/engine/cmd_util/operations.proto

  • Added SyncProtection protobuf message definition
  • Defined fields: Op (operation type), JobID (sync job identifier),
    Objects (repeated protected object names), ValidTS (valid timestamp)
+8/-0     
Error handling
2 files
error_no_ctx.go
Error definitions for sync protection operations                 

pkg/common/moerr/error_no_ctx.go

  • Added six new error constructors for sync protection: ErrGCIsRunning,
    ErrSyncProtectionNotFound, ErrSyncProtectionExists,
    ErrSyncProtectionMaxCount, ErrSyncProtectionSoftDelete,
    ErrSyncProtectionInvalid
  • Error codes cover all sync protection failure scenarios
  • Follows existing error handling pattern with context-less constructors
+25/-0   
error.go
Define sync protection error codes and messages                   

pkg/common/moerr/error.go

  • Added six new error codes (20642-20647) for sync protection operations
  • Defined error types: ErrGCIsRunning, ErrSyncProtectionNotFound,
    ErrSyncProtectionExists, ErrSyncProtectionMaxCount,
    ErrSyncProtectionSoftDelete, ErrSyncProtectionInvalid
  • Added corresponding error messages in errorMsgRefer map with
    descriptive text
+16/-0   

This commit implements a sync protection mechanism to prevent GC from
deleting files that are being synchronized across clusters.

Key changes:
- Add SyncProtectionManager to manage sync protection entries
- Use index.BloomFilter (xorfilter-based, deterministic) instead of
  bloomfilter.BloomFilter (wyhash-based, non-deterministic)
- Integrate sync protection check into MakeBloomfilterCoarseFilter
  so protected files stay in filesNotGC (recorded in GC metadata)
- Add mo_ctl handlers for register/renew/unregister sync protection
- Add mo-tool sync-protection command for testing

The protection is applied at the coarse filter stage to ensure:
1. Protected files are recorded in GC window metadata
2. Protected files are not deleted during GC
3. After protection is released, files can be GC'd normally

Labels

kind/feature Review effort 4/5 size/XL Denotes a PR that changes [1000, 1999] lines
