@winterhazel
Member

Description

The secondary storage selectors allow operators to specify, for instance, that volumes should go to a specific secondary storage A. Thus, when uploading a volume, it will always be downloaded to secondary storage A.

Cold volume migration first moves the volume to a secondary storage and then to the destination primary storage. This process currently ignores the secondary storage selectors. However, some companies want to dedicate specific secondary storages to cold migration.

To address this, this PR makes the cold volume migration process consider the secondary storage selectors.
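The intended behaviour can be illustrated with a minimal sketch. This is not the code in `AncientDataMotionStrategy`; the `ImageStore` class, the `Selector` type, and `pick_staging_store` are hypothetical names, and the capacities are made-up illustrative values. It assumes a selector either names a store by UUID or returns nothing, in which case the store with the most free capacity is used. Falling back when the selector names an unknown store is also an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class ImageStore:
    """Hypothetical stand-in for a secondary storage (image store)."""
    uuid: str
    total_bytes: int
    used_bytes: int

    @property
    def free_bytes(self) -> int:
        return self.total_bytes - self.used_bytes


# Hypothetical selector type: returns the UUID of the chosen store, or None.
Selector = Callable[[Sequence[ImageStore]], Optional[str]]


def pick_staging_store(stores: Sequence[ImageStore],
                       selector: Optional[Selector] = None) -> ImageStore:
    """Return the store a cold migration would stage the volume on."""
    if not stores:
        raise ValueError("no secondary storage available")
    if selector is not None:
        chosen_uuid = selector(stores)
        for store in stores:
            if store.uuid == chosen_uuid:
                return store
    # Fallback: most free capacity. max() keeps the first store on ties.
    return max(stores, key=lambda s: s.free_bytes)


# Illustrative values only: store-b has more free capacity than store-a.
stores = [
    ImageStore("store-a", total_bytes=100, used_bytes=40),
    ImageStore("store-b", total_bytes=100, used_bytes=10),
]

print(pick_staging_store(stores).uuid)                        # store-b
print(pick_staging_store(stores, lambda _: "store-a").uuid)   # store-a
```

Without a selector the sketch picks the store with the most free capacity; with a selector that names `store-a`, that store is used regardless of capacity.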

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

How Has This Been Tested?

  1. Without any secondary storage selector, I began the cold migration of a volume. I validated that the secondary storage with the most free capacity was used for the migration.

  2. I created a secondary storage selector directing volumes to a specific secondary storage, and began the cold migration of another volume. I validated that the specified secondary storage was used for the migration.

@winterhazel winterhazel changed the title from "Consider secondary storage selectors during template synchronization" to "Consider secondary storage selectors during cold volume migration" Jun 4, 2025
@winterhazel
Member Author

@blueorangutan package

@blueorangutan

@winterhazel a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@codecov

codecov bot commented Jun 4, 2025

Codecov Report

❌ Patch coverage is 61.53846% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 16.14%. Comparing base (823080c) to head (7071461).
⚠️ Report is 266 commits behind head on 4.20.

Files with missing lines | Patch % | Lines
...tack/storage/motion/AncientDataMotionStrategy.java | 0.00% | 5 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               4.20   #10957      +/-   ##
============================================
- Coverage     16.14%   16.14%   -0.01%     
- Complexity    13253    13255       +2     
============================================
  Files          5656     5656              
  Lines        497893   497897       +4     
  Branches      60374    60375       +1     
============================================
- Hits          80405    80401       -4     
- Misses       408529   408536       +7     
- Partials       8959     8960       +1     
Flag | Coverage Δ
uitests | 4.00% <ø> (ø)
unittests | 16.99% <61.53%> (-0.01%) ⬇️

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 13604

@sureshanaparti sureshanaparti added this to the 4.20.2 milestone Jun 5, 2025
@weizhouapache
Member

@blueorangutan package

@blueorangutan

@weizhouapache a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✖️ el8 ✖️ el9 ✔️ debian ✖️ suse15. SL-JID 14956

@weizhouapache
Member

@blueorangutan package

@blueorangutan

@weizhouapache a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 14969

@weizhouapache weizhouapache changed the milestone from 4.20.2 to 4.20.3 Sep 12, 2025
@DaanHoogland
Contributor

@blueorangutan package

@blueorangutan

@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@DaanHoogland
Contributor

@winterhazel, is this still relevant for you? (Do we need to push through on this?)

@winterhazel
Member Author

@DaanHoogland yup, still relevant. Would be nice having this one merged.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✖️ debian ✔️ suse15. SL-JID 16022

Contributor

@DaanHoogland DaanHoogland left a comment

clgtm

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16029

@RosiKyu
Collaborator

RosiKyu commented Jan 26, 2026

WIP

TC1: Cold Volume Migration Without Secondary Storage Selector

Objective
Verify that cold volume migration uses the secondary storage with the most free capacity when no secondary storage selector (heuristic rule) is configured for VOLUME type.

Test Steps

  1. Verified no secondary storage selector exists for VOLUME type in the zone
  2. Confirmed three secondary storages exist with equal capacity (~1.08 TB free each)
  3. Started tail on management-server.log to capture secondary storage selection
  4. Executed cold migration of test-vol-1 from primary storage pri2 to pri1
  5. Verified migration completed successfully and observed log output

Expected Result:

  • Migration should complete successfully
  • Log should show: "Secondary storage selector did not direct volume migration to a specific secondary storage; using secondary storage with the most free capacity."
  • System should select sec1 (first storage with most/equal free capacity)

Actual Result:

  • Migration completed successfully ✓
  • Volume test-vol-1 migrated from pri2 to pri1 ✓
  • Log confirmed: "Secondary storage selector did not direct volume migration to a specific secondary storage; using secondary storage with the most free capacity." ✓
  • System used sec1 (id:1, uuid: 27cfbb80-8eda-4403-8ac8-9572c7edd2b7) as staging storage ✓

Test Evidence:

Pre-migration check - No VOLUME selector exists:

(localcloud) 🐱 > list secondarystorageselectors zoneid=0ec45e01-e0b0-4fbf-a6e8-7bb81dd480e2 type=VOLUME
(localcloud) 🐱 >

Secondary storage capacity (all equal):

(localcloud) 🐱 > list imageStores zoneid=0ec45e01-e0b0-4fbf-a6e8-7bb81dd480e2
{
  "count": 3,
  "imagestore": [
    {
      "disksizetotal": 2898029182976,
      "disksizeused": 1707413078016,
      "id": "27cfbb80-8eda-4403-8ac8-9572c7edd2b7",
      "name": "NFS://10.0.32.4/acs/secondary/ref-trl-10723-k-Mol9-rositsa-kyuchukova/ref-trl-10723-k-Mol9-rositsa-kyuchukova-sec1",
      "protocol": "nfs"
    },
    {
      "disksizetotal": 2898029182976,
      "disksizeused": 1707413078016,
      "id": "6062156f-b9f3-417a-bdd9-202fc5bce258",
      "name": "NFS://10.0.32.4/acs/secondary/ref-trl-10723-k-Mol9-rositsa-kyuchukova/ref-trl-10723-k-Mol9-rositsa-kyuchukova-sec2",
      "protocol": "nfs"
    },
    {
      "disksizetotal": 2898029182976,
      "disksizeused": 1707413078016,
      "id": "910cc03a-0b37-4a4d-a3f6-968cff43c3c2",
      "name": "NFS://10.0.32.4/acs/secondary/ref-trl-10723-k-Mol9-rositsa-kyuchukova/ref-trl-10723-k-Mol9-rositsa-kyuchukova-sec3",
      "protocol": "nfs"
    }
  ]
}
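The "~1.08 TB free each" figure from the test steps can be cross-checked against this output. The sketch below is only a convenience check, not part of the PR: it copies the `disksizetotal` and `disksizeused` values from the listing (store names shortened to sec1..sec3) and assumes the reported "TB" are binary units (1024^4 bytes), which is consistent with the `total=[2.6357 TB]` value in the management server log further down.

```python
import json

# Values copied from the `list imageStores` output; names shortened for brevity.
LIST_IMAGE_STORES_OUTPUT = """
{
  "count": 3,
  "imagestore": [
    {"name": "sec1", "disksizetotal": 2898029182976, "disksizeused": 1707413078016},
    {"name": "sec2", "disksizetotal": 2898029182976, "disksizeused": 1707413078016},
    {"name": "sec3", "disksizetotal": 2898029182976, "disksizeused": 1707413078016}
  ]
}
"""

TIB = 1024 ** 4  # assumption: "TB" in the report means 1024^4 bytes


def free_tib(store: dict) -> float:
    """Free capacity of one image store, in units of 1024^4 bytes."""
    return (store["disksizetotal"] - store["disksizeused"]) / TIB


stores = json.loads(LIST_IMAGE_STORES_OUTPUT)["imagestore"]
for store in stores:
    # Each line prints the store name followed by 1.08
    print(store["name"], round(free_tib(store), 2))
```

All three stores come out equal, so the fallback to the store with the most free capacity has to break a tie.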

Migration command and result:

(localcloud) 🐱 > migrate volume volumeid=08caf0fa-ba34-43aa-addd-f415854c854a storageid=f55f0783-0351-329c-bd4f-8a9b81e4acbd
{
  "volume": {
    "id": "08caf0fa-ba34-43aa-addd-f415854c854a",
    "name": "test-vol-1",
    "state": "Ready",
    "storage": "ref-trl-10723-k-Mol9-rositsa-kyuchukova-kvm-pri1",
    "storageid": "f55f0783-0351-329c-bd4f-8a9b81e4acbd"
  }
}

Management server log showing secondary storage selection:

2026-01-26 16:59:21,409 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Work-Job-Executor-7:[ctx-ae1d605d, job-46/job-47, ctx-cd5b4539]) (logid:99a6f633) copyAsync inspecting src type VOLUME copyAsync inspecting dest type VOLUME
2026-01-26 16:59:21,409 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Work-Job-Executor-7:[ctx-ae1d605d, job-46/job-47, ctx-cd5b4539]) (logid:99a6f633) About to MIGRATE copy between datasources
2026-01-26 16:59:21,410 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Work-Job-Executor-7:[ctx-ae1d605d, job-46/job-47, ctx-cd5b4539]) (logid:99a6f633) MIGRATE copy using copyVolumeBetweenPools STARTING
2026-01-26 16:59:21,414 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Work-Job-Executor-7:[ctx-ae1d605d, job-46/job-47, ctx-cd5b4539]) (logid:99a6f633) Secondary storage selector did not direct volume migration to a specific secondary storage; using secondary storage with the most free capacity.
2026-01-26 16:59:21,417 DEBUG [c.c.s.StatsCollector] (Work-Job-Executor-7:[ctx-ae1d605d, job-46/job-47, ctx-cd5b4539]) (logid:99a6f633) Verifying image storage [ImageStore {"id":1,"name":"NFS:\/\/10.0.32.4\/acs\/secondary\/ref-trl-10723-k-Mol9-rositsa-kyuchukova\/ref-trl-10723-k-Mol9-rositsa-kyuchukova-sec1","uuid":"27cfbb80-8eda-4403-8ac8-9572c7edd2b7"}]. Capacity: total=[2.6357 TB], used=[1.5535 TB], threshold=[95.00%].
2026-01-26 16:59:22,732 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Work-Job-Executor-7:[ctx-ae1d605d, job-46/job-47, ctx-cd5b4539]) (logid:99a6f633) MIGRATE copy using copyVolumeBetweenPools DONE: true
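A check like the one done by eye on this log could be scripted. The sketch below is a hypothetical helper, not a CloudStack tool: it looks for the fallback message and then reads the UUID from the next "Verifying image storage" line. It assumes, as this test report does, that the StatsCollector line following the fallback message names the store that was picked. The two sample lines are copied verbatim from the log above.

```python
import re
from typing import Iterable, Optional

FALLBACK_MESSAGE = (
    "Secondary storage selector did not direct volume migration to a specific "
    "secondary storage; using secondary storage with the most free capacity."
)
UUID_PATTERN = re.compile(r'"uuid":"([0-9a-f-]{36})"')


def staging_store_after_fallback(log_lines: Iterable[str]) -> Optional[str]:
    """Return the UUID on the first "Verifying image storage" line that
    follows the fallback message, or None if no such line is found."""
    seen_fallback = False
    for line in log_lines:
        if FALLBACK_MESSAGE in line:
            seen_fallback = True
        elif seen_fallback and "Verifying image storage" in line:
            match = UUID_PATTERN.search(line)
            return match.group(1) if match else None
    return None


SAMPLE_LINES = [
    r'2026-01-26 16:59:21,414 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Work-Job-Executor-7:[ctx-ae1d605d, job-46/job-47, ctx-cd5b4539]) (logid:99a6f633) Secondary storage selector did not direct volume migration to a specific secondary storage; using secondary storage with the most free capacity.',
    r'2026-01-26 16:59:21,417 DEBUG [c.c.s.StatsCollector] (Work-Job-Executor-7:[ctx-ae1d605d, job-46/job-47, ctx-cd5b4539]) (logid:99a6f633) Verifying image storage [ImageStore {"id":1,"name":"NFS:\/\/10.0.32.4\/acs\/secondary\/ref-trl-10723-k-Mol9-rositsa-kyuchukova\/ref-trl-10723-k-Mol9-rositsa-kyuchukova-sec1","uuid":"27cfbb80-8eda-4403-8ac8-9572c7edd2b7"}]. Capacity: total=[2.6357 TB], used=[1.5535 TB], threshold=[95.00%].',
]

print(staging_store_after_fallback(SAMPLE_LINES))
```

A result of `None` would mean the fallback message was not logged, or no image store line followed it.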

Status: PASSED

@RosiKyu
Collaborator

RosiKyu commented Jan 26, 2026

@blueorangutan package

@blueorangutan

@RosiKyu a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16539

@winterhazel
Member Author

Hey @RosiKyu, thanks for your tests! I would just like to point out that the JS interpreter is currently not working as intended (see #12515). Hence, selectors that choose a secondary storage based on information about the volume/account/domain/existing secondary storages will not work as expected.

You can, however, test this PR by using a simple rule that directs all volumes to a specific secondary storage, for instance:

(admin) 🐱 > create secondarystorageselector name="direct volumes to secondary storage X" description="directs volumes to secondary storage X" zoneid=13b319e9-108c-4925-96aa-ae556d9a11b2 heuristicrule="'<uuid-of-secondary-storage-X>'" type=VOLUME

With this selector, all volumes will pass through secondary storage X during cold migration.

@RosiKyu
Collaborator

RosiKyu commented Jan 26, 2026

Testing of PR #10957 is blocked due to a pre-existing bug in the 4.20 branch introduced by commit 03a4b9f ("server,utils: improve js interpretation functionality"). This bug is not caused by PR #10957.

@RosiKyu
Collaborator

RosiKyu commented Jan 26, 2026

Thanks @winterhazel for the clarification! I was hitting exactly that issue - when enabling js.interpretation.enabled=true on the current 4.20 branch (required for createSecondaryStorageSelector API), the management server hangs during startup at the module loading phase. Have logged it here, before seeing your comment: #12523

Good to know PR #12515 addresses this.

I'll proceed with testing PR #10957 using the simple rule workaround you suggested:

@winterhazel
Member Author

@RosiKyu I think the issue you are facing is not related to #12515. Instead, it may be happening because the value of js.interpretation.enabled is stored encrypted in the database, but you are setting it to a plaintext value. As a result, an exception is thrown when the Management Server attempts to decrypt a value that was never encrypted.

Could you check if the following resolves your issue?

  1. Obtain the key used by the Management Server to encrypt the configurations:
     cat /etc/cloudstack/management/key
  2. Use the EncryptionCLI class from cloud-utils to encrypt the value true with this key:
     java -classpath /usr/share/cloudstack-common/lib/cloudstack-utils.jar com.cloud.utils.crypt.EncryptionCLI -p <key of the management server> -i true
  3. Update the setting to the encrypted value:
     mysql -u root -p cloud -e "UPDATE configuration SET value='<result of the previous command>' WHERE name='js.interpretation.enabled';"
  4. Restart the Management Server.

@weizhouapache
Member

Or just copy the value of the configuration "init".


6 participants