NE-2561: Add Gateway API OLM to NO-OLM migration upgrade test#30897
NE-2561: Add Gateway API OLM to NO-OLM migration upgrade test#30897gcs278 wants to merge 2 commits intoopenshift:mainfrom
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: automatic mode |
|
@gcs278: This pull request references NE-2292 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Skipping CI for Draft Pull Request. |
WalkthroughThis change introduces Gateway API upgrade testing capabilities by adding a new Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.11.3)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Comment |
|
/test e2e-gcp-ovn-upgrade |
|
/test ? |
90dbe8d to
ac9d301
Compare
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: gcs278 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
/test e2e-gcp-ovn-upgrade |
b08e876 to
4df0775
Compare
|
/test e2e-gcp-ovn-upgrade |
|
Risk analysis has seen new tests most likely introduced by this PR. New Test Risks for sha: 4df0775
New tests seen in this PR at sha: 4df0775
|
|
/test ? |
|
/payload list |
|
/payload-job periodic-ci-openshift-release-main-ci-4.22-upgrade-from-stable-4.21-e2e-gcp-ovn-upgrade |
|
@gcs278: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/60c65dd0-22f9-11f1-91b0-a3e2ef55f2e7-0 |
|
Ah darn this isn't going to work because |
|
@gcs278: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/c6220410-2306-11f1-9c3a-90af7c7cc380-0 |
4df0775 to
5b67673
Compare
|
[trying again] Okay, this is a bit crazy, but I created a draft promotion PR (openshift/api#2772) so that we can test with NoOLM as default since we don't have /payload-job-with-prs periodic-ci-openshift-release-main-ci-4.22-upgrade-from-stable-4.21-e2e-gcp-ovn-upgrade openshift/api#2772 openshift/cluster-ingress-operator#1354 Additionally, this should still run for OLM to OLM, or noOLM to noOLM (it's a generic z stream upgrade test): This would be noOLM to noOLM: So there's our 3 cases:
|
|
@gcs278: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/23e6af60-2307-11f1-992d-12ede9d2e079-0 |
|
@gcs278: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/26a088fc-2307-11f1-9da4-6bca82f01e76-0 |
2c98e57 to
3b1447c
Compare
|
/payload-abort |
|
/payload-job-with-prs periodic-ci-openshift-release-main-ci-4.22-upgrade-from-stable-4.21-e2e-gcp-ovn-upgrade openshift/api#2772 openshift/cluster-ingress-operator#1393 |
|
@gcs278: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/8a761690-24c8-11f1-9c74-da2cb51d6692-0 |
|
/test e2e-gcp-ovn-upgrade |
|
/testwith openshift/origin/main/e2e-gcp-ovn-upgrade openshift/api#2772 openshift/cluster-ingress-operator#1393 |
|
Risk analysis has seen new tests most likely introduced by this PR. New Test Risks for sha: 3b1447c
New tests seen in this PR at sha: 3b1447c
|
3b1447c to
9232301
Compare
|
@gcs278: This pull request references NE-2561 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Add upgrade test validating Gateway API migration from OLM-based Istio to CIO-managed Sail Library during 4.21 to 4.22 upgrades. Setup creates Gateway/HTTPRoute with OLM provisioning and tests connectivity. Test validates migration: Gateway remains programmed, Istiod running, Istio CRDs stay OLM-managed, GatewayClass has CIO finalizer, Istio CR deleted, subscription persists. Teardown cleans up all resources.
Previously, HttpRoute test used a pod, but a pod won't survive a reboot during the upgrade process.
9232301 to
72acd6d
Compare
|
/testwith openshift/origin/main/e2e-gcp-ovn-upgrade openshift/api#2772 openshift/cluster-ingress-operator#1393 |
|
/payload-job-with-prs periodic-ci-openshift-release-main-ci-4.22-upgrade-from-stable-4.21-e2e-gcp-ovn-upgrade openshift/api#2772 openshift/cluster-ingress-operator#1393 |
|
@gcs278: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/6d5568f0-2710-11f1-9d03-561b26d4f2f7-0 |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
test/extended/router/gatewayapicontroller.go (1)
1330-1365: Consider checking pod readiness conditions instead of just phase.The current check only verifies
pod.Status.Phase == corev1.PodRunning, but a pod can be in Running phase without all containers being ready. For more robust validation, consider checking container readiness conditions.Suggested enhancement
for _, pod := range pods.Items { - if pod.Status.Phase == corev1.PodRunning { - return true, nil + if pod.Status.Phase == corev1.PodRunning { + for _, cond := range pod.Status.Conditions { + if cond.Type == corev1.PodReady && cond.Status == corev1.ConditionTrue { + return true, nil + } + } } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/extended/router/gatewayapicontroller.go` around lines 1330 - 1365, The check in checkIstiodRunning currently treats a pod as ready if pod.Status.Phase == corev1.PodRunning; update the Pod loop to instead verify container readiness (e.g., inspect pod.Status.ContainerStatuses and ensure at least one pod has all its containerStatuses[*].Ready == true or use the PodCondition Ready == True) before returning true, leaving the existing wait.PollUntilContextTimeout flow intact and keeping the same logging paths (e.g., the e2e.Logf messages) when no ready pod is found; ensure error handling for the List call remains unchanged.test/extended/router/gatewayapi_upgrade.go (1)
231-294: Cleanup order: Consider deleting Gateway before GatewayClass.The current teardown deletes GatewayClass first (line 235), then Gateway later (line 263). Typically, dependent resources (Gateway) should be deleted before the parent (GatewayClass) to allow proper finalizer processing. While this may work due to
--ignore-not-foundand polling, reversing the order would be cleaner.Suggested reordering
func (t *GatewayAPIUpgradeTest) Teardown(ctx context.Context, f *framework.Framework) { + g.By("Deleting the Gateway") + err := t.oc.AdminGatewayApiClient().GatewayV1().Gateways(ingressNamespace).Delete(ctx, t.gatewayName, metav1.DeleteOptions{}) + if err != nil && !apierrors.IsNotFound(err) { + framework.Logf("Failed to delete Gateway %q: %v", t.gatewayName, err) + } + g.By("Deleting the GatewayClass") err := t.oc.AdminGatewayApiClient().GatewayV1().GatewayClasses().Delete(ctx, gatewayClassName, metav1.DeleteOptions{}) // ... rest of cleanup ... - - g.By("Deleting the Gateway") - err = t.oc.AdminGatewayApiClient().GatewayV1().Gateways(ingressNamespace).Delete(ctx, t.gatewayName, metav1.DeleteOptions{}) - if err != nil && !apierrors.IsNotFound(err) { - framework.Logf("Failed to delete Gateway %q: %v", t.gatewayName, err) - }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/extended/router/gatewayapi_upgrade.go` around lines 231 - 294, In Teardown, delete the Gateway resource before deleting the GatewayClass: move the block that calls t.oc.AdminGatewayApiClient().GatewayV1().Gateways(ingressNamespace).Delete(ctx, t.gatewayName, metav1.DeleteOptions{}) (and its error handling) to run prior to the block that deletes GatewayClasses (the call deleting gatewayClassName), and optionally add a short wait/poll (similar to the istiod pods wait) to ensure the Gateway is fully removed before calling GatewayClasses().Delete; keep existing error handling and ignore-not-found checks for both gateway and gatewayClass.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@test/extended/router/gatewayapi_upgrade.go`:
- Around line 231-294: In Teardown, delete the Gateway resource before deleting
the GatewayClass: move the block that calls
t.oc.AdminGatewayApiClient().GatewayV1().Gateways(ingressNamespace).Delete(ctx,
t.gatewayName, metav1.DeleteOptions{}) (and its error handling) to run prior to
the block that deletes GatewayClasses (the call deleting gatewayClassName), and
optionally add a short wait/poll (similar to the istiod pods wait) to ensure the
Gateway is fully removed before calling GatewayClasses().Delete; keep existing
error handling and ignore-not-found checks for both gateway and gatewayClass.
In `@test/extended/router/gatewayapicontroller.go`:
- Around line 1330-1365: The check in checkIstiodRunning currently treats a pod
as ready if pod.Status.Phase == corev1.PodRunning; update the Pod loop to
instead verify container readiness (e.g., inspect pod.Status.ContainerStatuses
and ensure at least one pod has all its containerStatuses[*].Ready == true or
use the PodCondition Ready == True) before returning true, leaving the existing
wait.PollUntilContextTimeout flow intact and keeping the same logging paths
(e.g., the e2e.Logf messages) when no ready pod is found; ensure error handling
for the List call remains unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 1f8825ca-bbd1-47dd-bc76-38e9fec0eaa6
📒 Files selected for processing (3)
test/e2e/upgrade/upgrade.gotest/extended/router/gatewayapi_upgrade.gotest/extended/router/gatewayapicontroller.go
|
Scheduling required tests: Scheduling tests matching the |
|
@gcs278: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
Risk analysis has seen new tests most likely introduced by this PR. New tests seen in this PR at sha: 72acd6d
|
1 similar comment
|
Risk analysis has seen new tests most likely introduced by this PR. New tests seen in this PR at sha: 72acd6d
|
|
All variations of the upgrade test passed! |
|
@gcs278: This pull request references NE-2561 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@gcs278: This pull request references NE-2561 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Add upgrade test validating Gateway API migration from OLM-based Istio to CIO-managed Sail Library during 4.21 to 4.22 upgrades.
Setup creates Gateway/HTTPRoute with OLM provisioning and tests connectivity. Test validates migration: Gateway remains programmed, Istiod running, Istio CRDs stay OLM-managed, GatewayClass has CIO finalizer, Istio CR deleted, subscription persists. Teardown cleans up all resources.
Note: While this isn't strictly blocked on noOLM GA promotion (openshift/api#2772), if it merges before it, it will only test OLM --> OLM, and noOLM --> noOLM as TechPreview cannot be upgraded (upgrade CI jobs don't exist for TP).
CC: @rhamini3