Skip to content

AGENT-1449: Add single-phase IRI registry credential rotation#5810

Open
rwsu wants to merge 4 commits intoopenshift:mainfrom
rwsu:AGENT-1449-auth-rotation-simple
Open

AGENT-1449: Add single-phase IRI registry credential rotation#5810
rwsu wants to merge 4 commits intoopenshift:mainfrom
rwsu:AGENT-1449-auth-rotation-simple

Conversation

@rwsu
Copy link
Copy Markdown

@rwsu rwsu commented Mar 26, 2026

- What I did

Implement credential rotation that accepts brief registry downtime.

When an admin updates iriAuthSecret.Data["password"], the controller:

  1. Detects the mismatch between password and htpasswd (via bcrypt compare)
  2. Generates a new bcrypt hash and updates iriAuthSecret.Data["htpasswd"]
  3. Re-renders the master MachineConfig with the new htpasswd
  4. Updates the global pull secret with the new credentials
  5. MCD rolls out the updated MC; brief downtime for IRI registry during
    rollout is accepted

Key changes:

  • Add NoneStatusAction for /etc/iri-registry/auth/htpasswd in NodeDisruptionPolicy
    (Distribution registry re-reads htpasswd on mtime change, no restart needed)
  • Add unit tests for helpers and reconcileAuthSecret
  • Add e2e tests: unauthenticated 401, authenticated 200, and full rotation flow
    (tests use ExecCmdOnNode via MCD pod to reach api-int:22625 in CI)

- How to verify it

Update the password to trigger the rotation to start:

oc -n openshift-machine-config-operator patch secret internal-release-image-registry-auth \
  --type merge -p '{"data":{"password":"'$(echo -n "new-password" | base64)'"}}'

Verify the /etc/iri-registry/auth/htpasswd has been updated.
Verify iri-registry works new credentials after rollout is complete.
Verify global pull-secret contains the new credentials after rollout is complete.

- Description for the changelog

Support credential rotation in IRI registry.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 26, 2026

Walkthrough

Adds htpasswd-based IRI registry authentication: generation/reconciliation of auth secrets, renderer/template changes to deploy the htpasswd file, merging credentials into the global pull secret, and unit/e2e tests exercising auth and rotation.

Changes

Cohort / File(s) Summary
API / Constants
pkg/apihelpers/apihelpers.go, pkg/controller/common/constants.go
Inserted default node file policy entry for /etc/iri-registry/auth/htpasswd and added exported constant InternalReleaseImageAuthSecretName.
Bootstrap plumbing
pkg/controller/bootstrap/bootstrap.go, pkg/controller/internalreleaseimage/internalreleaseimage_bootstrap.go, pkg/controller/internalreleaseimage/internalreleaseimage_bootstrap_test.go
Threaded an optional IRI auth Secret through bootstrap code and tests; feature-gated merge of IRI auth into pull secret during bootstrap.
Controller: reconcile & sync
pkg/controller/internalreleaseimage/internalreleaseimage_controller.go, pkg/controller/internalreleaseimage/internalreleaseimage_controller_test.go
Added kubeClient, watch/filter on IRI auth secret, reconcileAuthSecret to ensure htpasswd matches password, and mergeIRIAuthIntoPullSecret to update openshift-config/global-pull-secret; updated tests and added verification hooks.
Renderer & templates
pkg/controller/internalreleaseimage/internalreleaseimage_renderer.go, pkg/controller/internalreleaseimage/templates/master/.../iri-registry-auth-htpasswd.yaml, pkg/controller/internalreleaseimage/templates/master/.../usr-local-bin-load-registry-image-sh.yaml, pkg/controller/internalreleaseimage/templates/master/units/iri-registry.service.yaml
Renderer accepts iriAuthSecret and exposes .IriHtpasswd; added htpasswd file template; made unit and podman pull templates conditional on auth presence and use authfile for podman.
Pull-secret & htpasswd utils
pkg/controller/internalreleaseimage/pullsecret.go, pkg/controller/internalreleaseimage/pullsecret_test.go
New helpers: GenerateHtpasswdEntry, HtpasswdMatchesPassword, and MergeIRIAuthIntoPullSecret plus comprehensive unit tests covering merge paths and htpasswd behavior.
Test helpers / ignition checks
pkg/controller/internalreleaseimage/internalreleaseimage_helpers_test.go
Added test helpers for iriAuthSecret, pullSecret, DNS builder, verification that ignition contains htpasswd and service config, and updated verification helpers.
E2E tests
test/e2e-iri/iri_test.go
New e2e suite to validate IRI registry auth (401 unauthenticated, 200 authenticated) and credential rotation across cluster rollout.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@rwsu rwsu changed the title Agent 1449 auth rotation simple AGENT-1449: Add single-phase IRI registry credential rotation Mar 26, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 26, 2026
@openshift-ci-robot
Copy link
Copy Markdown
Contributor

openshift-ci-robot commented Mar 26, 2026

@rwsu: This pull request references AGENT-1449 which is a valid jira issue.

Details

In response to this:

- What I did

Implement credential rotation that accepts brief registry downtime.

When an admin updates iriAuthSecret.Data["password"], the controller:

  1. Detects the mismatch between password and htpasswd (via bcrypt compare)
  2. Generates a new bcrypt hash and updates iriAuthSecret.Data["htpasswd"]
  3. Re-renders the master MachineConfig with the new htpasswd
  4. Updates the global pull secret with the new credentials
  5. MCD rolls out the updated MC; brief downtime for IRI registry during
    rollout is accepted

Key changes:

  • Add NoneStatusAction for /etc/iri-registry/auth/htpasswd in NodeDisruptionPolicy
    (Distribution registry re-reads htpasswd on mtime change, no restart needed)
  • Add unit tests for helpers and reconcileAuthSecret
  • Add e2e tests: unauthenticated 401, authenticated 200, and full rotation flow
    (tests use ExecCmdOnNode via MCD pod to reach api-int:22625 in CI)

- How to verify it

Update the password to trigger the rotation to start:

oc -n openshift-machine-config-operator patch secret internal-release-image-registry-auth
--type merge -p '{"data":{"password":"'$(echo -n "new-password" | base64)'"}}'
Verify the /etc/iri-registry/auth/htpasswd has been updated.
Verify iri-registry works new credentials after rollout is complete.
Verify global pull-secret contains the new credentials after rollout is complete.

- Description for the changelog

Add credential rotation support for the IRI registry.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci bot commented Mar 26, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: rwsu
Once this PR has been reviewed and has the lgtm label, please assign pablintino for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/controller/internalreleaseimage/internalreleaseimage_controller.go`:
- Around line 478-510: The code in mergeIRIAuthIntoPullSecret reads
cconfig.Spec.DNS.Spec.BaseDomain without guarding for a nil DNS pointer; add a
nil check at the start of mergeIRIAuthIntoPullSecret to handle missing DNS
(e.g., if cconfig.Spec.DNS == nil) and return a clear error (or fallback
behavior) instead of dereferencing; update references to baseDomain to use the
validated value so the function never panics when ControllerConfig has no DNS
configured.

In `@pkg/controller/internalreleaseimage/internalreleaseimage_renderer.go`:
- Around line 121-122: r.iriAuthSecret may be nil but the code unconditionally
reads r.iriAuthSecret.Data["htpasswd"] into iriHtpasswd; add a nil guard before
that access (e.g., check if r.iriAuthSecret != nil) and handle the nil case
explicitly — either return an error from the renderer function or set
iriHtpasswd to a safe default and log/propagate the missing secret; update the
struct comment only if you change the invariant to make iriAuthSecret required.
Ensure you modify the code paths that use iriHtpasswd to handle the new
nil/empty case consistently.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 33ab74c5-20ae-4aee-8f24-35db4c8537b4

📥 Commits

Reviewing files that changed from the base of the PR and between 33e9c33 and 5bf9b3c.

⛔ Files ignored due to path filters (3)
  • vendor/golang.org/x/crypto/bcrypt/base64.go is excluded by !vendor/**, !**/vendor/**
  • vendor/golang.org/x/crypto/bcrypt/bcrypt.go is excluded by !vendor/**, !**/vendor/**
  • vendor/modules.txt is excluded by !vendor/**, !**/vendor/**
📒 Files selected for processing (15)
  • pkg/apihelpers/apihelpers.go
  • pkg/controller/bootstrap/bootstrap.go
  • pkg/controller/common/constants.go
  • pkg/controller/internalreleaseimage/internalreleaseimage_bootstrap.go
  • pkg/controller/internalreleaseimage/internalreleaseimage_bootstrap_test.go
  • pkg/controller/internalreleaseimage/internalreleaseimage_controller.go
  • pkg/controller/internalreleaseimage/internalreleaseimage_controller_test.go
  • pkg/controller/internalreleaseimage/internalreleaseimage_helpers_test.go
  • pkg/controller/internalreleaseimage/internalreleaseimage_renderer.go
  • pkg/controller/internalreleaseimage/pullsecret.go
  • pkg/controller/internalreleaseimage/pullsecret_test.go
  • pkg/controller/internalreleaseimage/templates/master/files/iri-registry-auth-htpasswd.yaml
  • pkg/controller/internalreleaseimage/templates/master/files/usr-local-bin-load-registry-image-sh.yaml
  • pkg/controller/internalreleaseimage/templates/master/units/iri-registry.service.yaml
  • test/e2e-iri/iri_test.go

Comment on lines +478 to +510
func (ctrl *Controller) mergeIRIAuthIntoPullSecret(cconfig *mcfgv1.ControllerConfig, authSecret *corev1.Secret) error {
password := string(authSecret.Data["password"])
if password == "" {
return fmt.Errorf("IRI auth secret %s/%s has empty password", authSecret.Namespace, authSecret.Name)
}

baseDomain := cconfig.Spec.DNS.Spec.BaseDomain

// Fetch current pull secret from openshift-config
pullSecret, err := ctrl.kubeClient.CoreV1().Secrets(ctrlcommon.OpenshiftConfigNamespace).Get(
context.TODO(), ctrlcommon.GlobalPullSecretName, metav1.GetOptions{})
if err != nil {
return fmt.Errorf("could not get pull-secret: %w", err)
}

mergedBytes, err := MergeIRIAuthIntoPullSecret(pullSecret.Data[corev1.DockerConfigJsonKey], password, baseDomain)
if err != nil {
return err
}

// No change needed
if bytes.Equal(mergedBytes, pullSecret.Data[corev1.DockerConfigJsonKey]) {
return nil
}

pullSecret.Data[corev1.DockerConfigJsonKey] = mergedBytes
_, err = ctrl.kubeClient.CoreV1().Secrets(ctrlcommon.OpenshiftConfigNamespace).Update(
context.TODO(), pullSecret, metav1.UpdateOptions{})
if err == nil {
klog.Infof("Updated pull secret with IRI registry auth credentials from secret %s/%s (uid=%s, resourceVersion=%s)", authSecret.Namespace, authSecret.Name, authSecret.UID, authSecret.ResourceVersion)
}
return err
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Potential nil pointer dereference on cconfig.Spec.DNS.

Line 484 accesses cconfig.Spec.DNS.Spec.BaseDomain without checking if DNS is nil. If the ControllerConfig doesn't have DNS configured, this will panic.

🛡️ Proposed fix: add nil guard
 func (ctrl *Controller) mergeIRIAuthIntoPullSecret(cconfig *mcfgv1.ControllerConfig, authSecret *corev1.Secret) error {
 	password := string(authSecret.Data["password"])
 	if password == "" {
 		return fmt.Errorf("IRI auth secret %s/%s has empty password", authSecret.Namespace, authSecret.Name)
 	}

+	if cconfig.Spec.DNS == nil {
+		return fmt.Errorf("ControllerConfig DNS not configured, cannot determine IRI registry host")
+	}
 	baseDomain := cconfig.Spec.DNS.Spec.BaseDomain
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func (ctrl *Controller) mergeIRIAuthIntoPullSecret(cconfig *mcfgv1.ControllerConfig, authSecret *corev1.Secret) error {
password := string(authSecret.Data["password"])
if password == "" {
return fmt.Errorf("IRI auth secret %s/%s has empty password", authSecret.Namespace, authSecret.Name)
}
baseDomain := cconfig.Spec.DNS.Spec.BaseDomain
// Fetch current pull secret from openshift-config
pullSecret, err := ctrl.kubeClient.CoreV1().Secrets(ctrlcommon.OpenshiftConfigNamespace).Get(
context.TODO(), ctrlcommon.GlobalPullSecretName, metav1.GetOptions{})
if err != nil {
return fmt.Errorf("could not get pull-secret: %w", err)
}
mergedBytes, err := MergeIRIAuthIntoPullSecret(pullSecret.Data[corev1.DockerConfigJsonKey], password, baseDomain)
if err != nil {
return err
}
// No change needed
if bytes.Equal(mergedBytes, pullSecret.Data[corev1.DockerConfigJsonKey]) {
return nil
}
pullSecret.Data[corev1.DockerConfigJsonKey] = mergedBytes
_, err = ctrl.kubeClient.CoreV1().Secrets(ctrlcommon.OpenshiftConfigNamespace).Update(
context.TODO(), pullSecret, metav1.UpdateOptions{})
if err == nil {
klog.Infof("Updated pull secret with IRI registry auth credentials from secret %s/%s (uid=%s, resourceVersion=%s)", authSecret.Namespace, authSecret.Name, authSecret.UID, authSecret.ResourceVersion)
}
return err
}
func (ctrl *Controller) mergeIRIAuthIntoPullSecret(cconfig *mcfgv1.ControllerConfig, authSecret *corev1.Secret) error {
password := string(authSecret.Data["password"])
if password == "" {
return fmt.Errorf("IRI auth secret %s/%s has empty password", authSecret.Namespace, authSecret.Name)
}
if cconfig.Spec.DNS == nil {
return fmt.Errorf("ControllerConfig DNS not configured, cannot determine IRI registry host")
}
baseDomain := cconfig.Spec.DNS.Spec.BaseDomain
// Fetch current pull secret from openshift-config
pullSecret, err := ctrl.kubeClient.CoreV1().Secrets(ctrlcommon.OpenshiftConfigNamespace).Get(
context.TODO(), ctrlcommon.GlobalPullSecretName, metav1.GetOptions{})
if err != nil {
return fmt.Errorf("could not get pull-secret: %w", err)
}
mergedBytes, err := MergeIRIAuthIntoPullSecret(pullSecret.Data[corev1.DockerConfigJsonKey], password, baseDomain)
if err != nil {
return err
}
// No change needed
if bytes.Equal(mergedBytes, pullSecret.Data[corev1.DockerConfigJsonKey]) {
return nil
}
pullSecret.Data[corev1.DockerConfigJsonKey] = mergedBytes
_, err = ctrl.kubeClient.CoreV1().Secrets(ctrlcommon.OpenshiftConfigNamespace).Update(
context.TODO(), pullSecret, metav1.UpdateOptions{})
if err == nil {
klog.Infof("Updated pull secret with IRI registry auth credentials from secret %s/%s (uid=%s, resourceVersion=%s)", authSecret.Namespace, authSecret.Name, authSecret.UID, authSecret.ResourceVersion)
}
return err
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/internalreleaseimage/internalreleaseimage_controller.go`
around lines 478 - 510, The code in mergeIRIAuthIntoPullSecret reads
cconfig.Spec.DNS.Spec.BaseDomain without guarding for a nil DNS pointer; add a
nil check at the start of mergeIRIAuthIntoPullSecret to handle missing DNS
(e.g., if cconfig.Spec.DNS == nil) and return a clear error (or fallback
behavior) instead of dereferencing; update references to baseDomain to use the
validated value so the function never panics when ControllerConfig has no DNS
configured.

@openshift-ci-robot
Copy link
Copy Markdown
Contributor

openshift-ci-robot commented Mar 26, 2026

@rwsu: This pull request references AGENT-1449 which is a valid jira issue.

Details

In response to this:

- What I did

Implement credential rotation that accepts brief registry downtime.

When an admin updates iriAuthSecret.Data["password"], the controller:

  1. Detects the mismatch between password and htpasswd (via bcrypt compare)
  2. Generates a new bcrypt hash and updates iriAuthSecret.Data["htpasswd"]
  3. Re-renders the master MachineConfig with the new htpasswd
  4. Updates the global pull secret with the new credentials
  5. MCD rolls out the updated MC; brief downtime for IRI registry during
    rollout is accepted

Key changes:

  • Add NoneStatusAction for /etc/iri-registry/auth/htpasswd in NodeDisruptionPolicy
    (Distribution registry re-reads htpasswd on mtime change, no restart needed)
  • Add unit tests for helpers and reconcileAuthSecret
  • Add e2e tests: unauthenticated 401, authenticated 200, and full rotation flow
    (tests use ExecCmdOnNode via MCD pod to reach api-int:22625 in CI)

- How to verify it

Update the password to trigger the rotation to start:

oc -n openshift-machine-config-operator patch secret internal-release-image-registry-auth \
 --type merge -p '{"data":{"password":"'$(echo -n "new-password" | base64)'"}}'

Verify the /etc/iri-registry/auth/htpasswd has been updated.
Verify iri-registry works new credentials after rollout is complete.
Verify global pull-secret contains the new credentials after rollout is complete.

- Description for the changelog

Add credential rotation support for the IRI registry.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot
Copy link
Copy Markdown
Contributor

openshift-ci-robot commented Mar 26, 2026

@rwsu: This pull request references AGENT-1449 which is a valid jira issue.

Details

In response to this:

- What I did

Implement credential rotation that accepts brief registry downtime.

When an admin updates iriAuthSecret.Data["password"], the controller:

  1. Detects the mismatch between password and htpasswd (via bcrypt compare)
  2. Generates a new bcrypt hash and updates iriAuthSecret.Data["htpasswd"]
  3. Re-renders the master MachineConfig with the new htpasswd
  4. Updates the global pull secret with the new credentials
  5. MCD rolls out the updated MC; brief downtime for IRI registry during
    rollout is accepted

Key changes:

  • Add NoneStatusAction for /etc/iri-registry/auth/htpasswd in NodeDisruptionPolicy
    (Distribution registry re-reads htpasswd on mtime change, no restart needed)
  • Add unit tests for helpers and reconcileAuthSecret
  • Add e2e tests: unauthenticated 401, authenticated 200, and full rotation flow
    (tests use ExecCmdOnNode via MCD pod to reach api-int:22625 in CI)

- How to verify it

Update the password to trigger the rotation to start:

oc -n openshift-machine-config-operator patch secret internal-release-image-registry-auth \
 --type merge -p '{"data":{"password":"'$(echo -n "new-password" | base64)'"}}'

Verify the /etc/iri-registry/auth/htpasswd has been updated.
Verify iri-registry works new credentials after rollout is complete.
Verify global pull-secret contains the new credentials after rollout is complete.

- Description for the changelog

Adds credential rotation support for the IRI registry.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot
Copy link
Copy Markdown
Contributor

openshift-ci-robot commented Mar 26, 2026

@rwsu: This pull request references AGENT-1449 which is a valid jira issue.

Details

In response to this:

- What I did

Implement credential rotation that accepts brief registry downtime.

When an admin updates iriAuthSecret.Data["password"], the controller:

  1. Detects the mismatch between password and htpasswd (via bcrypt compare)
  2. Generates a new bcrypt hash and updates iriAuthSecret.Data["htpasswd"]
  3. Re-renders the master MachineConfig with the new htpasswd
  4. Updates the global pull secret with the new credentials
  5. MCD rolls out the updated MC; brief downtime for IRI registry during
    rollout is accepted

Key changes:

  • Add NoneStatusAction for /etc/iri-registry/auth/htpasswd in NodeDisruptionPolicy
    (Distribution registry re-reads htpasswd on mtime change, no restart needed)
  • Add unit tests for helpers and reconcileAuthSecret
  • Add e2e tests: unauthenticated 401, authenticated 200, and full rotation flow
    (tests use ExecCmdOnNode via MCD pod to reach api-int:22625 in CI)

- How to verify it

Update the password to trigger the rotation to start:

oc -n openshift-machine-config-operator patch secret internal-release-image-registry-auth \
 --type merge -p '{"data":{"password":"'$(echo -n "new-password" | base64)'"}}'

Verify the /etc/iri-registry/auth/htpasswd has been updated.
Verify iri-registry works new credentials after rollout is complete.
Verify global pull-secret contains the new credentials after rollout is complete.

- Description for the changelog

Support credential rotation in IRI registry.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 26, 2026
@rwsu rwsu force-pushed the AGENT-1449-auth-rotation-simple branch from 5bf9b3c to 91b03d0 Compare March 27, 2026 03:31
@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 27, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
pkg/controller/internalreleaseimage/internalreleaseimage_controller.go (1)

538-544: ⚠️ Potential issue | 🟠 Major

Potential nil pointer dereference on cconfig.Spec.DNS.

Line 544 accesses cconfig.Spec.DNS.Spec.BaseDomain without checking if DNS is nil. If the ControllerConfig doesn't have DNS configured, this will panic.

 func (ctrl *Controller) mergeIRIAuthIntoPullSecret(cconfig *mcfgv1.ControllerConfig, authSecret *corev1.Secret) error {
 	password := string(authSecret.Data["password"])
 	if password == "" {
 		return fmt.Errorf("IRI auth secret %s/%s has empty password", authSecret.Namespace, authSecret.Name)
 	}

+	if cconfig.Spec.DNS == nil {
+		return fmt.Errorf("ControllerConfig DNS not configured, cannot determine IRI registry host")
+	}
 	baseDomain := cconfig.Spec.DNS.Spec.BaseDomain
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/internalreleaseimage/internalreleaseimage_controller.go`
around lines 538 - 544, In mergeIRIAuthIntoPullSecret, avoid a nil pointer
dereference by checking cconfig.Spec.DNS before accessing
cconfig.Spec.DNS.Spec.BaseDomain; if cconfig.Spec.DNS is nil, either return a
descriptive error or use a sensible default/behavior consistent with the
controller (e.g., return fmt.Errorf("ControllerConfig missing DNS
configuration") or proceed with an empty baseDomain), then use the validated
baseDomain value for the rest of the function.
pkg/controller/internalreleaseimage/internalreleaseimage_renderer.go (1)

121-122: ⚠️ Potential issue | 🟠 Major

Nil pointer dereference risk if iriAuthSecret is nil.

Line 43 documents iriAuthSecret as "may be nil", but line 121 unconditionally accesses r.iriAuthSecret.Data["htpasswd"]. The bootstrap code path can pass nil for iriAuthSecret, which would cause a panic.

Add a nil guard before accessing the secret data:

+	var iriHtpasswd string
+	if r.iriAuthSecret != nil {
+		iriHtpasswd = string(r.iriAuthSecret.Data["htpasswd"])
+	}
-	iriHtpasswd := string(r.iriAuthSecret.Data["htpasswd"])
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/internalreleaseimage/internalreleaseimage_renderer.go` around
lines 121 - 122, The code unconditionally reads r.iriAuthSecret.Data["htpasswd"]
into iriHtpasswd which can panic because r.iriAuthSecret may be nil; update the
iriHtpasswd initialization in internalreleaseimage_renderer.go (where
iriHtpasswd is set) to first check r.iriAuthSecret != nil and only read
Data["htpasswd"] when non-nil, otherwise set iriHtpasswd to an appropriate
empty/default value or handle the nil case so no nil pointer dereference occurs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@pkg/controller/internalreleaseimage/internalreleaseimage_controller.go`:
- Around line 538-544: In mergeIRIAuthIntoPullSecret, avoid a nil pointer
dereference by checking cconfig.Spec.DNS before accessing
cconfig.Spec.DNS.Spec.BaseDomain; if cconfig.Spec.DNS is nil, either return a
descriptive error or use a sensible default/behavior consistent with the
controller (e.g., return fmt.Errorf("ControllerConfig missing DNS
configuration") or proceed with an empty baseDomain), then use the validated
baseDomain value for the rest of the function.

In `@pkg/controller/internalreleaseimage/internalreleaseimage_renderer.go`:
- Around line 121-122: The code unconditionally reads
r.iriAuthSecret.Data["htpasswd"] into iriHtpasswd which can panic because
r.iriAuthSecret may be nil; update the iriHtpasswd initialization in
internalreleaseimage_renderer.go (where iriHtpasswd is set) to first check
r.iriAuthSecret != nil and only read Data["htpasswd"] when non-nil, otherwise
set iriHtpasswd to an appropriate empty/default value or handle the nil case so
no nil pointer dereference occurs.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d91576d8-1b24-47cb-816b-8cf2eb827572

📥 Commits

Reviewing files that changed from the base of the PR and between 5bf9b3c and 91b03d0.

⛔ Files ignored due to path filters (3)
  • vendor/golang.org/x/crypto/bcrypt/base64.go is excluded by !vendor/**, !**/vendor/**
  • vendor/golang.org/x/crypto/bcrypt/bcrypt.go is excluded by !vendor/**, !**/vendor/**
  • vendor/modules.txt is excluded by !vendor/**, !**/vendor/**
📒 Files selected for processing (15)
  • pkg/apihelpers/apihelpers.go
  • pkg/controller/bootstrap/bootstrap.go
  • pkg/controller/common/constants.go
  • pkg/controller/internalreleaseimage/internalreleaseimage_bootstrap.go
  • pkg/controller/internalreleaseimage/internalreleaseimage_bootstrap_test.go
  • pkg/controller/internalreleaseimage/internalreleaseimage_controller.go
  • pkg/controller/internalreleaseimage/internalreleaseimage_controller_test.go
  • pkg/controller/internalreleaseimage/internalreleaseimage_helpers_test.go
  • pkg/controller/internalreleaseimage/internalreleaseimage_renderer.go
  • pkg/controller/internalreleaseimage/pullsecret.go
  • pkg/controller/internalreleaseimage/pullsecret_test.go
  • pkg/controller/internalreleaseimage/templates/master/files/iri-registry-auth-htpasswd.yaml
  • pkg/controller/internalreleaseimage/templates/master/files/usr-local-bin-load-registry-image-sh.yaml
  • pkg/controller/internalreleaseimage/templates/master/units/iri-registry.service.yaml
  • test/e2e-iri/iri_test.go
✅ Files skipped from review due to trivial changes (4)
  • pkg/controller/internalreleaseimage/templates/master/files/iri-registry-auth-htpasswd.yaml
  • pkg/controller/common/constants.go
  • pkg/controller/internalreleaseimage/pullsecret_test.go
  • pkg/controller/internalreleaseimage/internalreleaseimage_controller_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • pkg/controller/internalreleaseimage/templates/master/files/usr-local-bin-load-registry-image-sh.yaml
  • pkg/apihelpers/apihelpers.go
  • pkg/controller/internalreleaseimage/internalreleaseimage_bootstrap.go
  • pkg/controller/internalreleaseimage/internalreleaseimage_bootstrap_test.go

rwsu added 4 commits March 27, 2026 11:25
Add htpasswd-based authentication to the IRI registry. The installer
generates credentials and provides them via a bootstrap secret. The MCO
mounts the htpasswd file into the registry container and configures
registry auth environment variables. The registry password is merged
into the node pull secret so kubelet can authenticate when pulling
the release image.

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
The IRI controller merges registry auth credentials into the global
pull secret after bootstrap. This triggers the template controller to
re-render template MCs (00-master, etc.) with the updated pull secret,
producing a different rendered MC hash than what bootstrap created.

The mismatch causes the MCD DaemonSet pod to fail during bootstrap:
it reads the bootstrap-rendered MC name from the node annotation, but
that MC no longer exists in-cluster (replaced by the re-rendered one).
The MCD falls back to reading /etc/machine-config-daemon/currentconfig,
which was never written because the firstboot MCD detected "no changes"
and skipped it. Both master nodes go Degraded and never recover.

Fix by merging IRI auth into the pull secret during bootstrap before
template MC rendering, so both bootstrap and in-cluster produce
identical rendered MC hashes.

Extract the pull secret merge logic into a shared MergeIRIAuthIntoPullSecret
function used by both the bootstrap path and the in-cluster IRI controller.

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
- bootstrap.go: gate pull secret merge on iri != nil && iriAuthSecret != nil
  (restore the IRI CR check that was lost in the hash mismatch fix, drop the
  unnecessary cconfig.Spec.DNS != nil guard); treat merge failure as an error
- controller: return an error when the IRI auth secret is missing instead of
  silently ignoring it; auth is expected to always be present
- controller: remove the iriAuthSecret != nil guard around mergeIRIAuthIntoPullSecret
- mergeIRIAuthIntoPullSecret: error on empty password; remove DNS nil check
  (DNS is always set); error on pull secret not found
- load-registry-image.sh: always pass --authfile, kubelet config.json is
  always available by the time the remote pull fallback runs
- tests: assume auth is always present -- add iriAuthSecret/pullSecret/withDNS
  to all non-deletion test cases; collapse WithAuth variants into single
  functions; remove no-auth test case

Assisted-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Implement credential rotation that accepts brief registry downtime.

When an admin updates iriAuthSecret.Data["password"], the controller:
1. Detects the mismatch between password and htpasswd (via bcrypt compare)
2. Generates a new bcrypt hash and updates iriAuthSecret.Data["htpasswd"]
3. Re-renders the master MachineConfig with the new htpasswd
4. Updates the global pull secret with the new credentials
5. MCD rolls out the updated MC; brief downtime for IRI registry during
   rollout is accepted

Key changes:
- Add NoneStatusAction for /etc/iri-registry/auth/htpasswd in NodeDisruptionPolicy
  (Distribution registry re-reads htpasswd on mtime change, no restart needed)
- Add unit tests for helpers and reconcileAuthSecret
- Add e2e tests: unauthenticated 401, authenticated 200, and full rotation flow
  (tests use ExecCmdOnNode via MCD pod to reach api-int:22625 in CI)

Assisted-by: Claude Sonnet 4.6 <noreply@anthropic.com>
@rwsu rwsu force-pushed the AGENT-1449-auth-rotation-simple branch from 91b03d0 to 199098a Compare March 27, 2026 19:55
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci bot commented Mar 28, 2026

@rwsu: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws-ovn 199098a link true /test e2e-aws-ovn

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants