WIP: Gracefully stop kube-apiserver before MCO reboot#5775

Open
saschagrunert wants to merge 1 commit into openshift:main from saschagrunert:graceful-stop-apiserver-before-reboot

@saschagrunert
Member

@saschagrunert saschagrunert commented Mar 17, 2026

- What I did

Alternative to #5708 that avoids enabling GracefulNodeShutdown (GNS).

Without GNS, kubelet exits immediately on SIGTERM without terminating pods during MCO-triggered reboots. kube-apiserver needs up to 194s for graceful shutdown but systemd's DefaultTimeoutStopSec is 90s, so it gets SIGKILLed.

The transient systemd reboot unit now gracefully stops kube-apiserver before rebooting:

  1. Query for kube-apiserver containers via crictl ps (no-op on workers)
  2. If found, stop kubelet (timeout 30 systemctl stop kubelet) to prevent static pod restarts
  3. Gracefully stop kube-apiserver containers (crictl stop --timeout 200)
  4. Proceed with systemctl reboot

Failures in steps 2 and 3 are logged via logger and do not block the reboot. The transient unit has TimeoutStartSec=300 to prevent indefinite hangs if CRI-O is stuck.
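
The timeouts quoted above can be checked with simple arithmetic. The following is a rough sketch, not part of the PR: `worst_case_seconds` and `fits_unit_timeout` are hypothetical helper names, and the sketch assumes containers are stopped one after another, so the worst case is the kubelet stop timeout plus one `crictl stop` timeout per container.

```shell
#!/bin/sh
# Hypothetical helpers for reasoning about the timeout budget; not from the PR.

# worst_case_seconds KUBELET_TIMEOUT STOP_TIMEOUT CONTAINER_COUNT
# Assumes sequential stops: kubelet timeout + one stop timeout per container.
worst_case_seconds() {
    echo $(( $1 + $2 * $3 ))
}

# fits_unit_timeout KUBELET_TIMEOUT STOP_TIMEOUT CONTAINER_COUNT UNIT_TIMEOUT
# Succeeds when the worst case stays within the unit's TimeoutStartSec.
fits_unit_timeout() {
    [ "$(worst_case_seconds "$1" "$2" "$3")" -le "$4" ]
}
```

With the values from the description (30, 200, 300) and a single kube-apiserver container, the worst case is 230s, which fits. Under the same sequential assumption, two containers would not fit.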

This works because kube-apiserver runs under watch-termination (PID 1 in the container). crictl stop sends SIGTERM to watch-termination, which forwards it to kube-apiserver. After graceful shutdown completes, watch-termination removes its lock file (/var/log/kube-apiserver/.terminating) via defer and exits. On the next startup, the absence of this file indicates graceful termination, so detection does not depend on kubelet.
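
The four steps above can be sketched as a small shell function. This is a minimal illustration under stated assumptions, not the script embedded in pkg/daemon/daemon.go: the function name `graceful_reboot` and the failure messages are hypothetical, while the `crictl` label filter and the timeouts are the ones quoted in this conversation.

```shell
#!/bin/sh
# Sketch of the pre-reboot sequence; graceful_reboot is a hypothetical name.

graceful_reboot() {
    # Step 1: list running kube-apiserver containers (empty on workers).
    containers="$(crictl ps -q --label io.kubernetes.container.name=kube-apiserver 2>/dev/null)"

    if [ -n "$containers" ]; then
        # Step 2: stop kubelet so it cannot restart the static pod.
        if ! timeout 30 systemctl stop kubelet; then
            logger "machine-config-daemon: failed to stop kubelet, continuing"
        fi

        # Step 3: send SIGTERM and allow up to 200s before the runtime kills it.
        for cid in $containers; do
            logger "machine-config-daemon: gracefully stopping kube-apiserver container $cid"
            if ! crictl stop --timeout 200 "$cid"; then
                logger "machine-config-daemon: failed to stop kube-apiserver container $cid"
            fi
        done
    fi

    # Step 4: reboot regardless of earlier failures.
    systemctl reboot
}
```

Failures in steps 2 and 3 are only logged, so the reboot in step 4 is always reached.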

- How to verify it

  • Deploy to a cluster with MCO-triggered reboots (e.g. via KubeletConfig change)
  • Run the [sig-api-machinery][Feature:APIServer][Late] kubelet terminates kube-apiserver gracefully extended test
  • Verify /var/log/kube-apiserver/.terminating is absent after reboot

- Description for the changelog

Gracefully stop kube-apiserver containers before MCO-triggered reboots to prevent non-graceful termination.

@openshift-ci-robot openshift-ci-robot added jira/severity-critical Referenced Jira bug's severity is critical for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels Mar 17, 2026
@openshift-ci-robot
Contributor

@saschagrunert: This pull request references Jira Issue OCPBUGS-75200, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.22.0) matches configured target version for branch (4.22.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

- What I did

Alternative to #5708 that avoids enabling GracefulNodeShutdown (GNS).

Without GNS configured in kubelet, kube-apiserver gets SIGKILLed during MCO-triggered reboots because systemd's DefaultTimeoutStopSec (90s) is shorter than kube-apiserver's graceful shutdown period (up to 194s). Before issuing systemctl reboot, this PR:

  1. Stops kubelet via systemctl stop kubelet (prevents it from restarting static pods)
  2. Gracefully stops kube-apiserver containers via crictl stop --timeout 200
  3. Proceeds with the reboot

This is a targeted workaround that does not require changes to kubelet configuration or enabling GNS, which has known issues with networking pods during shutdown (see SUPPORTEX-23094, kubernetes/enhancements#4565).

- How to verify it

  • Deploy to a cluster with MCO-triggered reboots (e.g. via KubeletConfig change)
  • Run [sig-api-machinery][Feature:APIServer][Late] kubelet terminates kube-apiserver gracefully extended test
  • Verify kube-apiserver containers terminate gracefully before reboot in journal logs

- Description for the changelog

Gracefully stop kube-apiserver containers before MCO-triggered reboots to prevent non-graceful termination.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai bot commented Mar 17, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f3b5eafb-b9bd-4fd5-9ccb-811157f10d51

📥 Commits

Reviewing files that changed from the base of the PR and between 96339a9 and 27e2581.

📒 Files selected for processing (3)
  • pkg/daemon/daemon.go
  • pkg/daemon/daemon_test.go
  • pkg/daemon/update.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/daemon/update.go

Walkthrough

Reboot logic in pkg/daemon now runs via a transient systemd unit that executes an embedded shell script to stop kubelet and kube-apiserver containers (crictl with timeouts) on control-plane nodes, then reboots. A minor comment was changed in pkg/daemon/update.go. A unit test for rebootCommand was added. No public APIs changed.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Reboot logic: pkg/daemon/daemon.go | Replaces direct systemd-run reboot invocation with a transient unit that runs an injected multi-line shell script. Script gracefully stops kubelet (30s attempt), stops/re-lists kube-apiserver containers via crictl with a 200s timeout, and issues systemctl reboot. Adds detailed inline comments about timing and control-plane scoping and sets TimeoutStartSec=300s. |
| Comment update: pkg/daemon/update.go | Small comment change: reference updated from "GracefulNodeShutdown" to "the system to shut down". No functional changes. |
| Tests: pkg/daemon/daemon_test.go | Adds TestRebootCommand which imports strings and validates constructed rebootCommand tokens for scenarios with/without the workaround flag, ensuring presence of systemd-run unit args and expected shell script contents (kubelet/crictl/systemctl commands). |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes


Comment @coderabbitai help to get the list of available commands and usage tips.

@saschagrunert saschagrunert force-pushed the graceful-stop-apiserver-before-reboot branch from a379c02 to ee9a19e Compare March 17, 2026 14:58
@openshift-ci openshift-ci bot requested review from cheesesashimi and djoshy March 17, 2026 14:58
@openshift-ci
Contributor

openshift-ci bot commented Mar 17, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: saschagrunert
Once this PR has been reviewed and has the lgtm label, please assign isabella-janssen for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci
Contributor

openshift-ci bot commented Mar 17, 2026

@saschagrunert: trigger 2 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • e2e-aws-ovn
  • e2e-gcp-ovn

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/34da6640-2212-11f1-82ca-7e4d0a294189-0


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/daemon/daemon.go`:
- Around line 285-292: The current logic captures CONTAINERS before stopping
kubelet, allowing a static-pod restart to create a new kube-apiserver container
that won't be stopped; after calling timeout 30 systemctl stop kubelet (in the
block around CONTAINERS and the for loop), re-query crictl (e.g., reassign
CONTAINERS="$(crictl ps -q --label io.kubernetes.container.name=kube-apiserver
2>/dev/null)") and use that refreshed CONTAINERS list for the subsequent loop
that stops containers (the logger and crictl stop --timeout 200 "$cid" calls),
ensuring any new kube-apiserver container started during the kubelet-stop window
is also stopped.
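
The re-query suggested in this finding could look roughly like the sketch below. It is an illustration only: `list_apiservers` and `stop_apiservers_refreshed` are hypothetical names, and the log messages are assumptions. The first query only decides whether the node runs kube-apiserver at all; the list that is actually stopped is fetched after kubelet is down.

```shell
#!/bin/sh
# Sketch of refreshing the container list after kubelet stops; names are hypothetical.

list_apiservers() {
    crictl ps -q --label io.kubernetes.container.name=kube-apiserver 2>/dev/null
}

stop_apiservers_refreshed() {
    # Initial query: nothing to do on nodes without kube-apiserver.
    [ -n "$(list_apiservers)" ] || return 0

    timeout 30 systemctl stop kubelet \
        || logger "machine-config-daemon: failed to stop kubelet, continuing"

    # Re-query: kubelet may have started a replacement container
    # between the first query and the moment it stopped.
    for cid in $(list_apiservers); do
        crictl stop --timeout 200 "$cid" \
            || logger "machine-config-daemon: failed to stop kube-apiserver container $cid"
    done
}
```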

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 168e9546-c887-4a2c-988f-55fc33d0b978

📥 Commits

Reviewing files that changed from the base of the PR and between b06051f and ee9a19e.

📒 Files selected for processing (2)
  • pkg/daemon/daemon.go
  • pkg/daemon/update.go

@saschagrunert saschagrunert force-pushed the graceful-stop-apiserver-before-reboot branch from ee9a19e to 96339a9 Compare March 17, 2026 15:22
@openshift-ci
Contributor

openshift-ci bot commented Mar 17, 2026

@saschagrunert: trigger 2 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • e2e-aws-ovn
  • e2e-gcp-ovn

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/0c4f2ec4-2215-11f1-91a0-41630963d78d-0

@openshift-ci
Contributor

openshift-ci bot commented Mar 17, 2026

@saschagrunert: trigger 2 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • e2e-aws-ovn
  • e2e-gcp-ovn

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/195f7560-2215-11f1-8139-783d852a0fca-0


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/daemon/daemon.go`:
- Around line 288-296: The graceful-shutdown block currently swallows failures
from "systemctl stop kubelet" and "crictl stop" (using "|| true" and redirecting
stderr) but unconditionally logs success; change the logic to detect and surface
failures: run "systemctl stop kubelet" and capture its exit/code/output and log
via logger on non-zero exit; iterate CONTAINERS and for each cid run "crictl
stop --timeout 200 $cid", capture exit code and stderr/stdout and log a
per-container failure (including the cid and command output) instead of
discarding it, and only emit the final 'kube-apiserver containers stopped'
logger if all stops succeeded (or emit a different summary/logger showing which
cids failed). Reference the existing symbols: systemctl stop kubelet,
CONTAINERS, crictl stop --timeout 200, and logger when making these changes.
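
One way to surface per-container failures, as this finding asks, is sketched below. The function name `stop_containers_reporting` and the failure wording are assumptions; only the success message matches the journal output quoted later in this conversation.

```shell
#!/bin/sh
# Sketch of per-container failure reporting; stop_containers_reporting is hypothetical.

# stop_containers_reporting CID...
# Stops each container, logs failures with the command output, and logs the
# success summary only if every stop succeeded.
stop_containers_reporting() {
    failed=""
    for cid in "$@"; do
        if ! out="$(crictl stop --timeout 200 "$cid" 2>&1)"; then
            logger "machine-config-daemon: failed to stop kube-apiserver container $cid: $out"
            failed="$failed $cid"
        fi
    done

    if [ -z "$failed" ]; then
        logger "machine-config-daemon: kube-apiserver containers stopped"
    else
        logger "machine-config-daemon: kube-apiserver containers not stopped:$failed"
        return 1
    fi
}
```

A caller that must not block the reboot would ignore the non-zero return, for example with `stop_containers_reporting $CONTAINERS || true`.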

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a9405367-d25c-4e88-b9ac-5ffc667ad647

📥 Commits

Reviewing files that changed from the base of the PR and between ee9a19e and 96339a9.

📒 Files selected for processing (2)
  • pkg/daemon/daemon.go
  • pkg/daemon/update.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/daemon/update.go

@saschagrunert saschagrunert force-pushed the graceful-stop-apiserver-before-reboot branch from 96339a9 to 72002f6 Compare March 17, 2026 15:41
@saschagrunert
Member Author

/payload-job e2e-aws-ovn
/payload-job e2e-gcp-ovn

@openshift-ci
Contributor

openshift-ci bot commented Mar 17, 2026

@saschagrunert: trigger 2 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • e2e-aws-ovn
  • e2e-gcp-ovn

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/e260b4cc-2217-11f1-9ae7-47fc522fbc75-0

@openshift-ci
Contributor

openshift-ci bot commented Mar 17, 2026

@saschagrunert: trigger 2 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • e2e-aws-ovn
  • e2e-gcp-ovn

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/e4ad2260-2217-11f1-9836-53d14348b10e-0

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
@saschagrunert saschagrunert force-pushed the graceful-stop-apiserver-before-reboot branch from 72002f6 to 27e2581 Compare March 17, 2026 15:47
@saschagrunert
Member Author

/payload-job e2e-aws-ovn
/payload-job e2e-gcp-ovn

@openshift-ci
Contributor

openshift-ci bot commented Mar 17, 2026

@saschagrunert: trigger 2 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • e2e-aws-ovn
  • e2e-gcp-ovn

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/9d670550-2218-11f1-8f4d-a00412492feb-0

@saschagrunert
Member Author

/retest

@openshift-ci
Contributor

openshift-ci bot commented Mar 17, 2026

@saschagrunert: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@isabella-janssen
Member

/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-upgrade-fips

@openshift-ci
Contributor

openshift-ci bot commented Mar 19, 2026

@isabella-janssen: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-upgrade-fips

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/8c756bb0-2399-11f1-9aee-22be1e273c3a-0

@isabella-janssen
Member

/payload-aggregate periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-upgrade-fips 7

@openshift-ci
Contributor

openshift-ci bot commented Mar 19, 2026

@isabella-janssen: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-upgrade-fips

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/ce567fd0-23c9-11f1-81c9-4945ff419b7b-0

@BhargaviGudi

Pre-merge verification details for this PR.
The graceful shutdown mechanism for kube-apiserver before MCO-triggered reboots is working as expected.

Cluster Details:
Cluster: Created via clusterbot (launch 4.22, openshift/machine-config-operator#5775, AWS, OVN)
OpenShift Version: 4.22.0-0-2026-03-20-040629-test-ci-ln-tmdntyk-latest
MCO Version: 4.22.0-0-2026-03-20-040629-test-ci-ln-tmdntyk-latest

Test Scenario: A reboot was triggered via a KubeletConfig change targeting the master MCP.

Key Results

  • All unit tests passed
  • kube-apiserver containers stopped gracefully
oc debug node/$NODE_NAME -- chroot /host journalctl -u machine-config-daemon-reboot.service | tail 
-- Boot 5263a021c6384ac1b6ff84f81c042860 --
Mar 20 05:36:29 ip-10-0-39-103 systemd[1]: Started machine-config-daemon: Node will reboot into config rendered-master-bc8be91d3d89e82442508a19149a849c.
Mar 20 05:36:29 ip-10-0-39-103 systemctl[31646]: Warning: The unit file, source configuration file or drop-ins of kubelet.service changed on disk. Run 'systemctl daemon-reload' to reload units.
Mar 20 05:36:29 ip-10-0-39-103 root[31658]: machine-config-daemon: gracefully stopping kube-apiserver container 64970eb031efc9bffb79f8ad0fc196537ca5d232b5134d647e9d1654ec9b1425
Mar 20 05:38:40 ip-10-0-39-103 sh[31659]: 64970eb031efc9bffb79f8ad0fc196537ca5d232b5134d647e9d1654ec9b1425
Mar 20 05:38:40 ip-10-0-39-103 root[31977]: machine-config-daemon: kube-apiserver containers stopped
Mar 20 05:38:41 ip-10-0-39-103 systemd[1]: machine-config-daemon-reboot.service: Deactivated successfully.
Mar 20 05:38:41 ip-10-0-39-103 systemd[1]: Stopped machine-config-daemon: Node will reboot into config rendered-master-bc8be91d3d89e82442508a19149a849c.
  • Shutdown duration: ~2m 11s (within 200s timeout)
  • No timeout errors or failures observed
  • Reboot sequence completed successfully
  • All 3 master nodes updated successfully
  • Cluster remained healthy and stable post-reboot
  • Test [sig-api-machinery][Feature:APIServer][Late] kubelet terminates kube-apiserver gracefully extended [Suite:openshift/conformance/parallel] - Passed
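
The shutdown duration above follows from the two journal timestamps (05:36:29 when the stop began, 05:38:40 when crictl returned). A small sketch of that arithmetic, assuming GNU `date` and same-day timestamps; `elapsed_seconds` is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: elapsed seconds between two same-day HH:MM:SS timestamps (GNU date).

# elapsed_seconds START END
elapsed_seconds() {
    start="$(date -u -d "1970-01-01 $1" +%s)" || return 1
    end="$(date -u -d "1970-01-01 $2" +%s)" || return 1
    echo $(( end - start ))
}
```

`elapsed_seconds 05:36:29 05:38:40` gives 131, that is 2m 11s, which is below the 200s `crictl stop` timeout.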

Observation: The /var/log/kube-apiserver/.terminating file exists after reboot on all 3 master nodes, but this did not impact graceful shutdown. @saschagrunert Could you please confirm this behavior?

Refer to this for full test details and log snippets.

@BhargaviGudi

/payload-job periodic-ci-openshift-machine-config-operator-release-4.22-periodics-e2e-aws-ovn-ocl

@openshift-ci
Contributor

openshift-ci bot commented Mar 20, 2026

@BhargaviGudi: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-machine-config-operator-release-4.22-periodics-e2e-aws-ovn-ocl

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/417aabd0-242e-11f1-9a03-7cfbdadab570-0

@saschagrunert
Member Author

saschagrunert commented Mar 20, 2026

Observation: The /var/log/kube-apiserver/.terminating file exists after reboot on all 3 master nodes, but this did not impact graceful shutdown. @saschagrunert Could you please confirm this behavior?

Yes, this is expected. The .terminating file is managed by the watch-termination process that wraps kube-apiserver (in openshift/kubernetes). It works like this:

  1. On startup, watch-termination touches/creates the file as a liveness sentinel
  2. On clean SIGTERM shutdown, it deletes the file via deferred cleanup
  3. If the process is killed (SIGKILL/crash) without cleanup, the file persists, and the next startup detects it, logs a NonGracefulTermination warning event, and deletes the old file before creating a new one
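
The sentinel lifecycle in the three steps above can be illustrated with a shell analogy. This is not how watch-termination is implemented (it lives in openshift/kubernetes and also forwards signals, which this sketch omits); `run_with_sentinel` is a hypothetical name and the warning text is an assumption.

```shell
#!/bin/sh
# Shell analogy of the sentinel-file pattern; run_with_sentinel is hypothetical.

# run_with_sentinel SENTINEL CMD [ARGS...]
run_with_sentinel() {
    sentinel="$1"
    shift

    # Step 3 (detection): a leftover file means the previous run never cleaned up.
    if [ -e "$sentinel" ]; then
        echo "NonGracefulTermination: previous run left $sentinel" >&2
        rm -f "$sentinel"
    fi

    # Step 1: create the sentinel before starting the wrapped process.
    touch "$sentinel"

    status=0
    "$@" || status=$?

    # Step 2: cleanup runs only if the wrapper itself survives; a SIGKILL of
    # the wrapper would skip this line and leave the file behind.
    rm -f "$sentinel"
    return "$status"
}
```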

The files you observed are new ones created by the freshly started kube-apiserver pods after reboot, not leftovers from the previous run. The modification timestamps confirm this (e.g. 05:40 on ip-10-0-39-103, which rebooted at 05:38). The previous instance's file was cleaned up during the graceful crictl stop.
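
The timestamp comparison described above can be scripted. A minimal sketch, assuming GNU `stat` on the node; `sentinel_predates_boot` is a hypothetical helper, not something shipped with the MCO:

```shell
#!/bin/sh
# Sketch: decide whether a sentinel file is a leftover from before the last boot.

# sentinel_predates_boot FILE BOOT_EPOCH
# Succeeds (exit 0) if FILE was last modified before BOOT_EPOCH,
# fails with 1 if it is newer, and with 2 if FILE cannot be read.
sentinel_predates_boot() {
    mtime="$(stat -c %Y "$1")" || return 2
    [ "$mtime" -lt "$2" ]
}
```

On a Linux node the boot time can be read with `awk '/^btime/ {print $2}' /proc/stat`. A file that does not predate boot, like the ones observed here, was created by the freshly started kube-apiserver pod rather than left over from the previous run.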

@BhargaviGudi

/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-rosa-sts-ovn

@openshift-ci
Contributor

openshift-ci bot commented Mar 20, 2026

@BhargaviGudi: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-4.22-e2e-rosa-sts-ovn

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/f07939f0-2431-11f1-86fd-739002f0c9de-0

@BhargaviGudi

/payload-job periodic-ci-openshift-release-main-nightly-4.20-e2e-rosa-sts-ovn

@openshift-ci
Contributor

openshift-ci bot commented Mar 20, 2026

@BhargaviGudi: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-4.20-e2e-rosa-sts-ovn

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/0b177f10-2432-11f1-9fd3-99bbd4c5563f-0

@sdodson
Member

sdodson commented Mar 20, 2026

/payload-job periodic-ci-openshift-release-main-nightly-4.20-e2e-rosa-sts-ovn

I'm almost certain that the ROSA jobs only function on nightlies, because they have to watch for new nightlies and propagate them into ClusterImageSets; we've never gotten presubmits to work there. The upgrade job that @isabella-janssen ran is probably our best bet. When we ran it here and on #5782 the build failure rate was really bad, but let's try it again here too; maybe the builders are happier today.

/payload-aggregate periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-upgrade-fips 7

@openshift-ci
Contributor

openshift-ci bot commented Mar 20, 2026

@sdodson: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-upgrade-fips

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/915051d0-2480-11f1-865f-cf7cfd2dc1dd-0

@BhargaviGudi

BhargaviGudi commented Mar 23, 2026

[sig-api-machinery][Feature:APIServer][Late] kubelet terminates kube-apiserver gracefully extended test passed in periodic-ci-openshift-release-main-nightly-4.22-e2e-aws-ovn-upgrade-fips

@BhargaviGudi

@isabella-janssen @sdodson Could you please help me run a CI job on ROSA clusters? Thanks

@sdodson
Member

sdodson commented Mar 23, 2026

No, that's not possible without adding new jobs here and it's not worth that effort.

@sdodson
Member

sdodson commented Mar 23, 2026

/hold
I'd like to pursue the simpler timeout option

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 23, 2026
@sdodson
Member

sdodson commented Mar 23, 2026

/retitle WIP: Gracefully stop kube-apiserver before MCO reboot

@openshift-ci openshift-ci bot changed the title OCPBUGS-75200: Gracefully stop kube-apiserver before MCO reboot WIP: Gracefully stop kube-apiserver before MCO reboot Mar 23, 2026
@openshift-ci-robot openshift-ci-robot removed jira/severity-critical Referenced Jira bug's severity is critical for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels Mar 23, 2026
@openshift-ci-robot
Contributor

@saschagrunert: No Jira issue is referenced in the title of this pull request.
To reference a jira issue, add 'XYZ-NNN:' to the title of this pull request and request another refresh with /jira refresh.


@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 23, 2026

Labels

do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress.


5 participants