Merged
We currently see occasional 500s when gundeck restarts: some in-flight requests appear to be aborted mid-request. This is possibly because terminating pods still receive some traffic. https://wearezeta.atlassian.net/browse/WPB-19694
supersven
reviewed
Aug 26, 2025
lifecycle:
  preStop:
    exec:
      command: ["sh", "-c", "sleep 10"]
Contributor
Unfortunately, I'm not a K8s expert: could you please explain why we need the sleep command? Why isn't it enough to just increase terminationGracePeriodSeconds? 🤔
Contributor
The default terminationGracePeriodSeconds is 30 seconds, which is plenty.
What the sleep does is delay the SIGTERM after the pod enters the Terminating state, which gives Kubernetes time to remove the pod from any Service that sends traffic to it.
In the meantime, gundeck will still accept and process any in-flight requests, since it doesn't yet know it is being terminated.
Shamelessly adapted from SO to visualize and explain the process:
The full sequence is:
- pod deletion is requested (state: Terminating)
- preStop hook kicks in and terminationGracePeriodSeconds countdown starts:
  - when the preStop hook completes, kubelet sends a SIGTERM to the container
  - if the preStop hook isn't finished within the terminationGracePeriodSeconds countdown, kubelet sends SIGKILL to the container
→ this means:
- we delay the shutdown (SIGTERM) of gundeck so that the k8s API can remove it from its Service while gundeck keeps processing requests
- once the 10s counter is finished, the regular SIGTERM signal is sent, causing gundeck to shut down gracefully
- if the whole thing takes longer than terminationGracePeriodSeconds, it's simply killed.
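To make the interplay of the two settings concrete, here is a minimal sketch of how they might sit together in a pod template. Only the preStop hook comes from this PR; the container name and image are placeholders, and terminationGracePeriodSeconds is written out with its default value purely for illustration.

```yaml
# Illustrative pod template fragment, not the actual gundeck chart.
spec:
  # Kubernetes default, spelled out here only for clarity (assumption:
  # the chart does not override it).
  terminationGracePeriodSeconds: 30
  containers:
    - name: gundeck            # placeholder container name
      image: example/gundeck   # placeholder image
      lifecycle:
        preStop:
          exec:
            # Runs before SIGTERM is sent, giving the Service endpoint
            # removal time to propagate.
            command: ["sh", "-c", "sleep 10"]
```

With these values, gundeck would receive SIGTERM roughly 10 seconds after deletion is requested, leaving about 20 seconds of the grace period for its own graceful shutdown before it is killed.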
lwille
approved these changes
Sep 11, 2025
This was referenced Oct 20, 2025
Checklist
changelog.d