We had an issue with the wait command when waiting for DS replicas:
waitForDS() {
    local ds=$1
    waitResourceExists "sts/${ds}"
    local rep=$($K_CMD $NAMESPACE_OPT get sts $ds -o=jsonpath='{.spec.replicas}')
    ((rep--))
    message "rep=${rep}" "debug"
    for i in $(seq 0 $rep) ; do
        waitForResource ready "pod/${ds}-${i}"
    done
}
This function checks individual pods for readiness, but does not check whether they have actually been updated. Additionally, it iterates in the wrong order (low to high), while statefulsets by default restart the pods from the highest number to the lowest.
This means the command can report success prematurely: the lower numbered pod hasn't been restarted yet (so it's still "ready" from the previous deployment), while the higher numbered pod has already been restarted and the wait command correctly waited for it.
Example:
In a cluster with two ds replicas:
- The rollout terminates ds-*-1 first (highest number).
- The wait command checks ds-*-0 which is still ready from the previous deployment, so it passes immediately.
- The wait command checks ds-*-1 and waits for the new pod to become ready.
- The wait command reports success.
- The rollout now terminates ds-*-0 to update it, causing a brief period during which other components have not yet noticed that the pod is down.
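The sequence above can be reproduced without a cluster. The following is only an illustrative sketch: the function names, the `ready:revision` encoding of a pod, and the revision names `rev-old` / `rev-new` are hypothetical and not part of the original script. It compares a readiness-only check (what the old loop effectively does) with a check that also requires each pod to be on the target revision.

```shell
# Each pod is encoded as "<ready>:<revision>" (hypothetical encoding for this sketch).

# Readiness-only check: succeeds as soon as every pod reports ready,
# no matter which revision it is running.
allReady() {
    for pod in "$@"; do
        case "$pod" in
            true:*) ;;
            *) return 1 ;;
        esac
    done
    return 0
}

# Revision-aware check: the first argument is the target revision;
# every pod must be ready AND running that revision.
allReadyAndUpdated() {
    target=$1
    shift
    for pod in "$@"; do
        [ "$pod" = "true:${target}" ] || return 1
    done
    return 0
}

# Snapshot after ds-*-1 was restarted but before ds-*-0 is terminated:
# pod 0 is ready on the old revision, pod 1 is ready on the new one.
allReady "true:rev-old" "true:rev-new" \
    && echo "readiness-only check: success (premature)"
allReadyAndUpdated "rev-new" "true:rev-old" "true:rev-new" \
    || echo "revision-aware check: rollout still in progress"
```

In a real cluster the revision of a pod could be compared against the statefulset's `.status.updateRevision`, for example via the pod's `controller-revision-hash` label, but that wiring is left out here.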
In our case, this caused flaky amster imports in our deploy pipeline: amster was started right after the wait command succeeded, but the old, still-ready ds pod had just been terminated.
We were able to get around this issue with the following:
waitForDS() {
    local ds=$1
    waitForRollout "statefulset/${ds}"
}
waitForRollout() {
    local resource=$1
    echo "Waiting for ${resource} rollout to complete."
    if waitResourceExists "$resource" ; then
        kube rollout status "${resource}" --timeout="${TIMEOUT}s"
    else
        echo "ERROR: $resource not found. Skipping."
    fi
}
This waits for the full rollout to complete (all pods updated to the new version and ready), regardless of order.
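To make "all pods updated and ready" concrete, here is a rough sketch of that condition expressed in terms of StatefulSet status values. The helper name `rolloutComplete` and its argument order are hypothetical, and this only approximates the idea; it is not the exact logic that the rollout status command implements.

```shell
# Hypothetical helper. Arguments:
#   $1 desired replicas   (.spec.replicas)
#   $2 ready replicas     (.status.readyReplicas)
#   $3 updated replicas   (.status.updatedReplicas)
#   $4 current revision   (.status.currentRevision)
#   $5 update revision    (.status.updateRevision)
# Returns 0 only when every replica is ready, every replica is updated,
# and the current revision has caught up with the update revision.
rolloutComplete() {
    desired=${1:-0}
    ready=${2:-0}      # empty values (field absent from status) count as 0
    updated=${3:-0}
    current_rev=$4
    update_rev=$5
    [ "$ready" -eq "$desired" ] &&
        [ "$updated" -eq "$desired" ] &&
        [ "$current_rev" = "$update_rev" ]
}

# Mid-rollout: both pods ready, but only one has been updated.
rolloutComplete 2 2 1 "rev-old" "rev-new" || echo "still rolling out"
# Finished: both pods ready and updated, revisions match.
rolloutComplete 2 2 2 "rev-new" "rev-new" && echo "rollout complete"
```

The mid-rollout case is exactly the state in which the per-pod readiness loop would already have reported success.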
The question, of course, is: should it wait for the full rollout to complete (safer), or just for at least one pod to be ready (faster, sufficient for basic platform availability)?
For am and idm it waits for the deployments, so why not for the ds statefulset?
What was the intention here?