Incident Response
Make it stop
We use Flux to periodically reconcile the Kubernetes cluster state with the declarative intended state.
During incident response, this can undo ad-hoc fixes, preventing mitigation of the incident.
In this situation, reconciliation can be suspended with the flux CLI:
flux suspend kustomization "${RUNWAY_SERVICE_ID:?}" -n runway-provisioner
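To confirm that the suspension took effect, one option is to read the Kustomization's spec.suspend field. The following is a minimal sketch, assuming kubectl access to the cluster that runs Flux; the helper name is_suspended is hypothetical and not part of any existing tooling.

```shell
# Hypothetical helper: exits 0 if the named Kustomization is suspended.
is_suspended() {
  name="${1:?kustomization name required}"
  # spec.suspend reads "true" while reconciliation is suspended;
  # it is empty or "false" otherwise.
  suspended="$(kubectl get kustomizations.kustomize.toolkit.fluxcd.io "$name" \
    -n runway-provisioner -o jsonpath='{.spec.suspend}')"
  [ "$suspended" = "true" ]
}
```

For example: is_suspended "${RUNWAY_SERVICE_ID:?}" && echo "reconciliation is suspended". Running flux get kustomizations -n runway-provisioner should show the same state in its SUSPENDED column.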
You can then change cluster resources, e.g. with kubectl patch, without Flux undoing these changes shortly after.
However, other controllers are still active and may interfere with your changes. For example, the “Vertical Pod Autoscaler” controls a Pod’s memory allocation.
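As an illustration of such an ad-hoc change, the sketch below scales a Deployment with kubectl patch. The helper name patch_replicas and its arguments are hypothetical; substitute the workload, namespace, and field relevant to your incident. If a Horizontal Pod Autoscaler manages the workload, it may override the replica count again.

```shell
# Hypothetical helper: set the replica count of a Deployment.
patch_replicas() {
  deployment="${1:?deployment name required}"
  namespace="${2:?namespace required}"
  replicas="${3:?replica count required}"
  # A JSON merge patch that only touches spec.replicas.
  kubectl patch deployment "$deployment" -n "$namespace" \
    --type merge -p "{\"spec\":{\"replicas\":${replicas}}}"
}
```

For example: patch_replicas my-deployment my-namespace 3, where both names are placeholders.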
After the incident, make sure to re-enable Flux reconciliation. This is also done using the Flux CLI:
flux resume kustomization "${RUNWAY_SERVICE_ID:?}" -n runway-provisioner
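A suspended Kustomization stays unreconciled until someone resumes it, so it can help to check for leftovers once the incident is closed. A minimal sketch, assuming kubectl access to the cluster; the helper name list_suspended is hypothetical.

```shell
# Hypothetical helper: print the names of all Kustomizations in the
# runway-provisioner namespace that still have spec.suspend set to true.
list_suspended() {
  kubectl get kustomizations.kustomize.toolkit.fluxcd.io -n runway-provisioner \
    -o jsonpath='{range .items[?(@.spec.suspend==true)]}{.metadata.name}{"\n"}{end}'
}
```

If list_suspended prints nothing, no Kustomization in that namespace is currently suspended.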