Skip to content

Incident Response

We use Flux to periodically reconcile the Kubernetes cluster state with the declarative intended state. During incident response, this can undo ad-hoc fixes, preventing mitigation of the incident. In this situation, reconciliation can be suspended with the flux CLI:

Terminal window
flux suspend kustomization "${RUNWAY_SERVICE_ID:?}" -n runway-provisioner

You can then change cluster resources, e.g. with kubectl patch, without Flux undoing these changes shortly after. However, other controllers are still active and may interfere with your chagnes, such as the “Vertical Pod Autoscaler” which controls a Pod’s memory allocation.

After the incident, make sure to re-enable Flux reconciliation. This is also done using the Flux CLI:

Terminal window
flux resume kustomization "${RUNWAY_SERVICE_ID:?}" -n runway-provisioner