Health checks
Overview
This document describes health checks for Runway services deployed in Cloud Run.
Runway service owners can define three types of health check:
- startup: determines if a container has started and is ready to receive traffic.
- liveness: determines whether to restart a container. Depends on a successful startup probe.
- readiness: determines whether a container should receive traffic. Unlike startup and liveness probes, a failing readiness probe removes the container from the load balancer without restarting it.
For information on defining probes, refer to the Runway manifest schema. You may also refer to Cloud Run’s guide for more information and GCP recommended best practices.
NOTE: All Cloud Run services have a default TCP startup probe which tries to open a TCP connection on the container port.
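To make the default behaviour concrete, here is a minimal sketch in Go of what a TCP probe amounts to: try to open a connection and treat success as "started". The function names `tcpProbe` and `demo` are hypothetical, and this only approximates the idea; it is not Cloud Run's implementation.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// tcpProbe reports whether a TCP connection to addr can be opened
// within the given timeout. No data is sent or read.
func tcpProbe(addr string, timeout time.Duration) bool {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

// demo probes a local listener (standing in for the container port)
// while it is open and again after it has been closed.
func demo() (whileOpen, afterClose bool) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	addr := ln.Addr().String()
	whileOpen = tcpProbe(addr, time.Second)
	ln.Close()
	afterClose = tcpProbe(addr, time.Second)
	return whileOpen, afterClose
}

func main() {
	whileOpen, afterClose := demo()
	fmt.Println("probe while listening:", whileOpen)
	fmt.Println("probe after close:", afterClose)
}
```

A TCP probe of this kind only shows that the port accepts connections. It says nothing about whether the application can serve requests, which is one reason to define an HTTP or gRPC probe explicitly.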
Readiness probes
Readiness probes differ from startup and liveness probes in two important ways:
- initial_delay_seconds is not supported.
- success_threshold is supported (unique to readiness probes).
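The interaction between success_threshold and failure_threshold can be sketched as a small state machine. The Go example below assumes both thresholds count consecutive results, as in Kubernetes-style probes. The type `readinessTracker` and its fields are hypothetical and exist only for illustration.

```go
package main

import "fmt"

// readinessTracker models a readiness state driven by consecutive
// probe results. It is an illustration, not Cloud Run's implementation.
type readinessTracker struct {
	successThreshold int
	failureThreshold int
	successes        int // consecutive successes so far
	failures         int // consecutive failures so far
	ready            bool
}

// observe records one probe result and returns the resulting state.
func (t *readinessTracker) observe(ok bool) bool {
	if ok {
		t.successes++
		t.failures = 0
		if t.successes >= t.successThreshold {
			t.ready = true
		}
	} else {
		t.failures++
		t.successes = 0
		if t.failures >= t.failureThreshold {
			t.ready = false
		}
	}
	return t.ready
}

func main() {
	t := &readinessTracker{successThreshold: 2, failureThreshold: 3}
	results := []bool{true, true, false, false, true, false, false, false}
	for i, ok := range results {
		fmt.Printf("probe %d ok=%v ready=%v\n", i+1, ok, t.observe(ok))
	}
}
```

In this model a single success resets the failure count, so an instance that fails intermittently stays in rotation until it fails failure_threshold times in a row.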
failure_threshold limit
Cloud Run enforces a maximum failure_threshold of 3 for readiness probes. This is stricter than startup and liveness probes, which allow much higher values.
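A limit like this can be checked before deployment, for example in CI. The following Go sketch is a hypothetical pre-deploy check, not part of Runway or Cloud Run. The function name, the constant, and the lower bound of 0 (taken from the deployment error message) are assumptions.

```go
package main

import "fmt"

// maxReadinessFailureThreshold is the limit described in this document.
const maxReadinessFailureThreshold = 3

// validateReadinessFailureThreshold returns an error when v falls
// outside the range accepted for readiness probes.
func validateReadinessFailureThreshold(v int) error {
	if v < 0 || v > maxReadinessFailureThreshold {
		return fmt.Errorf(
			"failure_threshold must be a number between 0 and %d, got %d",
			maxReadinessFailureThreshold, v,
		)
	}
	return nil
}

func main() {
	for _, v := range []int{1, 3, 10} {
		fmt.Printf("failure_threshold=%d: %v\n", v, validateReadinessFailureThreshold(v))
	}
}
```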
Setting failure_threshold above 3 will cause a deployment error:
    failure_threshold must be a number between 0 and 3.

A typical readiness probe configuration looks like:

    spec:
      readiness_probe:
        path: /-/readiness
        period_seconds: 5
        timeout_seconds: 2
        failure_threshold: 3 # maximum allowed by Cloud Run
        success_threshold: 1

If your service uses LabKit v2, the /-/readiness endpoint is registered automatically by the httpserver package and runs all registered checks concurrently.
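For services that do not use LabKit, a readiness endpoint that runs its checks concurrently could be sketched with the Go standard library as shown below. This is not LabKit's API: the `check` type and the functions `runChecks`, `readinessHandler`, `probeStatus`, `passing`, and `failing` are all hypothetical names, and the one-second handler timeout is an assumption chosen to stay under the probe's timeout_seconds.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// check is a hypothetical readiness check signature (not LabKit's).
type check func(ctx context.Context) error

// runChecks runs every check in its own goroutine and returns the
// errors of the checks that failed, keyed by check name.
func runChecks(ctx context.Context, checks map[string]check) map[string]error {
	var (
		mu sync.Mutex
		wg sync.WaitGroup
	)
	failed := make(map[string]error)
	for name, c := range checks {
		wg.Add(1)
		go func(name string, c check) {
			defer wg.Done()
			if err := c(ctx); err != nil {
				mu.Lock()
				failed[name] = err
				mu.Unlock()
			}
		}(name, c)
	}
	wg.Wait()
	return failed
}

// readinessHandler responds 200 when all checks pass and 503 otherwise.
// Checks must honour ctx for the timeout to have any effect.
func readinessHandler(checks map[string]check, timeout time.Duration) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		ctx, cancel := context.WithTimeout(r.Context(), timeout)
		defer cancel()
		if failed := runChecks(ctx, checks); len(failed) > 0 {
			msg := fmt.Sprintf("not ready: %d check(s) failed", len(failed))
			http.Error(w, msg, http.StatusServiceUnavailable)
			return
		}
		w.WriteHeader(http.StatusOK)
		fmt.Fprintln(w, "ok")
	}
}

// Example checks standing in for real dependency checks.
func passing(ctx context.Context) error { return nil }
func failing(ctx context.Context) error { return errors.New("dependency unreachable") }

// probeStatus calls the handler in-process and returns the status code.
func probeStatus(checks map[string]check) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/-/readiness", nil)
	readinessHandler(checks, time.Second)(rec, req)
	return rec.Code
}

func main() {
	fmt.Println("all passing:", probeStatus(map[string]check{"db": passing, "cache": passing}))
	fmt.Println("one failing:", probeStatus(map[string]check{"db": passing, "cache": failing}))
}
```

Running the checks concurrently keeps the response time close to that of the slowest check rather than the sum of all checks, which matters when the probe timeout is only a couple of seconds.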
Examples
- Example of services using HTTP probes
- Example of services using readiness probes
- Example of services using gRPC probes