Global Load Balancing
Cloudflare Global Load Balancing
Runway provides global load balancing through Cloudflare to distribute traffic across multiple regions, improving latency for global users and increasing availability through automatic failover.
Overview
Cloudflare Global Load Balancer sits at the edge of Cloudflare’s network (300+ data centers worldwide) and makes routing decisions at the Point of Presence (PoP) closest to the user. This enables:
- Latency-based routing: Traffic is automatically routed to the fastest available origin based on measured latency from each Cloudflare PoP (depending on the load balancer settings)
- Automatic failover: Unhealthy origins are removed from rotation within seconds
- Multi-region distribution: Traffic is spread across every region where the service is deployed
How It Works

    ┌─────────────────┐
    │  User Request   │
    └────────┬────────┘
             │
             ▼
    ┌────────────────────────────┐
    │  Cloudflare PoP            │
    │  (usually nearest to user) │
    └────────┬───────────────────┘
             │
             ▼
    ┌─────────────────────────────────┐
    │  Global Load Balancer           │
    │  - Check pool/endpoint health   │
    │  - Compare latency              │
    │  - Select fastest pool          │
    └────────┬────────────────────────┘
             │
             ▼
    ┌─────────────────────────────────────────┐
    │     Origin Pools (GKE Regional LBs)     │
    ├─────────────┬─────────────┬─────────────┤
    │  GKE US     │  GKE EU     │  GKE Asia   │
    │  us-east1   │  eu-west1   │  asia-ne3   │
    └─────────────┴─────────────┴─────────────┘

When a user makes a request, the flow is:
- The request arrives at the nearest Cloudflare PoP
- The load balancer checks health status and latency data for all origin pools and endpoints
- Traffic is routed to the pool with the lowest latency from that PoP or from that region (depending on the setting)
- If the selected pool becomes unhealthy, traffic automatically fails over to the next best pool within seconds
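
The steps above can be sketched in a few lines of Python. This is only an illustrative model of latency-based steering with failover, not Cloudflare's implementation: the Pool class, the select_pool function and the latency figures are hypothetical names and values chosen for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Pool:
    name: str          # region name, e.g. "us-east1"
    healthy: bool      # latest verdict from the health monitor
    latency_ms: float  # latency measured from one PoP (made-up value)


def select_pool(pools: list[Pool]) -> Optional[Pool]:
    """Return the healthy pool with the lowest latency, or None if none is healthy."""
    healthy = [p for p in pools if p.healthy]
    if not healthy:
        return None
    return min(healthy, key=lambda p: p.latency_ms)


pools = [
    Pool("us-east1", healthy=True, latency_ms=95.0),
    Pool("europe-west1", healthy=True, latency_ms=18.0),
    Pool("asia-northeast3", healthy=True, latency_ms=240.0),
]
print(select_pool(pools).name)  # prints europe-west1

pools[1].healthy = False        # simulate a failed health check
print(select_pool(pools).name)  # prints us-east1
```

Because the latency is measured per PoP (or per region, depending on the setting), the same set of pools can produce a different choice for users in different parts of the world.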
Configuration
Enabling Global Load Balancing
To enable global load balancing for your workload, add the following to your service entry in the provisioner’s workload inventory file (config/runtimes/[gke|eks]/workloads.yml):

    - runway_service_id: my-service
      project_id: 12345678
      cloudflare:
        global_loadbalancer:
          enabled: true

This creates:
- Origin pools for each region where your service is deployed
- Health monitors to check endpoint availability
- A load balancer with dynamic latency-based steering (routes traffic to the origin with lowest latency)
Health Monitor Configuration
Health monitors probe your origins to determine availability and measure latency. Configure monitors based on your requirements:

    - runway_service_id: my-service
      project_id: 12345678
      cloudflare:
        global_loadbalancer:
          enabled: true
          monitor:
            protocol: tcp        # tcp or https (default: tcp)
            path: /health        # health endpoint path (https only, default: /health)
            interval: 10         # seconds between checks (default: 10)
            timeout: 3           # seconds before marking failed (default: 3)
            consecutive_up: 1    # number of checks needed to mark healthy (default: 1)
            consecutive_down: 1  # number of checks needed to mark unhealthy (default: 1)

Monitor Protocol Options
| Protocol | Use Case | Trade-offs |
|---|---|---|
| TCP (default) | Simple connectivity check | Fast, but only validates that the cloud provider’s load balancer is reachable; the application itself never receives the probe |
| HTTPS | Application-level health | More accurate (detects unhealthy backends), but adds load from health check requests - up to 70 rps per region |
TCP monitors only need the regional load balancer to respond on port 443, which Runway already configures, so you do not need to change your application. This setting has minimal overhead but is less accurate than an HTTPS monitor because it only validates network connectivity.
HTTPS monitors query a specific health endpoint on your application. Use this when you need more accurate health status that reflects backend availability. Using this setting means your application will receive health check traffic from Cloudflare.
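
As a rough illustration of what the application side of an HTTPS monitor could look like, here is a minimal health endpoint built on Python's standard library. It is a sketch under several assumptions: that TLS is terminated before the request reaches the application, that a 200 response is what the monitor treats as healthy, and that /health is the configured path. The names health_response, HealthHandler and make_server are hypothetical.

```python
from __future__ import annotations

from http.server import BaseHTTPRequestHandler, HTTPServer


def health_response(path: str) -> tuple[int, bytes]:
    """Map a request path to (status code, body); only /health reports healthy."""
    if path.split("?", 1)[0] == "/health":
        return 200, b"ok"
    return 404, b"not found"


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        status, body = health_response(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args) -> None:
        # Probes arrive frequently, so keep them out of the request log.
        pass


def make_server(port: int = 8080) -> HTTPServer:
    """Build (but do not start) a server that answers health checks."""
    return HTTPServer(("127.0.0.1", port), HealthHandler)


# To run it: make_server().serve_forever()
```

A real endpoint would usually also check its own dependencies before answering 200, since the point of an HTTPS monitor is to reflect backend availability rather than mere reachability.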
Protecting Health Endpoints
When using HTTPS monitors, you can block external access to your health endpoint while still allowing Cloudflare monitors. Runway achieves this using Cloudflare rules:

    cloudflare:
      global_loadbalancer:
        enabled: true
        monitor:
          protocol: https
          path: /health
          block_health_endpoint: true  # blocks public access to /health

All Data Centers Monitoring
By default, health checks run from a subset of Cloudflare regions. For more comprehensive latency measurement:

    cloudflare:
      global_loadbalancer:
        enabled: true
        monitor:
          all_data_centers: true  # probe from all Cloudflare PoPs

Complete Configuration Example

    - runway_service_id: my-service
      project_id: 12345678
      groups:
        - gitlab-org/my-team
      cloudflare:
        enabled: true
        global_loadbalancer:
          enabled: true
          monitor:
            protocol: https
            path: /health
            interval: 10
            timeout: 3
            consecutive_up: 2
            consecutive_down: 2
            block_health_endpoint: true

Failover Behavior
When an origin pool becomes unhealthy:
- Health monitors detect the failure (based on the consecutive_down threshold)
- The unhealthy pool is removed from rotation
- Traffic automatically shifts to the next lowest-latency healthy pool
- When the pool recovers (based on the consecutive_up threshold), it rejoins rotation
If all pools become unhealthy, traffic is sent to the us-east1 region for GKE or the us-east-1 region for EKS as a last resort, even though that pool is itself unhealthy.
Keep in mind that removing an unhealthy pool can take up to (consecutive_down + 1) × interval seconds. With the defaults (consecutive_down: 1, interval: 10), that is up to 20 seconds.
You can adjust consecutive_up, consecutive_down, interval and timeout to tune failover behavior and its sensitivity.
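
To make the effect of these settings concrete, the sketch below models how consecutive_up and consecutive_down thresholds could drive a pool's health state, and computes the removal bound quoted above. It is a simplified model written for illustration, not Cloudflare's monitor logic; MonitorState and worst_case_removal_seconds are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class MonitorState:
    consecutive_up: int = 1
    consecutive_down: int = 1
    healthy: bool = True
    _streak: int = 0  # consecutive results that disagree with the current state

    def record(self, check_passed: bool) -> bool:
        """Feed one health check result and return the resulting health state."""
        if check_passed == self.healthy:
            self._streak = 0  # the result agrees with the state, so reset
            return self.healthy
        self._streak += 1
        threshold = self.consecutive_up if check_passed else self.consecutive_down
        if self._streak >= threshold:
            self.healthy = check_passed
            self._streak = 0
        return self.healthy


def worst_case_removal_seconds(consecutive_down: int, interval: int) -> int:
    """Upper bound from the text: (consecutive_down + 1) x interval."""
    return (consecutive_down + 1) * interval


state = MonitorState(consecutive_up=2, consecutive_down=2)
print([state.record(ok) for ok in (False, False, True, True)])
# prints [True, False, False, True]
print(worst_case_removal_seconds(consecutive_down=2, interval=10))  # prints 30
```

Higher thresholds make the pool less likely to flap on a single failed probe, at the cost of slower failover and slower recovery.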
Multi-Region Deployment
Global load balancing works best when the service is deployed to multiple regions. While EKS services can use the global load balancer setting, they currently only deploy to one region and therefore don’t provide the same multi-region failover benefits as GKE.
Current Supported Regions
| Cloud | Regions |
|---|---|
| GKE | us-east1, europe-west1, asia-northeast3 |
| EKS | us-east-1 |
Cost Implications
- HTTPS health checks from multiple regions increase request volume to your origins
- Using all_data_centers: true further increases health check traffic
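
For a back-of-the-envelope estimate of that traffic, the average request rate is roughly the number of probing locations divided by the check interval. The helper below is only an approximation under that assumption; health_check_rps and the probe count in the example are hypothetical, and the real number of probes depends on Cloudflare's configuration.

```python
def health_check_rps(probe_count: int, interval_seconds: float) -> float:
    """Average health check requests per second reaching one origin."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return probe_count / interval_seconds


# 100 probing locations is a made-up figure for illustration.
print(health_check_rps(probe_count=100, interval_seconds=10))  # prints 10.0
```

Raising interval lowers this rate proportionally, but it also lengthens failover detection, so the two settings need to be weighed together.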
Support
For questions or issues with global load balancing, contact the Runway team in #f_runway.