I'm looking for a way to run two GKE clusters (primary and backup) in different regions and direct traffic to the backup cluster only when the primary one is unreachable.
I found a way to create an HTTP(S) load balancer with a backend service that has instance groups in different regions, but I can't find a way to configure this backend service to send all traffic to a single backend (the primary) for as long as it is available. Adjusting the rate or utilization settings does not seem to guarantee that requests will be sent to the primary cluster.
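To make the desired behavior concrete, here is a minimal sketch of the routing rule I have in mind. It is plain Python, not a GCP API: `Backend` and `pick_backend` are hypothetical names used only for illustration, and the `healthy` flag stands in for whatever the load balancer's health check reports.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Backend:
    # Hypothetical model of one backend (e.g. an instance group in a region).
    name: str
    healthy: bool


def pick_backend(backends: Sequence[Backend]) -> Optional[Backend]:
    """Return the first healthy backend in priority order, or None if all are down.

    The list order encodes priority: primary first, backup second.
    """
    for backend in backends:
        if backend.healthy:
            return backend
    return None


# Primary is healthy: every request should go to it, regardless of load.
both_up = [Backend("primary", True), Backend("backup", True)]
print(pick_backend(both_up).name)

# Primary is unreachable: requests should fail over to the backup.
primary_down = [Backend("primary", False), Backend("backup", True)]
print(pick_backend(primary_down).name)
```

The point is strict priority: the backup receives traffic only when the primary fails its health check, which is different from splitting traffic by rate or utilization.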
Someone has encountered a similar issue here.
Is there a way to do this with the GCP load balancing options? Maybe I missed it?