For example, I want two clusters, each in a different region, so that I have basic region redundancy. Some of the services I want to run are stateful (which includes keeping track of user sessions). I am not looking to maintain state if a cluster fails. I just want a cluster B that I can direct the remaining traffic to if cluster A goes down. I see there are many ways to go about this. I looked at examples in the following links:
The picture in the first link falls short because it looks like the stateful service is deployed in only a single cluster. I want that stateful service to be deployed in all the clusters and, ideally, if a cluster fails, I want all services to choose the next available stateful backend and stick with it.
The third example looks almost like the setup I’m looking for. It seems that I would only need to do ServiceExport on the user facing services. But I have the following questions:
Hi @g3289ds ,
Based on the information that you shared, you want to deploy two clusters in different regions on GKE for basic region redundancy. Stateful services, such as user session tracking, should be deployed across both clusters. In the event of a cluster failure, your aim is to reroute traffic to the functional cluster.
@g3289ds wrote:
The third example looks almost like the setup I’m looking for. It seems that I would only need to do ServiceExport on the user facing services. But I have the following questions:
From here, you are considering a multi-cluster gateway setup using a load balancer to distribute traffic across clusters.
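As a rough illustration of the ServiceExport step you mention, a minimal sketch could look like the manifest below. The name `store` and the namespace `store` are placeholders, not taken from your setup. The manifest would be applied in each cluster that runs the user-facing Service, and it has to match the name and namespace of that Service.

```yaml
# Sketch only: "store" (name and namespace) is a placeholder.
# Apply in every cluster whose Service should be reachable through the multi-cluster Gateway.
kind: ServiceExport
apiVersion: net.gke.io/v1
metadata:
  name: store        # must match the name of the Service being exported
  namespace: store   # must match the namespace of that Service
```

Exporting the Service in both clusters is what lets a single ServiceImport represent the backends from both regions.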
@g3289ds wrote:
Will store.example.com without trailing slash go to the fail state? Is the fail state only if the actual gateway IP is entered into browser?
To address your question: accessing "store.example.com" without a trailing slash will not necessarily result in a failure. The outcome depends on how your routes and services are configured. Fail states are usually reserved for errors in the load-balancing process, not for requests that reach the domain without a specific path.
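To make this concrete, here is a hedged sketch of a route with a catch-all rule. A browser request to `store.example.com` with no path is sent as `/`, and a `PathPrefix` match on `/` covers every path. The Gateway name `external-http`, the namespace, the ServiceImport name `store`, and port `8080` are assumptions for illustration, and the `apiVersion` may differ depending on your cluster version.

```yaml
# Sketch only: gateway name, namespace, backend name, and port are placeholders.
kind: HTTPRoute
apiVersion: gateway.networking.k8s.io/v1beta1
metadata:
  name: public-store-route
  namespace: store
spec:
  hostnames:
  - "store.example.com"
  parentRefs:
  - name: external-http          # assumed name of the multi-cluster Gateway
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /                 # matches "/", so the bare hostname is covered too
    backendRefs:
    - group: net.gke.io
      kind: ServiceImport        # backends from every cluster that exported "store"
      name: store
      port: 8080
```

With a rule like this there is no separate path that has to be typed for a request to reach a backend.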
@g3289ds wrote:
If I just specify the path prefix to be / so that all traffic can go to either cluster and so I don’t have to rewrite any site logic that relies on the URL structure, will the gateway service/resource intelligently map user traffic to the correct cluster (by ip address or other) so that if a user’s browser crashes, tab refreshes, user pastes a direct link to a page in a new tab while logged in an existing tab, or just even basic request to request remain at the same cluster as the first request? If not how can I accomplish this?
If you use / as the path prefix so that all traffic can go to either cluster, the load balancer distributes requests across the clusters. It continuously health-checks the pods in each cluster and routes traffic only to the healthy ones, so a failing cluster stops receiving requests. Load balancers also support session affinity, which directs requests from the same client to the same backend instance and gives the user a consistent experience.
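On GKE Gateway, session affinity is configured through a backend policy attached to the imported service. The following is a minimal sketch, assuming a ServiceImport named `store` in the `store` namespace (both placeholders) and cookie-based affinity. `CLIENT_IP` is an alternative type if you prefer to key on the client address. Please check the field names against the current GKE documentation before relying on it.

```yaml
# Sketch only: names and the TTL value are placeholders.
apiVersion: networking.gke.io/v1
kind: GCPBackendPolicy
metadata:
  name: store-affinity
  namespace: store
spec:
  default:
    sessionAffinity:
      type: GENERATED_COOKIE     # or CLIENT_IP
      cookieTtlSec: 3600         # assumed lifetime; applies to GENERATED_COOKIE only
  targetRef:
    group: net.gke.io
    kind: ServiceImport          # multi-cluster backends are targeted via the ServiceImport
    name: store
```

Cookie-based affinity tends to fit the cases you list (refresh, new tab, pasted link), since the browser sends the same cookie each time. Affinity is best effort: if the chosen backend becomes unhealthy, requests move to another healthy backend, which lines up with your goal of failing over without preserving state.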
Recommendations:
By following these considerations, you can set up a strong and reliable multi-cluster system on GKE that ensures your stateful services can recover well in case of any issues.
Let me know if this answers your questions.