GKE multi-cluster setup not working as expected

Hi Team,

I have three clusters named gke-1 (an Autopilot cluster), gke-2, and gke-3. The clusters are deployed in separate projects and different VPCs, and we have established VPC peering so that all the projects can communicate internally. I set up multi-cluster Services (MCS) following this documentation: https://cloud.google.com/kubernetes-engine/docs/how-to/migrate-gke-multi-cluster.

Our fleet host cluster is gke-1, and the registered clusters are gke-2 and gke-3. Exporting a service from gke-1 into gke-2 and gke-3 works as expected. However, exporting a service from gke-2 to gke-1, or from gke-3 to gke-1, does not work. In this scenario the service is exported from gke-2 to gke-1, and the Traffic Director resources are created and appear healthy. But when I make an API call from a gke-1 pod to the gke-2 service, it returns a 503 error with the message: "Failed to connect to <gke-2-svc>.<namespace>.svc.clusterset.local port 80 after 131287 ms: Couldn't connect to the server."
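For context, an MCS export is driven by a ServiceExport object created on the exporting cluster in the same namespace as the Service. A minimal sketch, with `<gke-2-svc>` and `<namespace>` as placeholders matching the error message above:

```yaml
# Applied on gke-2, the exporting cluster.
# <gke-2-svc> and <namespace> are placeholders, not values from this thread.
kind: ServiceExport
apiVersion: net.gke.io/v1
metadata:
  namespace: <namespace>
  name: <gke-2-svc>   # must match the name of the Service being exported
```

Once the export propagates, the importing cluster (gke-1 here) should show a corresponding ServiceImport, and the service becomes resolvable at `<gke-2-svc>.<namespace>.svc.clusterset.local`.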

Can anyone please help me debug this issue?

1 ACCEPTED SOLUTION

We fixed the issue by deploying a new egress NAT policy.
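The thread does not include the policy itself. On GKE, one common place to control egress NAT for pod traffic is the ip-masq-agent ConfigMap, which decides whether traffic leaving a node is SNAT'd to the node IP based on the destination CIDR. A hypothetical sketch only, since the actual policy used here is not shown; the CIDR values are examples, not from the poster's environment:

```yaml
# kube-system/ip-masq-agent ConfigMap (hypothetical values).
# Destinations listed under nonMasqueradeCIDRs are reached with the
# pod IP as source; anything else is masqueraded to the node IP.
apiVersion: v1
kind: ConfigMap
metadata:
  name: ip-masq-agent
  namespace: kube-system
data:
  config: |
    nonMasqueradeCIDRs:
      - 10.0.0.0/8        # example range; adjust to your VPC/pod CIDRs
    resyncInterval: 60s
```

In a peered-VPC setup like this one, the relevant question is whether cross-cluster pod traffic arrives with a source IP that the peer VPC's routes and firewall rules actually accept.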


3 REPLIES

Hi @Sath1ce 

Welcome to Google Cloud Community!

Per the documentation, you can run the following command to describe the status of the multi-cluster Services feature:

gcloud container fleet multi-cluster-services describe

After running the command, you can evaluate the output based on the code status.
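As a debugging sketch (placeholder project ID and namespace; these are standard gcloud/kubectl invocations, not output from the original poster's environment):

```shell
# Describe the MCS feature state; run against the fleet host project.
gcloud container fleet multi-cluster-services describe \
  --project=FLEET_HOST_PROJECT_ID

# On the importing cluster (gke-1), check that a ServiceImport exists
# for the exported service and that endpoints were populated behind it.
kubectl get serviceimports -n NAMESPACE
kubectl get endpoints -n NAMESPACE
```

If the ServiceImport exists but has no endpoints, the problem is usually in the network path or firewall rules rather than in the MCS control plane.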

Lastly, kindly note that the available documentation lists limitations for MCS when used across multiple projects.

I hope this information is helpful.

If you need further assistance, you can always file a ticket with our support team.

Yes, I executed all the steps above, and the MCS service is up and running correctly. Somewhere along the way the traffic is failing over the network, but I can't see where. Can you please explain the request flow for MCS from one cluster to another, and how we can see the logs for the MCS service?
Note: both clusters are private clusters with a custom subnet and a secondary IP range.
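Since both clusters are private and traffic crosses VPC peering, it is also worth verifying the path directly. A sketch with placeholder names (project, VPC, service, and namespace are not from this thread):

```shell
# Check the firewall rules in the exporting cluster's project: the
# importing cluster's pod and node CIDRs must be allowed on the service port.
gcloud compute firewall-rules list --project=GKE_2_PROJECT_ID \
  --filter="network:VPC_NAME"

# Reproduce the failing call from a throwaway pod in gke-1 against the
# ClusterSet DNS name, with verbose output to see where it stalls.
kubectl run curl-test --rm -it --image=curlimages/curl --restart=Never -- \
  curl -v http://SERVICE_NAME.NAMESPACE.svc.clusterset.local
```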

We fixed the issue by deploying a new egress NAT policy.
