
GKE Workload DNS Timeouts

We run a series of microservice workloads on a couple of GKE clusters (1.25.10-gke.2700). They have been stable and largely untouched for the last couple of weeks. On Monday night we started getting alerts that our microservices were down, and on investigation all of our workloads show the same or similar errors. When we try to download objects from Cloud Storage, we receive a "Resource Temporarily Unavailable" error, which, from what I've read, can be traced back to a DNS lookup timeout.

In addition to the Cloud Storage errors, we get errors when downloading files hosted by our clients and when connecting to our DB (using the URL). As a test, we switched a few of the connection strings to point directly at the IP address, and the errors stopped wherever we made that switch. This strongly suggests the problem lies in DNS resolution.
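To put numbers on the hostname-versus-IP comparison, a small lookup timer can be run from inside an affected pod. The following is only a minimal sketch using the Python standard library; the function names `time_lookup` and `probe`, the attempt count, and the one-second "slow" threshold are illustrative assumptions, not anything taken from our workloads.

```python
import socket
import time


def time_lookup(hostname, port=443):
    """Resolve hostname once; return (elapsed_seconds, error_or_None)."""
    start = time.monotonic()
    try:
        socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
        error = None
    except socket.gaierror as exc:
        # Resolver failures (including timeouts) surface as gaierror.
        error = exc
    return time.monotonic() - start, error


def probe(hostname, attempts=20, slow_threshold=1.0):
    """Repeat the lookup and summarise failures and slow resolutions.

    slow_threshold is an arbitrary cut-off in seconds (an assumption).
    """
    durations = []
    failures = []
    for _ in range(attempts):
        elapsed, error = time_lookup(hostname)
        durations.append(elapsed)
        if error is not None:
            failures.append(str(error))
    return {
        "hostname": hostname,
        "attempts": attempts,
        "failures": len(failures),
        "slow": sum(1 for d in durations if d >= slow_threshold),
        "max_seconds": max(durations) if durations else 0.0,
        "errors": sorted(set(failures)),
    }
```

For example, `print(probe("storage.googleapis.com"))` run in a failing pod and again on a machine outside the cluster would show whether lookups fail or stall only inside the cluster. A non-zero `failures` or `slow` count in the pod alone would be consistent with the DNS theory, though it would not by itself say which resolver component is at fault.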

Having determined that the errors were DNS-related, we tried enabling NodeLocal DNSCache on our clusters and switching from kube-dns to Cloud DNS; neither resolved the issue. Beyond these steps, we have not found much else to try.

Former Community Member

I am having a similar issue. Honestly, this is why I use Azure; Microsoft have their support down to a T.
