I have created an HTTPS 'gce-internal' Ingress that routes traffic coming into my Anthos cluster. The Ingress was working as expected until today, when it suddenly started throwing the following error: Error syncing to GCP: error running backend syncing routine: googleapi: Error 404: The resource 'projects/xxxxxx/zones/europe-west2-a/networkEndpointGroups/xxxxxx' was not found. The strangest part is that the error comes up for an API that was already working fine under the same Ingress. I have tried different approaches to fix the issue, but no luck so far. The following is the manifest for the Ingress:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-name
  annotations:
    kubernetes.io/ingress.regional-static-ip-name: staticip
    kubernetes.io/ingress.class: "gce-internal"
    kubernetes.io/ingress.allow-http: "false"
    ingress.gcp.kubernetes.io/pre-shared-cert: certname
spec:
  rules:
  - host: hostname
    http:
      paths:
      - path: /pathofapi/*
        pathType: ImplementationSpecific
        backend:
          service:
            name: apiname-v1
            port:
              number: 8080
The following is the manifest for my Service:
apiVersion: v1
kind: Service
metadata:
  name: apiname-v1
  labels:
    app: apiname
    version: v1
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
    cloud.google.com/backend-config: '{"default": "backend-configuration"}'
spec:
  ports:
  - name: http
    port: 8080
  selector:
    app: apiname
    version: v1
Hi @iamnitprakash,
Based on the error you're getting, there seems to be a version mismatch. I have no access to your project, so it would help if you added logs to your question. The logs would also show what changed in your configuration shortly before the error "Error syncing to GCP: error running backend syncing routine:" first appeared.
For now, what I can suggest is to delete the existing ingress, and create a new one.
I started seeing this error yesterday on a regular GKE cluster after rolling out a new container image for a StatefulSet. This cluster has been running in production for about 2 years and it's the first time I've seen this error.
Here's what I've observed:
I tried to delete the Ingress as suggested, but that didn't help. I even tried deleting the Service, the StatefulSet and the Ingress, waited until all the Load Balancer resources were not visible anymore on GCP console, recreated the resources and still got the exact same error. Strangely it's only this Service that's having this error; all other services work perfectly.
What's even stranger is that I was finally able to work around the issue by creating a Service with a different name. The new Service is exactly the same as the old one except for the metadata.name field. If I try to create the Service with the old name, the problem happens again.
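For illustration, the rename workaround applied to the Service from the original question might look like the sketch below. The name apiname-v1-renamed is a hypothetical placeholder that does not come from this thread; every other field is copied unchanged from the question's manifest.

```yaml
# Sketch only: a copy of the Service from the question in which
# nothing but metadata.name differs from the original.
apiVersion: v1
kind: Service
metadata:
  name: apiname-v1-renamed  # hypothetical new name (assumption)
  labels:
    app: apiname
    version: v1
  annotations:
    cloud.google.com/neg: '{"ingress": true}'
    cloud.google.com/backend-config: '{"default": "backend-configuration"}'
spec:
  ports:
  - name: http
    port: 8080
  selector:
    app: apiname
    version: v1
```

Because an Ingress refers to its backend by Service name, the backend.service.name field in the Ingress would also need to point at the new name.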
I am also seeing issues with NEGs not being created for a service I have had for a long time.
We experienced a similar issue. It turns out we had introduced a typo in the network endpoint group annotation in our Service manifest. When we recreated the Service, it didn't recreate the NEG.
apiVersion: v1
kind: Service
metadata:
  name: some-service
  annotations:
    cloud.google.com/neg: '{"ingress":true}'
    networking.gke.io/load-balancer-type: "Internal" # Internal load balancer
Make sure you add the following annotation:
cloud.google.com/neg: '{"ingress":true}'