Hi GKE Team,
A few Pods go into CrashLoopBackOff when the cluster is provisioned with control-plane version 1.27, node-pool version 1.23, and the Workload Identity security feature enabled.
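For reference, this is roughly how I found the crashing pods and pulled the logs below (the namespace and the label k8s-app=gke-metadata-server are assumptions; adjust to what your cluster actually shows):

# List pods stuck in CrashLoopBackOff across all namespaces
kubectl get pods -A | grep CrashLoopBackOff

# Pull logs from the gke-metadata-server pods in kube-system
kubectl logs -n kube-system -l k8s-app=gke-metadata-server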
Logs from the gke-metadata-server pod:
I0905 07:26:05.080248 1 prodlayer.go:217] layer successfully set to NO_LAYER with source DEFAULT
I0905 07:26:05.081345 1 server.go:205] Got error when getting machine owner: failed to get proto for GMI path: open /etc/googlemachineidentity/live/machine_identity.pb: no such file or directory
I0905 07:26:05.081432 1 main.go:89] Build CL: 554917814
I0905 07:26:05.081463 1 main.go:90] Build baseline CL: 554917814
I0905 07:26:05.081492 1 main.go:91] Build label: gke_metadata_server_20230808.00_p0
I0905 07:26:05.081515 1 main.go:92] Command line arguments:
I0905 07:26:05.081537 1 main.go:94] argv[0]: "/gke-metadata-server"
I0905 07:26:05.081562 1 main.go:94] argv[1]: "--logtostderr"
I0905 07:26:05.081589 1 main.go:94] argv[2]: "--token-exchange-endpoint=https://securetoken.googleapis.com/v1/identitybindingtoken"
I0905 07:26:05.081624 1 main.go:94] argv[3]: "--workload-pool=dev-382813.svc.id.goog"
I0905 07:26:05.081650 1 main.go:94] argv[4]: "--alts-service-suffixes-using-node-identity=storage.googleapis.com,bigtable.googleapis.com,bigtable2.googleapis.com,bigtablerls.googleapis.com,spanner.googleapis.com,spanner2.googleapis.com,spanner-rls.googleapis.com,grpclb.directpath.google.internal,grpclb-dualstack.directpath.google.internal,staging-wrenchworks.sandbox.googleapis.com,preprod-spanner.sandbox.googleapis.com,wrenchworks-loadtest.googleapis.com,wrenchworks-nonprod.googleapis.com"
I0905 07:26:05.081687 1 main.go:94] argv[5]: "--identity-provider=https://container.googleapis.com/v1/projects/dev-382813/locations/us-central1-c/clusters/cluster-3"
I0905 07:26:05.081731 1 main.go:94] argv[6]: "--passthrough-ksa-list=anthos-identity-service:gke-oidc-envoy-sa,kube-system:container-watcher-pod-reader,kube-system:coredns,kube-system:egress-nat-controller,kube-system:event-exporter-sa,kube-system:fluentd-gcp-scaler,kube-system:gke-spiffe-node-agent,kube-system:heapster,kube-system:konnectivity-agent,kube-system:kube-dns,kube-system:maintenance-handler,kube-system:metadata-agent,kube-system:network-metering-agent,kube-system:node-local-dns,kube-system:pkgextract-service,kube-system:pkgextract-cleanup-service,kube-system:securityprofile-controller,istio-system:istio-ingressgateway-service-account,istio-system:cluster-local-gateway-service-account,csm:csm-sync-agent,knative-serving:controller,kube-system:pdcsi-node-sa,gmp-system:collector,gke-gmp-system:collector,gke-managed-cim:kube-state-metrics"
I0905 07:26:05.081779 1 main.go:94] argv[7]: "--attributes=cluster-name=cluster-3,cluster-uid=bfb333c4316d44419b65f8febb2459ad475f76fe22d044e5a88cf34d96a94ae4,cluster-location=us-central1-c"
I0905 07:26:05.081813 1 main.go:94] argv[8]: "--cluster-uid=bfb333c4316d44419b65f8febb2459ad475f76fe22d044e5a88cf34d96a94ae4"
I0905 07:26:05.081848 1 main.go:94] argv[9]: "--sts-endpoint=https://sts.googleapis.com"
I0905 07:26:05.081874 1 main.go:94] argv[10]: "--token-exchange-mode=sts"
I0905 07:26:05.081899 1 main.go:94] argv[11]: "--component-version=0.4.276"
I0905 07:26:05.081943 1 main.go:102] Creating CRI connection
I0905 07:26:05.082003 1 main_linux.go:45] "unix:///var/run/gke-sandboxd/gke-sandboxd.sock" doesn't exist
I0905 07:26:05.082063 1 main_linux.go:45] "unix:///var/run/dockershim.sock" doesn't exist
I0905 07:26:05.082125 1 main_linux.go:62] Use CRI URL: unix:///var/run/containerd/containerd.sock
I0905 07:26:05.084243 1 cri.go:40] Finding the CRI API runtime version
I0905 07:26:05.086711 1 cri.go:46] Falling back to CRI v1alpha2 runtime API (deprecated)
I0905 07:26:05.086760 1 main.go:108] Fetching static GCE metadata
I0905 07:26:05.097979 1 main.go:155] Creating Clientset
F0905 07:26:05.099095 1 main.go:158] Failed to create kubernetes clientset: exec plugin: invalid apiVersion "client.authentication.k8s.io/v1alpha1"
Hello @sharan_rafay
Here's some third-party documentation for your reference that discusses what may cause the error and possible solutions that can fix it.
Kubernetes itself only supports node pools up to two minor versions (N-2) behind the control plane, and that assumes you are not using any APIs which have been deprecated. There have been a number of deprecations from 1.23 to 1.27, and a 1.23 node pool under a 1.27 control plane is four minor versions behind, well outside the supported skew.
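That skew also matches the fatal line in your logs: the client.authentication.k8s.io/v1alpha1 exec credential API was removed in Kubernetes 1.24, so the newer gke-metadata-server build presumably rejects the kubeconfig the 1.23 node still carries. The usual fix is to bring the node pool back within the supported window. A rough sketch (cluster name and zone taken from your logs; the node-pool name default-pool is a guess):

# Confirm the skew: the VERSION column shows each node's kubelet version
kubectl get nodes

# Upgrade the node pool to the control-plane version
gcloud container clusters upgrade cluster-3 \
  --node-pool=default-pool \
  --zone=us-central1-c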
Any reason why GKE still allows provisioning a cluster with that much skew?
Are you using a release channel, or are you on static (no channel)?
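You can check with something like this (cluster name and zone from the logs above):

gcloud container clusters describe cluster-3 \
  --zone=us-central1-c \
  --format="value(releaseChannel.channel)"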
@sharan_rafay Could you share what the solution was?
@barakotai I didn't find any solution to this.