I have not been having any luck with my case via GCP Support, so I figured I would ask the wider community to see if anyone can put me on the right path.
No matter what I try, I cannot pull private Artifact Registry images via Workload Identity.
Here are the relevant configs and logs. My service account (artifact-registry-sa) is overly permissive for the sake of troubleshooting; it currently has:
* Added by support recommendation
I have also tried using the service account the nodes were created with, which again has the overly permissive role of roles/editor. Support recommended I proceed with this documentation. I pointed out that it has me manually SSH-ing into nodes, which is not possible with autoscaling nodes, and that it does not use Workload Identity at all, which is the entire point of the case. If I just wanted this to work, I would use a static JSON service account key and be done with it, but that is not acceptable from a security standpoint for my organization.
I am chasing my tail here and am really at a loss. According to the documentation this should "just work". I have other services such as cert-manager and external-dns successfully using Workload Identity.
The SA is annotated and mapped correctly (as far as I can tell) to the GCP SA. The deployment and the SA are in the same namespace so that should not be an issue.
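For anyone who wants to check the same mapping, here is a minimal sketch of the lookups involved. It assumes kubectl and gcloud are already pointed at the right cluster and project; the function names are made up for illustration, and every namespace, project ID, and service account email is a placeholder supplied by the caller.

```shell
# Sketch only: the helper names are hypothetical and all arguments are placeholders.

# Build the IAM member string Workload Identity uses for a Kubernetes SA.
wi_member() {
  local project_id="$1" namespace="$2" ksa="$3"
  printf 'serviceAccount:%s.svc.id.goog[%s/%s]\n' "$project_id" "$namespace" "$ksa"
}

# Print the GCP SA a Kubernetes SA is annotated with (empty if not annotated).
ksa_annotation() {
  local namespace="$1" ksa="$2"
  kubectl get serviceaccount "$ksa" --namespace "$namespace" \
    -o jsonpath='{.metadata.annotations.iam\.gke\.io/gcp-service-account}'
}

# List the members that hold roles/iam.workloadIdentityUser on the GCP SA.
gsa_wi_members() {
  local gsa_email="$1"
  gcloud iam service-accounts get-iam-policy "$gsa_email" \
    --flatten="bindings[].members" \
    --filter="bindings.role:roles/iam.workloadIdentityUser" \
    --format="value(bindings.members)"
}
```

The mapping looks consistent when the output of ksa_annotation is the GCP SA email and the output of gsa_wi_members contains the string printed by wi_member for the same namespace and Kubernetes SA.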
Does anyone have anything to check or try? Anyone else experienced/ing anything similar?
Hi @patrickblackjr,
Welcome to the Google Cloud Community!
The main issue is likely that the 'default' service account does not have the required reader role (even though the documentation suggests it does).
Here are the commands you can use to troubleshoot the issue.
First, find the service account the cluster nodes run as:
SERVICE_ACCOUNT="$(gcloud container clusters describe "$CLUSTER_NAME" --region="$REGION" --format=json | jq -r .nodeConfig.serviceAccount)"
echo "$SERVICE_ACCOUNT"
If this prints
default
it means the nodes use the Compute Engine default service account, which is:
PROJECT_NUM=$(gcloud projects describe "$PROJECT_ID" --format='get(projectNumber)')
SERVICE_ACCOUNT="${PROJECT_NUM}-compute@developer.gserviceaccount.com"
Next, check whether that account has roles/artifactregistry.reader:
gcloud projects get-iam-policy "${PROJECT_ID}" --flatten="bindings[].members" \
--format='table(bindings.role)' --filter="bindings.members:${SERVICE_ACCOUNT}"
If the role is missing, grant it:
gcloud projects add-iam-policy-binding "${PROJECT_ID}" --role="roles/artifactregistry.reader" --member="serviceAccount:${SERVICE_ACCOUNT}"
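Since broader predefined roles such as roles/artifactregistry.writer already include the reader permissions, the check above can be loosened to accept any of them. The following is a rough sketch, not an official check: has_pull_role is a hypothetical helper, and it only looks at project-level bindings, so roles granted directly on a repository would not show up.

```shell
# Sketch only: has_pull_role is a hypothetical helper name.
# Reads role names from stdin (one per line) and succeeds if any of them is a
# predefined Artifact Registry role that includes read access.
has_pull_role() {
  grep -Eqx 'roles/artifactregistry\.(reader|writer|repoAdmin|admin)'
}

# Example use, assuming PROJECT_ID and SERVICE_ACCOUNT are set as above:
# gcloud projects get-iam-policy "${PROJECT_ID}" --flatten="bindings[].members" \
#   --format='value(bindings.role)' --filter="bindings.members:${SERVICE_ACCOUNT}" \
#   | has_pull_role && echo "pull access present" || echo "no pull role found"
```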
I hope this helps. Thank you.
The service account has artifactregistry.writer which includes all the permissions from artifactregistry.reader. I did verify this.
The solution ended up being that you have to specify the use of the default service account:
...
spec:
  serviceAccountName: default
In my testing I was leaving it off and assuming it would use the default.
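For reference, here is a minimal sketch of setting and checking that field from the command line instead of editing the manifest. It assumes the workload is a Deployment, so the field lives under the pod template at spec.template.spec; the function names are hypothetical and the namespace and deployment names are placeholders.

```shell
# Sketch only: helper names are hypothetical; namespace and deployment are placeholders.

# Build a patch that sets the pod template's service account explicitly.
sa_patch() {
  local ksa="$1"
  printf '{"spec":{"template":{"spec":{"serviceAccountName":"%s"}}}}\n' "$ksa"
}

# Apply the patch to a Deployment.
set_deployment_sa() {
  local namespace="$1" deployment="$2" ksa="$3"
  kubectl patch deployment "$deployment" --namespace "$namespace" \
    --patch "$(sa_patch "$ksa")"
}

# Print the service account the Deployment's pod template is configured with.
show_deployment_sa() {
  local namespace="$1" deployment="$2"
  kubectl get deployment "$deployment" --namespace "$namespace" \
    -o jsonpath='{.spec.template.spec.serviceAccountName}'
}
```

Running show_deployment_sa before and after the change is a quick way to confirm whether the field was actually set on the Deployment.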
Nice and simple, glad you figured it out! Did it also work with your artifact-registry-sa k8s serviceaccount?
If you're still experiencing this, can you double-check a couple of things?