
ImagePullBackOff using custom image

Hi,

I've created a GKE cluster using the Terraform module. I've given the service account the required permissions and verified the Artifact Registry image URL, but my deployment keeps failing with an ImagePullBackOff error.
A public GCP image works without issue.
What am I missing?
Thanks


Which permission(s) did you grant to the node service account? And is Artifact Registry in the same project or in a different project?

Hi,

It's all in the same project, and the SA has the Artifact Registry Reader and Storage Viewer roles.
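For reference, a minimal sketch of granting the node service account pull access to Artifact Registry; PROJECT_ID and NODE_SA below are placeholders, not values from this thread:

```shell
# Sketch (hypothetical names): grant the node service account the
# Artifact Registry Reader role so GKE nodes can pull images.
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member "serviceAccount:NODE_SA@PROJECT_ID.iam.gserviceaccount.com" \
  --role roles/artifactregistry.reader
```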

Could you post some details of your deployment from the Kubernetes CLI? (kubectl describe deployment <deployment_name>)

Thanks 🙂

Hi,

I kept working on the issue and created a new node pool for ARM64 workloads, since the original image was built for the ARM architecture.
Here is the deployment description (environment variables removed):

Name:                   sdk-virtual-pet
Namespace:              sdk-apps
CreationTimestamp:      Sun, 03 Sep 2023 18:18:39 +0000
Labels:                 <none>
Annotations:            deployment.kubernetes.io/revision: 3
Selector:               app=sdk-virtual-pet-app
Replicas:               1 desired | 1 updated | 1 total | 0 available | 1 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:  app=sdk-virtual-pet-app
  Containers:
   sdk-virtual-pet:
    Image:        us-west1-docker.pkg.dev/topia-gcp/sdk-apps/sdk-virtual-pet
    Port:         3000/TCP
    Host Port:    0/TCP
    Environment:
    Mounts:       <none>
  Volumes:        <none>
Conditions:
  Type         Status  Reason
  ----         ------  ------
  Progressing  True    NewReplicaSetAvailable
  Available    False   MinimumReplicasUnavailable
OldReplicaSets:  sdk-virtual-pet-68784fdddc (0/0 replicas created), sdk-virtual-pet-56775fb76 (0/0 replicas created)
NewReplicaSet:   sdk-virtual-pet-6b84595958 (1/1 replicas created)
Events:
  Type    Reason             Age                From                   Message
  ----    ------             ---                ----                   -------
  Normal  ScalingReplicaSet  56m (x2 over 9h)   deployment-controller  Scaled up replica set sdk-virtual-pet-56775fb76 to 1 from 0
  Normal  ScalingReplicaSet  23m (x2 over 57m)  deployment-controller  Scaled down replica set sdk-virtual-pet-56775fb76 to 0 from 1
  Normal  ScalingReplicaSet  22m                deployment-controller  Scaled up replica set sdk-virtual-pet-6b84595958 to 1 from 0

I solved the ImagePullBackOff by linking the service account through Workload Identity.

https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity
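A minimal sketch of the Workload Identity binding the link above describes; PROJECT_ID, GSA, and the `sdk-apps/default` namespace/KSA pair are placeholder assumptions, not values confirmed in this thread:

```shell
# Sketch (hypothetical names): bind a Kubernetes service account (KSA)
# to a Google service account (GSA) via Workload Identity.
PROJECT_ID=my-project
GSA=my-gsa@${PROJECT_ID}.iam.gserviceaccount.com

# Allow the KSA to impersonate the GSA.
gcloud iam service-accounts add-iam-policy-binding "$GSA" \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:${PROJECT_ID}.svc.id.goog[sdk-apps/default]"

# Annotate the KSA so pods using it authenticate as the GSA.
kubectl annotate serviceaccount default \
  --namespace sdk-apps \
  iam.gke.io/gcp-service-account="$GSA"
```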

I kept trying to get the GKE cluster to deploy an ARM image, but it always fails.
I have an ARM64 node pool with this taint:

Taints: kubernetes.io/arch=arm64:NoSchedule

I added a nodeSelector to target those nodes:

nodeSelector:
  kubernetes.io/arch: arm64

and I still get this error:

Cannot schedule pods: Preemption is not helpful for scheduling.

reason: {
  messageId: "no.scale.up.mig.failing.predicate"
  parameters: [
    "TaintToleration",
    "node(s) had untolerated taint {kubernetes.io/arch: arm64}"
  ]
}
If I remove the nodeSelector, I get an error that the node has taints; if I add it, I get the untolerated-taint error above.
This has taken me too many days and I can't get rid of the errors. Can someone help me understand what's wrong?
Here are the definitions:

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "7"
  creationTimestamp: "2023-09-03T18:18:39Z"
  generation: 22
  managedFields:
  - apiVersion: apps/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        f:progressDeadlineSeconds: {}
        f:revisionHistoryLimit: {}
        f:selector: {}
        f:strategy:
          f:rollingUpdate:
            .: {}
            f:maxSurge: {}
            f:maxUnavailable: {}
          f:type: {}
        f:template:
          f:metadata:
            f:labels:
              .: {}
              f:app: {}
          f:spec:
            f:automountServiceAccountToken: {}
            f:containers:
              k:{"name":"sdk-virtual-pet"}:
                .: {}
                f:env:
                  .: {}
                  k:{"name":"API_KEY"}:
                    .: {}
                    f:name: {}
                    f:valueFrom:
                      .: {}
                      f:secretKeyRef: {}
                  k:{"name":"BROWSER"}:
                    .: {}
                    f:name: {}
                    f:valueFrom:
                      .: {}
                      f:secretKeyRef: {}
                  k:{"name":"IMG_ASSET_ID"}:
                    .: {}
                    f:name: {}
                    f:valueFrom:
                      .: {}
                      f:secretKeyRef: {}
                  k:{"name":"INSTANCE_DOMAIN"}:
                    .: {}
                    f:name: {}
                    f:valueFrom:
                      .: {}
                      f:secretKeyRef: {}
                  k:{"name":"INSTANCE_PROTOCOL"}:
                    .: {}
                    f:name: {}
                    f:valueFrom:
                      .: {}
                      f:secretKeyRef: {}
                  k:{"name":"INTERACTIVE_KEY"}:
                    .: {}
                    f:name: {}
                    f:valueFrom:
                      .: {}
                      f:secretKeyRef: {}
                  k:{"name":"INTERACTIVE_SECRET"}:
                    .: {}
                    f:name: {}
                    f:valueFrom:
                      .: {}
                      f:secretKeyRef: {}
                  k:{"name":"NODE_ENV"}:
                    .: {}
                    f:name: {}
                    f:valueFrom:
                      .: {}
                      f:secretKeyRef: {}
                  k:{"name":"PORT"}:
                    .: {}
                    f:name: {}
                    f:valueFrom:
                      .: {}
                      f:secretKeyRef: {}
                  k:{"name":"REACT_APP_API_URL"}:
                    .: {}
                    f:name: {}
                    f:valueFrom:
                      .: {}
                      f:secretKeyRef: {}
                f:image: {}
                f:imagePullPolicy: {}
                f:name: {}
                f:ports:
                  .: {}
                  k:{"containerPort":3000,"protocol":"TCP"}:
                    .: {}
                    f:containerPort: {}
                    f:protocol: {}
                f:resources: {}
                f:terminationMessagePath: {}
                f:terminationMessagePolicy: {}
            f:dnsPolicy: {}
            f:enableServiceLinks: {}
            f:imagePullSecrets:
              .: {}
              k:{"name":"service-account-secret"}: {}
            f:restartPolicy: {}
            f:schedulerName: {}
            f:securityContext: {}
            f:shareProcessNamespace: {}
            f:terminationGracePeriodSeconds: {}
    manager: HashiCorp
    operation: Update
    time: "2023-09-06T03:03:02Z"
  - apiVersion: apps/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        f:replicas: {}
        f:template:
          f:spec:
            f:nodeSelector: {}
    manager: GoogleCloudConsole
    operation: Update
    time: "2023-09-18T06:15:03Z"
  - apiVersion: apps/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:deployment.kubernetes.io/revision: {}
      f:status:
        f:conditions:
          .: {}
          k:{"type":"Available"}:
            .: {}
            f:lastTransitionTime: {}
            f:lastUpdateTime: {}
            f:message: {}
            f:reason: {}
            f:status: {}
            f:type: {}
          k:{"type":"Progressing"}:
            .: {}
            f:lastTransitionTime: {}
            f:lastUpdateTime: {}
            f:message: {}
            f:reason: {}
            f:status: {}
            f:type: {}
        f:observedGeneration: {}
        f:replicas: {}
        f:unavailableReplicas: {}
        f:updatedReplicas: {}
    manager: kube-controller-manager
    operation: Update
    subresource: status
    time: "2023-09-18T06:28:19Z"
  name: sdk-virtual-pet
  namespace: sdk-apps
  resourceVersion: "11816664"
  uid: 981a888d-cbe1-42da-8af9-84961b4aa43b
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: sdk-virtual-pet-app
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: sdk-virtual-pet-app
    spec:
      automountServiceAccountToken: true
      containers:
      - env:
        - name: API_KEY
          valueFrom:
            secretKeyRef:
              key: API_KEY
              name: sdk-virtual-pet
              optional: false
        - name: BROWSER
          valueFrom:
            secretKeyRef:
              key: BROWSER
              name: sdk-virtual-pet
              optional: false
        - name: IMG_ASSET_ID
          valueFrom:
            secretKeyRef:
              key: IMG_ASSET_ID
              name: sdk-virtual-pet
              optional: false
        - name: INSTANCE_DOMAIN
          valueFrom:
            secretKeyRef:
              key: INSTANCE_DOMAIN
              name: sdk-virtual-pet
              optional: false
        - name: INSTANCE_PROTOCOL
          valueFrom:
            secretKeyRef:
              key: INSTANCE_PROTOCOL
              name: sdk-virtual-pet
              optional: false
        - name: INTERACTIVE_KEY
          valueFrom:
            secretKeyRef:
              key: INTERACTIVE_KEY
              name: sdk-virtual-pet
              optional: false
        - name: INTERACTIVE_SECRET
          valueFrom:
            secretKeyRef:
              key: INTERACTIVE_SECRET
              name: sdk-virtual-pet
              optional: false
        - name: NODE_ENV
          valueFrom:
            secretKeyRef:
              key: NODE_ENV
              name: sdk-virtual-pet
              optional: false
        - name: PORT
          valueFrom:
            secretKeyRef:
              key: PORT
              name: sdk-virtual-pet
              optional: false
        - name: REACT_APP_API_URL
          valueFrom:
            secretKeyRef:
              key: REACT_APP_API_URL
              name: sdk-virtual-pet
              optional: false
        image: us-west1-docker.pkg.dev/topia-gcp/sdk-apps/sdk-virtual-pet
        imagePullPolicy: Always
        name: sdk-virtual-pet
        ports:
        - containerPort: 3000
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      enableServiceLinks: true
      imagePullSecrets:
      - name: service-account-secret
      nodeSelector:
        kubernetes.io/arch: arm64
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      shareProcessNamespace: false
      terminationGracePeriodSeconds: 30
status:
  conditions:
  - lastTransitionTime: "2023-09-18T05:46:39Z"
    lastUpdateTime: "2023-09-18T05:46:39Z"
    message: Deployment does not have minimum availability.
    reason: MinimumReplicasUnavailable
    status: "False"
    type: Available
  - lastTransitionTime: "2023-09-18T06:28:19Z"
    lastUpdateTime: "2023-09-18T06:28:19Z"
    message: ReplicaSet "sdk-virtual-pet-bd9f6567c" is progressing.
    reason: ReplicaSetUpdated
    status: "True"
    type: Progressing
  observedGeneration: 22
  replicas: 2
  unavailableReplicas: 2
  updatedReplicas: 1

 
kubectl get nodes --show-labels
NAME STATUS ROLES AGE VERSION LABELS
gke-topia-gke-arm-node-pool-044e9d2e-lwdk Ready <none> 12d v1.27.3-gke.100 arch=arm,beta.kubernetes.io/arch=arm64,beta.kubernetes.io/instance-type=t2a-standard-2,beta.kubernetes.io/os=linux,cloud.google.com/gke-boot-disk=pd-standard,cloud.google.com/gke-container-runtime=containerd,cloud.google.com/gke-cpu-scaling-level=2,cloud.google.com/gke-logging-variant=DEFAULT,cloud.google.com/gke-max-pods-per-node=110,cloud.google.com/gke-netd-ready=true,cloud.google.com/gke-nodepool=arm-node-pool,cloud.google.com/gke-os-distribution=cos,cloud.google.com/gke-provisioning=standard,cloud.google.com/gke-stack-type=IPV4,cloud.google.com/machine-family=t2a,cloud.google.com/private-node=false,cluster_name=topia-gke,default-node-pool=true,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-b,iam.gke.io/gke-metadata-server-enabled=true,kubernetes.io/arch=arm64,kubernetes.io/hostname=gke-topia-gke-arm-node-pool-044e9d2e-lwdk,kubernetes.io/os=linux,node.kubernetes.io/instance-type=t2a-standard-2,node_pool=arm-node-pool,topology.gke.io/zone=us-central1-b,topology.kubernetes.io/region=us-central1,topology.kubernetes.io/zone=us-central1-b
gke-topia-gke-arm-node-pool-b1c1fb22-prrs Ready <none> 12d v1.27.3-gke.100 arch=arm,beta.kubernetes.io/arch=arm64,beta.kubernetes.io/instance-type=t2a-standard-2,beta.kubernetes.io/os=linux,cloud.google.com/gke-boot-disk=pd-standard,cloud.google.com/gke-container-runtime=containerd,cloud.google.com/gke-cpu-scaling-level=2,cloud.google.com/gke-logging-variant=DEFAULT,cloud.google.com/gke-max-pods-per-node=110,cloud.google.com/gke-netd-ready=true,cloud.google.com/gke-nodepool=arm-node-pool,cloud.google.com/gke-os-distribution=cos,cloud.google.com/gke-provisioning=standard,cloud.google.com/gke-stack-type=IPV4,cloud.google.com/machine-family=t2a,cloud.google.com/private-node=false,cluster_name=topia-gke,default-node-pool=true,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,iam.gke.io/gke-metadata-server-enabled=true,kubernetes.io/arch=arm64,kubernetes.io/hostname=gke-topia-gke-arm-node-pool-b1c1fb22-prrs,kubernetes.io/os=linux,node.kubernetes.io/instance-type=t2a-standard-2,node_pool=arm-node-pool,topology.gke.io/zone=us-central1-a,topology.kubernetes.io/region=us-central1,topology.kubernetes.io/zone=us-central1-a
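The untolerated-taint error above usually means the pod spec needs a toleration in addition to the nodeSelector: a nodeSelector only tells the scheduler which nodes the pod wants, while a NoSchedule taint still repels any pod that does not tolerate it. A minimal sketch of the missing piece, using the deployment name and namespace from the output above:

```shell
# Sketch: add a toleration for the arm64 taint alongside the existing
# nodeSelector. kubectl patch accepts a YAML patch via -p.
kubectl -n sdk-apps patch deployment sdk-virtual-pet --type merge -p '
spec:
  template:
    spec:
      tolerations:
      - key: kubernetes.io/arch
        operator: Equal
        value: arm64
        effect: NoSchedule
'
```

With both the nodeSelector and the toleration in place, the pod should be schedulable onto the tainted ARM64 nodes.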
