Pod unable to pull large image

I am running a Pod that pulls a 12 GB image. The Pod keeps failing with an `ImagePullBackOff` error.

I SSHed into the GKE node where the Pod is scheduled. `journalctl` on the node keeps printing the message below:

Dec 20 22:12:27 gke-subsalt-cluster--pool-gpu-nodepoo-687078b8-lp6n containerd[2640]: time="2023-12-20T22:12:27.342460503Z" level=error msg="cancel pulling image gcr.io/xxxxx/subsalt-ray:2023-12-20_7c00fb9 because of no progress in 1m0s"

GKE cluster version: 1.27.4-gke.900. It is a Standard GKE cluster, and it is a private cluster.

Please let me know what other details might be helpful to debug this. 

The other Pods are running fine on the same node, including the Pods in the kube-system namespace.

 

 

 

1 REPLY

Hi @g_munish,

Welcome to the Google Cloud Community!

Currently, there are two open tickets in our Issue Tracker that describe the same issue as yours. I'll monitor them for updates to determine whether this is an internal issue.

For now, please try the troubleshooting steps in this documentation: ImagePullBackOff and ErrImagePull.
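While working through those steps, it may help to confirm how often the pull is being cancelled for lack of progress. Here is a minimal Python sketch, assuming the journal lines have the same shape as the one quoted in the question. The function name `find_stalled_pulls` and the regular expression are illustrative only and are not part of any GKE or containerd tooling; other containerd versions may word the message differently.

```python
import re

# Pattern modeled on the single log line quoted in the question.
# Assumption: the message wording is stable across the lines being scanned.
CANCEL_RE = re.compile(
    r"cancel pulling image (?P<image>\S+) "
    r"because of no progress in (?P<stall>[\w.]+)"
)


def find_stalled_pulls(lines):
    """Return (image, stall_duration) for each cancelled-pull message found."""
    hits = []
    for line in lines:
        match = CANCEL_RE.search(line)
        if match:
            hits.append((match.group("image"), match.group("stall")))
    return hits


# The log line from the question, used as sample input.
sample = (
    'Dec 20 22:12:27 gke-subsalt-cluster--pool-gpu-nodepoo-687078b8-lp6n '
    'containerd[2640]: time="2023-12-20T22:12:27.342460503Z" level=error '
    'msg="cancel pulling image gcr.io/xxxxx/subsalt-ray:2023-12-20_7c00fb9 '
    'because of no progress in 1m0s"'
)

for image, stall in find_stalled_pulls([sample]):
    print(f"{image}: pull cancelled after {stall} without progress")
```

The same function could be fed the saved output of `journalctl -u containerd` from the node, one line per list element. If every match shows the same image and the same stall duration, that suggests the pull is stalling partway through, which is a different symptom from an authentication or image-name failure.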

You may also try upgrading the node type where the Pod is deployed.

You can always contact Google Cloud Support to look further into your case. Thank you!
