Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

Vertex AI Resource Exhaustion Error but resources are not even close to exhausted?...

I'm attempting to run a basic pipeline using Kubeflow in Vertex AI. However, when I run it I receive a RESOURCE_EXHAUSTED error in the logs relating to aiplatform.googleapis.com/custom_model_training_cpus.

Checking my quotas, I can see that I am not anywhere close to exhausting any of the quotas under  aiplatform.googleapis.com/custom_model_training_cpus (including the region I'm using - us-central1).

Screenshot 2023-05-04 at 7.17.52 pm.png

Has anyone had a similar issue and know what is going on here?

2 REPLIES 2

And here's the exact error: 

Screenshot 2023-05-04 at 7.25.51 pm.png

Resource exhausted usually refers to the amount of memory used while running the code, not exactly the region quotas. Did you define the machine type in Kubeflow training and deployment, like?

 
training_op = train_model(epochs,).set_cpu_limit('16').set_memory_limit('32G').set_caching_options(False)
 
deploy_op = deploy_model(training_op.outputs["xx"] ,"project","uscentral1").set_cpu_limit('8').set_memory_limit('16G').set_caching_options(False)