Training pipeline failed with error message: Internal error occurred. Please retry in a few minutes.

I'm getting this error while training a Vertex AI video object tracking model: "Training pipeline failed with error message: Internal error occurred. Please retry in a few minutes."
I have used the same dataset to train an AutoML model and it went fine without any errors. Dataset is correct.
But when I train on the same dataset with an AutoML Edge model using the generic settings, I get this error. I checked the "logging" page for error logs, but there are none. I'm trying to find a solution, but I don't even know where it goes wrong.

I tried using a different server; it didn't help.
I also tried another, much smaller dataset, and that one failed too.
I tried 4 times, and every run, even the one with the small dataset, takes roughly 8 hours and 50 minutes and then fails.

Any help is appreciated. 

1 REPLY

Hi @kursadoz,

Welcome to Google Cloud Community!

The error message "Training pipeline failed with error message: Internal error occurred. Please retry in a few minutes." indicates that there might be an issue on the server side or with the backend systems.

Given that you have already tried some workarounds, such as using a different server, training on another dataset, and checking the error logs, and none of them worked, you may also try these troubleshooting tips:

  • Refer to this documentation to troubleshoot Google Cloud console page loading issues.
  • Check Quotas and Limits - Ensure that you have not exceeded any quotas for Vertex AI, such as training time or resources. 
  • Inspect Data Set and Configuration - Check that your dataset is in the correct format and that it follows these steps on training an AutoML Edge model.
  • Examine Training Logs - Since you've mentioned checking the error logs, you can also check the other Vertex AI logs for additional messages or warnings that might provide more information about the error.
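
When the logging page shows nothing, it may also help to read the state and error recorded on the training pipeline resource itself. Below is a minimal Python sketch, assuming the `google-cloud-aiplatform` package is installed and credentials are configured. The helper names (`pipeline_resource_name`, `summarize_failure`, `fetch_training_pipeline`) and the project, region, and pipeline ID values are hypothetical placeholders, not something taken from your project.

```python
from types import SimpleNamespace


def pipeline_resource_name(project: str, location: str, pipeline_id: str) -> str:
    """Build the full resource name of a Vertex AI training pipeline."""
    return f"projects/{project}/locations/{location}/trainingPipelines/{pipeline_id}"


def summarize_failure(pipeline) -> str:
    """Summarize a pipeline's state and error in one line.

    Accepts any object exposing `state` and `error` (with `code` and `message`).
    """
    error = getattr(pipeline, "error", None)
    code = getattr(error, "code", 0)
    if not code:
        return f"state={pipeline.state}, no error recorded"
    return f"state={pipeline.state}, error code={error.code}, message={error.message!r}"


def fetch_training_pipeline(project: str, location: str, pipeline_id: str):
    """Fetch the pipeline from Vertex AI (needs credentials and network access)."""
    # Imported lazily so the helpers above also work without the library installed.
    from google.cloud import aiplatform_v1

    client = aiplatform_v1.PipelineServiceClient(
        client_options={"api_endpoint": f"{location}-aiplatform.googleapis.com"}
    )
    return client.get_training_pipeline(
        name=pipeline_resource_name(project, location, pipeline_id)
    )


if __name__ == "__main__":
    # Offline demonstration with a stand-in object; for a real run, replace it with
    # fetch_training_pipeline("my-project", "us-central1", "1234567890") (placeholder values).
    stand_in = SimpleNamespace(
        state="PIPELINE_STATE_FAILED",
        error=SimpleNamespace(code=13, message="Internal error occurred."),
    )
    print(summarize_failure(stand_in))
```

For a backend failure like this one, the resource may hold nothing more than the same generic message, but the pipeline's resource name and error code are useful details to include in a support case.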

If the error persists, I suggest contacting Google Cloud Support as they can provide more insights to see if the error you've encountered is a known issue or specific to your project.

I hope the above information is helpful.