
"503 Server Error: Service Unavailable exception" on service hosted on Vertex AI

 

I've registered an LLM in the Vertex AI Model Registry using a custom Docker container. The model is hosted correctly, and prediction requests generally succeed. Occasionally, however, I encounter a '503 Service Unavailable' error.

This issue becomes more frequent when I run concurrent requests (around 5 to 6) against the model. I've verified that hardware utilization, including GPU and CPU, stays well below 50%. Despite raising a support ticket a month ago, I haven't received a satisfactory resolution.
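For context, here is roughly how I send the concurrent requests (a simplified sketch; `predict_fn` is a placeholder for my actual prediction call to the endpoint):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_concurrent(predict_fn, payloads, max_workers=6):
    """Send payloads to predict_fn with at most max_workers requests in flight."""
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(predict_fn, p) for p in payloads]
        for future in as_completed(futures):
            # A 503 surfaces here as an exception raised by predict_fn.
            results.append(future.result())
    return results
```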

Is the Vertex AI service stable?

Solved
1 ACCEPTED SOLUTION

Hi @Yash2384,

Thank you for joining our community.

I understand it must be frustrating to deal with these "503 Service Unavailable" errors on your Vertex AI model. A 503 typically indicates that the service is momentarily overloaded or unavailable. These errors usually resolve on their own, but it's understandable that you'd want a more permanent solution.
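In the meantime, since these 503s are typically transient, a common client-side mitigation is to retry failed predictions with exponential backoff. The helper below is only a generic sketch, not a Vertex AI-specific API; `predict_fn` stands in for your actual prediction call:

```python
import random
import time


def retry_with_backoff(predict_fn, max_attempts=5, base_delay=1.0,
                       max_delay=30.0, jitter=0.5):
    """Call predict_fn, retrying when it raises (e.g. on a transient 503).

    Delays grow exponentially (base_delay, 2x, 4x, ...) up to max_delay,
    plus a small random jitter so concurrent clients don't retry in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return predict_fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of retries; surface the error to the caller
            delay = min(base_delay * (2 ** attempt), max_delay)
            time.sleep(delay + random.uniform(0, jitter))
```

Note that some client libraries, including `google-cloud-aiplatform`, also expose built-in retry settings, which may be preferable to rolling your own.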

I checked the service health dashboard and open issues but couldn't find anything related to this error. While Vertex AI is generally a stable service, errors can still occur due to temporary hiccups within a specific deployment like yours.

Even though your overall hardware usage looks fine, limits on specific resources could still be affecting your model. For instance, there might be insufficient capacity available in the region you're using. The Vertex AI documentation lists the available locations and their resource configurations.

I hope I was able to provide you with useful insights.

 

 


2 REPLIES


What do you mean by a "specific deployment"? Should I import the model directly into Vertex AI instead of wrapping it in a container?

Will that resolve the issue?