I've integrated an LLM into the Model Registry using a custom Docker container. The model is hosted correctly, and prediction requests generally execute successfully. However, I occasionally encounter a "503 Service Unavailable" error.
This issue becomes more frequent when I run concurrent requests (around 5 to 6) against the model. I've verified that hardware usage, including GPU and CPU, remains well below 50%. Despite raising a support ticket a month ago, there hasn't been a satisfactory resolution.
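For context, here's a simplified sketch of the kind of concurrency test I'm running (the project, endpoint ID, and payload below are placeholders, not my actual values):

```python
from concurrent.futures import ThreadPoolExecutor

from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # placeholders
endpoint = aiplatform.Endpoint("1234567890")  # placeholder endpoint ID

def send_request(i: int):
    # Each call is an online prediction against the deployed container.
    return endpoint.predict(instances=[{"prompt": f"test prompt {i}"}])

# Roughly 5-6 requests in flight at once, as described above.
with ThreadPoolExecutor(max_workers=6) as pool:
    results = list(pool.map(send_request, range(6)))
```

Most of these calls succeed; the 503s show up intermittently under this load.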
Is the Vertex AI service stable?
Hi @Yash2384,
Thank you for joining our community.
I understand how frustrating it must be to deal with these "503 Service Unavailable" errors on your Vertex AI model. A 503 typically indicates that the service is momentarily overloaded or unavailable. These errors usually resolve on their own, but it's understandable that you'd want a more permanent solution.
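As a mitigation, it's common to wrap prediction calls in a retry with exponential backoff so transient 503s are absorbed client-side. Here's a minimal sketch, assuming you're calling the endpoint through the Python google-cloud-aiplatform client (the helper name and retry limits are just illustrative):

```python
import random
import time

from google.api_core import exceptions
from google.cloud import aiplatform

def predict_with_retry(endpoint: aiplatform.Endpoint, instances, max_attempts: int = 5):
    """Retry transient 503s with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return endpoint.predict(instances=instances)
        except exceptions.ServiceUnavailable:
            # Give up once the attempts are exhausted.
            if attempt == max_attempts - 1:
                raise
            # Sleep 1s, 2s, 4s, ... plus jitter before retrying.
            time.sleep(2 ** attempt + random.random())
```

This won't fix the underlying cause, but it usually keeps intermittent 503s from surfacing as request failures.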
I checked the service health and open issues, but couldn't find anything related to this specific error. While Vertex AI is a generally stable service, errors can sometimes occur due to temporary hiccups within a specific deployment, like yours.
Even though your overall hardware usage looks fine, there could be limitations on specific resources affecting your model. For instance, there might be insufficient capacity available in the region you're using. You can check the Vertex AI documentation for details on available locations and their resource configurations.
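It's also worth checking how many replicas your endpoint is deployed with: with a single replica, 5 to 6 concurrent requests can queue up on one container even while GPU and CPU utilization stay low. Purely as an illustration (the machine type, accelerator, and replica counts below are placeholders, not a recommendation for your workload), redeploying with autoscaling looks roughly like this:

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # placeholders

model = aiplatform.Model("MODEL_RESOURCE_NAME")  # placeholder model ID

# Allow the endpoint to scale out so concurrent requests spread
# across replicas instead of queuing on a single container.
endpoint = model.deploy(
    machine_type="n1-standard-8",        # placeholder machine type
    accelerator_type="NVIDIA_TESLA_T4",  # placeholder accelerator
    accelerator_count=1,
    min_replica_count=1,
    max_replica_count=3,
)
```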
I hope I was able to provide you with useful insights.
What do you mean by "a specific deployment, like yours"? Should I import the model directly into Vertex AI instead of wrapping it in a container?
Will that resolve this issue?