Hi! My name is Honey.
I am serving a PyTorch model on Vertex AI. After uploading the model to the model registry as a "model.mar" file, I deployed an endpoint on Vertex AI using an NVIDIA V100 GPU and an n1-standard-16 machine.
When I try to execute the endpoint from my local machine, I get a 503 error.
The endpoint has been successfully deployed and there are no errors on the server.
However, I consistently receive a 503 "InternalServerException".
What could be the issue?
Hi @mdcrewhoney,
Welcome and thank you for reaching out to our community.
I understand that you are having issues executing your endpoint from your local machine. A 503 error code suggests that something is going wrong in the backend infrastructure or the model serving environment. Note that our service health summary reported issues affecting multiple Google Cloud services at the onset of October, but those were resolved prior to this post.
There are a number of possible causes. For example:

- The server may be running out of resources (CPU, GPU, or memory) by the time your request arrives.
- There may be a network connectivity issue between your local machine and the endpoint, such as firewalls, proxies, or an unstable internet connection.
- Vertex AI itself may be experiencing transient internal errors.
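If the 503s turn out to be transient (for example, the backend briefly out of resources), a client-side retry with exponential backoff often helps. Here is a minimal sketch; the `TransientServerError` exception class is a placeholder for whatever your client raises on a 503, and `predict` would wrap your actual call (e.g. a lambda around `endpoint.predict(instances=...)` from the google-cloud-aiplatform SDK):

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientServerError(Exception):
    """Placeholder for a 503-style error raised by the prediction client."""


def call_with_backoff(
    predict: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
) -> T:
    """Retry a prediction call with exponential backoff on transient errors."""
    for attempt in range(max_attempts):
        try:
            return predict()
        except TransientServerError:
            # Give up after the last attempt and surface the error.
            if attempt == max_attempts - 1:
                raise
            # Exponential backoff: base_delay, 2x, 4x, ...
            time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError("unreachable")
```

If the error persists after several attempts with backoff, the problem is most likely not transient and is worth debugging on the server side.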
Consider reviewing your logs thoroughly, as there may be error entries that point to the cause of the problem, or contact Vertex AI support for further assistance.
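As a starting point for log review, you can query Cloud Logging from your terminal. The sketch below assumes online prediction logging is enabled for the endpoint; `ENDPOINT_ID` is a placeholder for your endpoint's numeric ID:

```shell
# Read recent log entries for a Vertex AI endpoint (ENDPOINT_ID is a placeholder).
gcloud logging read \
  'resource.type="aiplatform.googleapis.com/Endpoint" AND resource.labels.endpoint_id="ENDPOINT_ID"' \
  --limit=50 \
  --format="table(timestamp, severity, textPayload)"
```

Entries with `ERROR` severity around the time of your failed requests are the most likely to reveal the root cause.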