
Issues getting batch prediction results from model deployed from model garden

I've deployed BioGPT from Model Garden to an endpoint, and I'm trying to use it to get batch predictions of text responses, to no avail.

I'm not seeing any errors in the logging, even though the job failed with every prompt failing. The error messages I get are cryptic, just this:

('Post request fails. Cannot get predictions. Error: Exceeded retries: Non-OK result 500 ({\n  "code": 500,\n  "type": "InternalServerException",\n  "message": "Worker died."\n}\n) from server, retry=3, ellapsed=56.66s.', 64)
('Post request fails. Cannot get predictions. Error: Exceeded retries: Non-OK result 503 (no healthy upstream) from server, retry=3, ellapsed=0.01s.', 48656)
('Post request fails. Cannot get predictions. Error: Exceeded retries: Non-OK result 503 (no healthy upstream) from server, retry=3, ellapsed=0.02s.', 1216)
('Post request fails. Cannot get predictions. Error: Exceeded retries: Non-OK result 500 ({\n  "code": 500,\n  "type": "InternalServerException",\n  "message": "Worker died."\n}\n) from server, retry=3, ellapsed=56.62s.', 64)
 

What is the problem? I've already tried changing the number of samples, the length of the prompts, etc.
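For triage, the failure tuples above can be tallied by HTTP status code. This is a minimal sketch that assumes the second element of each tuple is a count of failed instances (an inference from the output above, not documented behavior):

```python
import re
from collections import Counter

# Abbreviated copies of the (message, count) tuples from the batch job output.
results = [
    ("Post request fails. ... Non-OK result 500 (Worker died.) from server, retry=3, ellapsed=56.66s.", 64),
    ("Post request fails. ... Non-OK result 503 (no healthy upstream) from server, retry=3, ellapsed=0.01s.", 48656),
]

def tally_by_status(results):
    """Sum the failure counts per HTTP status code extracted from each message."""
    totals = Counter()
    for message, count in results:
        match = re.search(r"Non-OK result (\d{3})", message)
        status = match.group(1) if match else "unknown"
        totals[status] += count
    return dict(totals)

print(tally_by_status(results))  # {'500': 64, '503': 48656}
```

Seeing both 500s and 503s broken out this way makes it clearer that two different failure modes are involved, not one.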

 

2 REPLIES

From what I understand, the health endpoint is not available while a prediction is running, so if the prediction takes too long, the automatic health checks will fail. Were you able to solve your problem? If so, how?
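If long-running requests really are starving the health checks, one client-side workaround is to keep each request small so it completes well within the health-check window. A minimal sketch (the chunk size here is a placeholder you would tune for your endpoint's latency):

```python
def chunk(prompts, size):
    """Split a prompt list into fixed-size batches so each request stays short."""
    return [prompts[i:i + size] for i in range(0, len(prompts), size)]

prompts = [f"prompt {i}" for i in range(10)]
batches = chunk(prompts, 4)

# 10 prompts in batches of 4 -> 3 requests of sizes 4, 4, 2
print([len(b) for b in batches])  # [4, 4, 2]
```

Each smaller batch would then be sent as its own prediction request, trading more requests for shorter per-request latency.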

When deploying and using BioGPT, cryptic error messages like these can be frustrating. Based on the details shared, the failures are server-side problems with your endpoint. The two status codes point to two distinct issues: the 500 "Worker died." responses indicate that the serving container's worker process crashed (commonly an out-of-memory condition for large models), and the 503 "no healthy upstream" responses indicate that no healthy replica was available to serve the request.
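While a crashed worker restarts, the 503s are typically transient, so client-side retries with exponential backoff can smooth over the gap. A sketch of the idea with a stand-in for the real predict call (the function and error class here are illustrative, not a Vertex AI API):

```python
import time

class TransientError(Exception):
    """Stand-in for a retryable failure such as a 503 'no healthy upstream'."""

def retry_with_backoff(fn, retries=5, base_delay=0.01):
    """Call fn, retrying on TransientError with exponential backoff; re-raise when exhausted."""
    for attempt in range(retries):
        try:
            return fn()
        except TransientError:
            if attempt == retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))

# Stand-in predict call: fails twice with a 503-like error, then succeeds.
calls = {"n": 0}
def flaky_predict():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("503 no healthy upstream")
    return {"predictions": ["ok"]}

result = retry_with_backoff(flaky_predict)
print(result, "after", calls["n"], "attempts")  # {'predictions': ['ok']} after 3 attempts
```

Retries only help with the transient 503s, though; the underlying 500 "Worker died." crashes are better addressed by deploying on a machine type with more memory or a larger GPU.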