I've deployed a container hosting a customized model in Vertex AI. I encounter connection timeout exceptions, particularly when there are 5 or more concurrent requests.
I'm exploring an alternative approach that is cost-effective and capable of autoscaling. However, when I attempt to deploy this container to Cloud Run, I'm unable to select a GPU resource. This raises the question: is it feasible to deploy an ML model container to Cloud Run?
Hello @Yash2384 ,
Welcome to the Google Cloud Community!
Currently, GPU units are not yet supported for managed Cloud Run containers. As @knet has mentioned, you can serve an ML model from Cloud Run as long as the model doesn't require GPUs.
Here's a blog post on how you can deploy your ML Model as a Web Service using GCP's Cloud Run.
Additionally, Cloud Run for Anthos does support GPUs. This documentation details how to use NVIDIA GPUs on your instance of Anthos Cloud Run.
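To make the "serve an ML model without GPUs" option concrete, here is a minimal sketch of a CPU-only prediction server that fits Cloud Run's container contract (it listens on the port given by the `PORT` environment variable). The `predict` function is a hypothetical stand-in for your model's real inference call, and the JSON request shape (`{"features": [...]}`) is an assumption for illustration:

```python
# Minimal sketch of a CPU-only model server for Cloud Run.
# NOTE: predict() is a placeholder; swap in your model's inference call.
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    # Placeholder "model": returns the mean of the input features.
    return {"score": sum(features) / len(features)}


class PredictionHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON body, e.g. {"features": [1.0, 3.0]}.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        result = predict(payload.get("features", [0.0]))

        # Return the prediction as JSON.
        body = json.dumps(result).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run():
    # Cloud Run injects the serving port via the PORT env variable;
    # call run() from your container's entrypoint.
    port = int(os.environ.get("PORT", "8080"))
    HTTPServer(("", port), PredictionHandler).serve_forever()
```

Because inference here runs on CPU only, this kind of container deploys to managed Cloud Run without needing a GPU resource, and Cloud Run's autoscaling handles concurrent requests by adding instances.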
Hi, you can serve an ML model from Cloud Run today as long as the model doesn't require GPUs. We have customers doing this in production.
GPUs are indeed not available in Cloud Run today.