We are trying to deploy a model to a Vertex AI Endpoint with GPU support and are facing two problems: the GPU's memory is fully reserved by a single model, yet GPU compute is underutilized.

Can we deploy multiple workers on the same node, and how can we make each worker reserve only the VRAM it actually needs?
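The "all VRAM reserved by one model" symptom is the default behavior of some frameworks (TensorFlow in particular grabs all visible GPU memory per process). Assuming the serving container runs TensorFlow, a minimal sketch of capping each worker to a fixed slice of VRAM (the 4096 MB figure is a hypothetical value, not something from the question):

```python
def limit_gpu_memory(memory_limit_mb: int = 4096) -> None:
    """Cap this process's TensorFlow VRAM usage so several workers
    can share one physical GPU. Sketch only; memory_limit_mb is a
    hypothetical value to tune for your model."""
    import tensorflow as tf  # assumes a TensorFlow serving container

    for gpu in tf.config.list_physical_devices("GPU"):
        # Replace the default allocate-everything behavior with a
        # logical device limited to memory_limit_mb megabytes.
        tf.config.set_logical_device_configuration(
            gpu,
            [tf.config.LogicalDeviceConfiguration(memory_limit=memory_limit_mb)],
        )

# Call this before any tensors are allocated, e.g. at worker startup:
# limit_gpu_memory(4096)
```

Each worker process then sees a logical GPU of the configured size instead of the whole device; the call must happen before the GPU is first initialized, or TensorFlow raises a RuntimeError.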
You can deploy more than one model to the same endpoint (see the documentation); however, compute resources are associated with each deployed model rather than with the endpoint itself, so each deployment specifies its own machine type and accelerators.
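A minimal sketch of that pattern with the `google-cloud-aiplatform` SDK, deploying two models to one endpoint with per-model resources and a traffic split (the model IDs, machine type, and 50/50 split are hypothetical placeholders, not values from the answer):

```python
def deploy_two_models(project: str, region: str, endpoint_name: str) -> None:
    """Sketch: deploy two models to a single Vertex AI endpoint.
    Resources (machine type, accelerator) are set per deployed model,
    not on the endpoint. All IDs below are hypothetical."""
    from google.cloud import aiplatform  # pip install google-cloud-aiplatform

    aiplatform.init(project=project, location=region)
    endpoint = aiplatform.Endpoint.create(display_name=endpoint_name)

    # Each deployment carries its own machine spec and traffic share.
    for model_id, traffic in [("model-a-id", 50), ("model-b-id", 50)]:
        model = aiplatform.Model(model_name=model_id)
        model.deploy(
            endpoint=endpoint,
            machine_type="n1-standard-8",
            accelerator_type="NVIDIA_TESLA_T4",
            accelerator_count=1,
            traffic_percentage=traffic,
        )
```

Note that because resources attach to the deployed model, each deployment above provisions its own GPU node; packing multiple models onto one physical GPU has to happen inside a single serving container instead.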