Hello everyone, I am trying to understand how to run Google AI models on a 10-node NVIDIA DGX Spark cluster.
Hey pupp,
Can you tell us more? Which models? Have you tried the quickstarts for either Triton Inference Server or NVIDIA NIM Microservices?
Hi pupp,
As @donmccasland mentioned, are you able to share more details? Have you checked out the NVIDIA Dynamo platform? It includes a getting-started guide. It's a low-latency platform that serves AI models across frameworks, architectures, and deployment scales. Coupling it with Vertex AI or Google Kubernetes Engine may be what you're looking for.
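Whichever serving stack you land on, note that both NIM and Dynamo expose an OpenAI-compatible HTTP endpoint, so client code stays the same across them. A minimal sketch, assuming a server reachable at `http://localhost:8000` and a placeholder model name (both are assumptions; substitute the values from your own deployment):

```python
import json
import urllib.request

# Hypothetical endpoint and model name -- adjust for your deployment.
ENDPOINT = "http://localhost:8000/v1/chat/completions"
MODEL = "google/gemma-2-9b-it"  # example only; use the model your server loaded

def build_chat_request(model: str, prompt: str, max_tokens: int = 128) -> dict:
    """Build an OpenAI-compatible chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def ask(prompt: str) -> str:
    """POST the prompt to the endpoint and return the first reply's text."""
    payload = json.dumps(build_chat_request(MODEL, prompt)).encode()
    req = urllib.request.Request(
        ENDPOINT, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# ask("What is a DGX Spark?")  # requires a running inference server
```

The same client works unchanged against any OpenAI-compatible backend, which makes it easy to compare stacks before committing to one.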