Hi everyone,
I have Cloud Run with GPU access enabled in my Google Cloud project. Which of the newer AI models fit within Cloud Run's resource constraints, and are there any reasoning models that will work?
You can check here
This article is great. Thanks for sharing, @dhirajpatra. I would add that there is now a set of Gemma 3 models that will fit quite nicely in a single L4's 24 GB of memory.
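For anyone who wants to try that, here is a minimal sketch of calling a Gemma 3 model from a client, assuming it is served by an Ollama container on Cloud Run. The service URL and model tag below are placeholders, and a service that does not allow unauthenticated invocations would also need an identity token:

```python
import requests

# Placeholder URL of a Cloud Run service running an Ollama container;
# replace with your own service URL. If the service requires authenticated
# invocations, add an "Authorization: Bearer <id-token>" header as well.
SERVICE_URL = "https://ollama-gemma-xxxxx-uc.a.run.app"

resp = requests.post(
    f"{SERVICE_URL}/api/generate",
    json={
        "model": "gemma3:4b",  # assumed tag; any Gemma 3 size that fits in 24 GB
        "prompt": "Give me one sentence on what Cloud Run GPUs are useful for.",
        "stream": False,       # return one JSON object instead of a stream
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["response"])
```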
Kindly share with us what your steps were and how it is running, @donmccasland.
Thank you
Gemma 2B
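If it helps, here is a minimal sketch of loading Gemma 2B on a single 24 GB GPU with Hugging Face Transformers; this is just an illustration, not a full Cloud Run deployment, and the checkpoint ID is the public instruction-tuned variant (which is gated, so a Hugging Face token is needed):

```python
# Minimal sketch: loading Gemma 2B in bfloat16 with Hugging Face Transformers.
# The weights are roughly 5 GB in bf16, so they fit comfortably in an L4's 24 GB.
# Requires `pip install torch transformers accelerate` and an HF token for the gated repo.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2b-it"  # instruction-tuned 2B checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # place the model on the available GPU
)

inputs = tokenizer("What workloads suit Cloud Run GPUs?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```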
You can check a few of my repos to run locally; they can also run in the cloud. See the sketch after the links for pointing them at a remote model.
https://github.com/dhirajpatra/langgraph-multi-agent
https://github.com/dhirajpatra/private_kb_rag_graph_ollama
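For example, to point an Ollama-based setup like these at a remotely hosted model instead of a local instance, a rough sketch with the Ollama Python client could look like this (the host URL and model tag are placeholders, and an authenticated Cloud Run service would additionally need an identity token header):

```python
# Rough sketch: calling a remote Ollama server instead of localhost.
# Requires `pip install ollama`. The host below is a placeholder.
from ollama import Client

client = Client(host="https://your-ollama-service-xxxxx-uc.a.run.app")

response = client.chat(
    model="gemma2:2b",  # assumed tag; use whatever model the server has pulled
    messages=[{"role": "user", "content": "Summarize this knowledge base entry."}],
)
print(response["message"]["content"])
```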