Recommended resources for Llama2 models (7B, 13B and 70B) from Vertex AI Model Garden

Hi, searching the Internet, I see that users have trouble choosing the right resources for Llama2. Could anybody share a working setup?

Currently, I'm trying to deploy the 13B model on an n1-standard-4 machine with one Tesla V100 accelerator, without luck (the deployment keeps timing out and there is no info in the logs). The 7B model worked fine on this setup.

2 REPLIES

Vertex AI's recommended approach is to use an A100 40GB GPU. I don't think a single V100 would be sufficient for the 13B Llama2 model.
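To make the memory argument concrete, here is a rough back-of-the-envelope sketch. It assumes the weights are served in 16-bit precision (2 bytes per parameter) and that the V100 in question is the 16 GB variant; both are my assumptions, not something stated in the thread. The helper names are hypothetical, and the estimate counts weights only, so real usage (KV cache, activations, runtime overhead) will be higher.

```python
# Rough estimate of GPU memory needed just to hold the model weights.
# Assumption: 16-bit weights (2 bytes per parameter); no quantization.
BYTES_PER_PARAM_FP16 = 2


def weight_memory_gb(params_billions, bytes_per_param=BYTES_PER_PARAM_FP16):
    """Return the weight footprint in GB (1 GB = 1e9 bytes)."""
    return params_billions * 1e9 * bytes_per_param / 1e9


def fits_on_gpu(params_billions, gpu_memory_gb):
    """True if the weights alone are smaller than the GPU memory.

    This is a necessary condition, not a sufficient one: the KV cache
    and activations need additional headroom.
    """
    return weight_memory_gb(params_billions) < gpu_memory_gb


# Assumed GPU memory sizes (the V100 entry is the 16 GB variant).
GPUS_GB = {"V100 16GB": 16, "A100 40GB": 40}
MODELS_B = {"Llama2 7B": 7, "Llama2 13B": 13, "Llama2 70B": 70}

if __name__ == "__main__":
    for model, size_b in MODELS_B.items():
        needed = weight_memory_gb(size_b)
        for gpu, mem in GPUS_GB.items():
            verdict = "may fit" if fits_on_gpu(size_b, mem) else "does not fit"
            print(f"{model}: ~{needed:.0f} GB of weights, {verdict} on one {gpu}")
```

Under these assumptions the 7B weights (about 14 GB) squeeze onto a 16 GB card, which matches the report that 7B deployed fine, while the 13B weights (about 26 GB) do not, and 70B (about 140 GB) needs several GPUs or quantization.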
