Hi,
I want to tune an LLM in Vertex AI Generative AI Studio (text-bison001), which I know has 137B parameters. I investigated the costs, and they are:
Tuning jobs in us-central1 use eight A100 80GB GPUs. Tuning jobs in europe-west4 use 64 cores of the TPU v3 pod custom model training resource, available only upon request. A quick calculation: eight A100 80GB GPUs cost 40.22 USD/hour, and the 64-core TPU v3, assuming it costs double the 32-core configuration, will cost 64 USD/hour.
I have a properly formatted JSONL dataset with 1,000 to 52,000 examples, and I want to train for 300 epochs.
The issue is that I need to know roughly how long tuning text-bison001 usually takes (1 hour? 10 hours?), or at least how many parameters are actually tuned, to get an idea of the costs involved.
This information is not provided on the Vertex AI pricing page, nor in the Generative AI Studio language documentation. Should I use the regular Vertex AI pricing calculator? Maybe that is not the right approach, since 8 A100 GPUs will be used.
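In the meantime, a back-of-envelope estimate can at least bound the cost. The sketch below assumes the hourly rates quoted above (40.22 USD/hour for the 8xA100 setup, and the assumption that the 64-core TPU v3 costs double the 32-core rate, i.e. 64 USD/hour); the durations are purely hypothetical since the actual tuning time is the unknown being asked about.

```python
# Hourly rates taken from the estimates above; durations are hypothetical.
A100_8X_HOURLY = 40.22   # USD/hour for eight A100 80GB GPUs (us-central1)
TPU_V3_64_HOURLY = 64.0  # USD/hour, assuming 64 cores cost double 32 cores

def tuning_cost(hourly_rate: float, hours: float) -> float:
    """Total cost in USD for a tuning job billed at a flat hourly rate."""
    return hourly_rate * hours

for hours in (1, 10, 24):  # hypothetical job durations
    gpu = tuning_cost(A100_8X_HOURLY, hours)
    tpu = tuning_cost(TPU_V3_64_HOURLY, hours)
    print(f"{hours:>2} h: us-central1 (8xA100) = {gpu:,.2f} USD, "
          f"europe-west4 (TPUv3-64) = {tpu:,.2f} USD")
```

So even without knowing the exact duration, a 1-hour versus 10-hour job differs by an order of magnitude in cost, which is why the missing duration figure matters so much here.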
Thanks in advance