Efficiency and price for fine-tuning PaLM

I'd like to fine-tune a chat-bison model using a fairly large dataset of 5,008 lines of training data over 10 epochs, for benchmarking purposes (so the data and run are large, but that can't be changed for what I'm trying to do).

I can successfully use either the Python SDK or Vertex AI Studio to initiate a pipeline, load the data, and start a large-language-model-tuner node. By specifying "us-central1" for "tuning_job_location" and "tuned_model_location", I assign the job to {"machine_type": "a2-ultragpu-8g"} and {"accelerator_type": "NVIDIA_A100_80GB"} with {"accelerator_count": 8}. I specify 6,260 training steps, which by my count works out to 5,008 examples x 10 epochs of training spread across 8 accelerators. This is all fine.
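For what it's worth, here's the arithmetic behind my step count. The batch size of 8 (one example per accelerator per step) is my own assumption, not something I've confirmed in the docs:

```python
# Sanity-check the step count for the tuning job described above.
# Assumption (mine): each step consumes one batch of 8 examples,
# one per A100, so steps = examples * epochs / batch_size.
examples = 5_008
epochs = 10
batch_size = 8  # assumed: one example per accelerator per step

train_steps = examples * epochs // batch_size
print(train_steps)  # 6260
```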

What I'm trying to figure out is the runtime: once initiated, the job looks like it's going to take a very long time. Keeping an eye on the logs for the large-language-model-tuner node, every 50 steps is taking 20-odd minutes. Besides meaning the whole job will take something like 50 hours, this raises two worries:

  • Is this going to cost a lot?  Looking at the pricing for custom models, an a2-ultragpu-8g costs $46.25/hour, which I then multiplied by the $4.52/hour accelerator rate, which over 50 hours comes to $10,452.50!  I'm not completely clear that I'm even building a "custom model" here, but I'd like to be sure of what I'm spending before going too far down this road.
  • If training is slow, then inference will probably also be slow, and model speed is a factor.  Will the model I end up with take a long time to process an input?
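To make the cost question concrete, here are the two readings of the pricing page I can come up with, using the rates quoted above. The additive reading (machine rate plus 8 x the per-accelerator rate) is just my guess at what the pricing page intends; I'd appreciate a correction if neither is right:

```python
# Two possible readings of the pricing line items, using the quoted rates.
# The additive reading (per-GPU rate added on top of the machine rate,
# times 8 GPUs) is my assumption, not confirmed against the docs.
machine_rate = 46.25      # $/hour, a2-ultragpu-8g
accelerator_rate = 4.52   # $/hour, NVIDIA_A100_80GB
hours = 50
gpus = 8

multiplied = machine_rate * accelerator_rate * hours
additive = (machine_rate + gpus * accelerator_rate) * hours

print(f"${multiplied:,.2f}")  # $10,452.50 (the figure in my question)
print(f"${additive:,.2f}")    # $4,120.50
```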

So what I'm wondering is, first of all: is there a way to use PEFT for a fine-tuning job through either the SDK or Vertex AI Studio? This post suggests that PEFT is applied automatically through Studio, but what I'm seeing makes me think the entire model is being updated.

And then have I understood the pricing structure correctly?  Would running this job as-is really cost over ten thousand dollars?


I want to +1 this; I wonder if anyone has an answer, because I'm experiencing something similar as well.

I'd like to know the answer for the same case. In fact, I have a dataset much larger than the one in the original question, and I also want to try both text-bison and Gemini fine-tuning.

Yesterday I was testing Vertex AI fine-tuning with a 3 KB file (10 examples total). The first try took about 3 hours and failed at the very last step, while creating an endpoint.

I tried it again another time with the Compute Engine API service account, which took around 2.5 hours and succeeded. This was also with the same 3 KB .jsonl file.
When I checked this morning, I noticed I had been charged $254 for this.
I would like to know why. How is this possible?

Considering my actual dataset is 14,000+ examples, I would really like to learn how this pricing works...
