I'd like to fine-tune a chat-bison model using a fairly large dataset: 5,008 lines of training data over 10 epochs, for benchmarking purposes (so the data and the run are large by design and can't be changed for what I'm trying to do).
I can successfully use either the Python SDK or Vertex AI Studio to initiate a pipeline, load the data, and start a large-language-model-tuner node. By specifying "us-central1" for both "tuning_job_location" and "tuned_model_location", I assign the job to {"machine_type": "a2-ultragpu-8g"} with {"accelerator_type": "NVIDIA_A100_80GB"} and {"accelerator_count": 8}. I specify 6,260 training steps, which by my count works out to 5,008 examples x 10 epochs of training split across 8 cores. This is all fine.
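For reference, here is the step arithmetic I'm relying on, as a quick sanity check. The assumption (mine, not documented anywhere I've found) is that each step processes one example per core, so total steps = examples x epochs / cores:

```python
# Sanity check for the train_steps value I'm passing to the tuning job.
examples = 5008
epochs = 10
cores = 8  # accelerator_count

# Assumption: one step consumes one example per core.
train_steps = examples * epochs // cores
print(train_steps)  # 6260
```

If that assumption about how steps map to examples is wrong, my step count (and everything downstream) is off.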
What I'm trying to figure out is why the job, once initiated, looks like it's going to take a very long time. Keeping an eye on the logs for the large-language-model-tuner node, every 50 steps is taking 20-odd minutes. Beyond meaning that the whole job will take something like 50 hours, there are two things I'm worried about.
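My back-of-envelope runtime math, assuming the per-50-step pace I'm seeing in the logs holds for the whole run (the 24 minutes per chunk is my rough reading of "20-odd minutes", not a measured average):

```python
# Rough runtime estimate from the observed log cadence.
total_steps = 6260
steps_per_chunk = 50
minutes_per_chunk = 24  # assumption: "20-odd minutes" per 50 steps

total_hours = (total_steps / steps_per_chunk) * minutes_per_chunk / 60
print(round(total_hours, 1))  # 50.1
```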
So what I'm wondering is, first of all: is there a way to use PEFT (parameter-efficient fine-tuning) for a fine-tuning job through either the SDK or Vertex AI Studio? This post suggests that PEFT is applied automatically through Studio, but what I'm seeing makes me think the entire model is being adjusted.
And second, have I understood the pricing structure correctly? Would running this job as-is really cost over ten thousand dollars?
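To frame that second question, here's the implied rate I'm trying to sanity-check. This assumes no real prices, just arithmetic on my own estimates above:

```python
# If the run really costs over $10,000 across ~50 hours, the effective
# rate would have to be roughly this much per hour.
estimated_cost_usd = 10_000
estimated_hours = 50

implied_hourly_rate = estimated_cost_usd / estimated_hours
print(implied_hourly_rate)  # 200.0
```

So in effect I'm asking: does an a2-ultragpu-8g tuning job really bill at an effective rate anywhere near that?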