Vertex AI pipeline - IndexError: Invalid key: 0 is out of bounds for size 0

kk2105 · 03-26-2024 10:00 AM

Hi,

I am trying to fine-tune the Llama2-7B model using Vertex AI Model Garden (Google Colab). Please find the details below:

Model used - Llama2-7B

Fine-tuning method - PEFT

Number of samples in Training Set - 100

Number of samples in Eval Set - 20

Format of the training data -

{"text": "### Human: What is arithmatic mean? ### Assistant: The arithmetic mean, or simply the mean, is the average of a set of numbers obtained by adding them up and dividing by the total count of numbers."}
{"text": "### Human: What is geometric mean? ### Assistant: The geometric mean is a measure of central tendency calculated by multiplying all values in a dataset and then taking the nth root of the product, where n is the total number of values."}
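To rule out a formatting problem, the JSONL file can be checked line by line before it is uploaded. The following is only a minimal sketch, assuming that a valid record is a JSON object with a non-empty string "text" field; the helper name `find_bad_lines` and the sample records are hypothetical and not part of the notebook.

```python
import json


def find_bad_lines(lines):
    """Return (line_number, reason) pairs for lines that are not valid records.

    Assumption: a valid record is a JSON object with a non-empty string "text" field.
    """
    problems = []
    for number, raw in enumerate(lines, start=1):
        if not raw.strip():
            problems.append((number, "blank line"))
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError as err:
            problems.append((number, f"invalid JSON: {err.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
            continue
        text = record.get("text")
        if not isinstance(text, str) or not text.strip():
            problems.append((number, 'missing or empty "text" field'))
    return problems


# Hypothetical sample: one good record, one with the wrong key, one blank line.
sample = [
    '{"text": "### Human: What is geometric mean? ### Assistant: The geometric mean is a measure of central tendency."}',
    '{"prompt": "no text field here"}',
    "",
]
print(find_bad_lines(sample))
```

To check a real file, the same function can be called with `open(path, encoding="utf-8").read().splitlines()`. An empty result means every line parsed and carried a "text" field under the stated assumption.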

Vertex pipeline parameters :

pipeline_parameters = {
    "base_model": base_model,
    "dataset_name": dataset_name,
    "prediction_accelerator_type": prediction_accelerator_type,
    "training_accelerator_type": training_accelerator_type,
    "training_precision_mode": training_precision_mode,
    "training_lora_rank": 16,
    "training_lora_alpha": 32,
    "training_lora_dropout": 0.05,
    "training_steps": 20,
    "training_warmup_steps": 10,
    "training_learning_rate": 2e-4,
    "evaluation_steps": 10,
    "evaluation_limit": 1,
}

When I run the training process, I get the following error:

raise IndexError(f"Invalid key: {key} is out of bounds for size {size}")  
IndexError: Invalid key: 0 is out of bounds for size 0
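
Read literally, the message says that row 0 was requested from a dataset whose size is 0, so some split appears to be empty by the time it is indexed. The sketch below only illustrates that kind of bounds check. It is an assumption-based imitation of the message, not the library's actual code, and the names `get_row` and `empty_dataset` are hypothetical.

```python
def get_row(rows, key):
    """Illustrative bounds check that mirrors the error message above."""
    size = len(rows)
    if key < -size or key >= size:
        raise IndexError(f"Invalid key: {key} is out of bounds for size {size}")
    return rows[key]


# Assumption: a split that ended up with no rows after loading or filtering.
empty_dataset = []

try:
    get_row(empty_dataset, 0)
except IndexError as err:
    message = str(err)
    print(message)
```

If this is what happens in the pipeline, the next thing to verify would be how many rows the training and evaluation splits actually contain after loading.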

Could you please help me with the following questions?

1. Is the format of the training data correct? I used the format given as the default example in the Colab notebook; you can find the dataset here

2. Is the number of samples too small?

3. Is there anything I am missing here ?

Thank you,

KK
