
Vertex AI pipeline - IndexError: Invalid key: 0 is out of bounds for size 0

Hi, 

I am trying to fine-tune the Llama2-7B model using the Vertex AI Model Garden notebook in Google Colab. The details are below:

Model used - Llama2-7B  

Fine-tuning method - PEFT  

Number of samples in Training Set - 100  

Number of samples in Eval Set - 20  

Format of the training data -  

 

{"text": "### Human: What is arithmatic mean? ### Assistant: The arithmetic mean, or simply the mean, is the average of a set of numbers obtained by adding them up and dividing by the total count of numbers."}
{"text": "### Human: What is geometric mean? ### Assistant: The geometric mean is a measure of central tendency calculated by multiplying all values in a dataset and then taking the nth root of the product, where n is the total number of values."}

 

 

Vertex pipeline parameters:  

 

pipeline_parameters = {
    "base_model": base_model,
    "dataset_name": dataset_name,
    "prediction_accelerator_type": prediction_accelerator_type,
    "training_accelerator_type": training_accelerator_type,
    "training_precision_mode": training_precision_mode,
    "training_lora_rank": 16,
    "training_lora_alpha": 32,
    "training_lora_dropout": 0.05,
    "training_steps": 20,
    "training_warmup_steps": 10,
    "training_learning_rate": 2e-4,
    "evaluation_steps": 10,
    "evaluation_limit": 1,
}

 

When I run the training pipeline, I get the following error:  

 

raise IndexError(f"Invalid key: {key} is out of bounds for size {size}")  
IndexError: Invalid key: 0 is out of bounds for size 0
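
As far as I can tell, the message means that something tried to read item 0 from a dataset that contains 0 rows, so the dataset looked empty at that point in the pipeline. The sketch below is an illustrative stand-in for that kind of bounds check, not the library's actual code. The function name `check_index` is made up for the example.

```python
def check_index(key: int, size: int) -> None:
    """Raise IndexError if `key` is not a valid position among `size` items.

    Illustrative stand-in only: it mimics the wording of the error above.
    """
    if key >= size or key < -size:
        raise IndexError(f"Invalid key: {key} is out of bounds for size {size}")


# Indexing the first row of a one-row dataset is fine.
check_index(0, 1)

# Indexing the first row of an empty dataset reproduces the message.
try:
    check_index(0, 0)
except IndexError as exc:
    message = str(exc)
    print(message)
```

If this reading is right, the thing to find out is why the training or evaluation split ends up with zero rows after loading.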

 

Could you please help me with the questions below?

1. Is the format of the training data correct? I used the format given as the default example in the Colab notebook; you can find the dataset here

2. Is the number of samples too small?

3. Is there anything I am missing here?

 

Thank you,  

KK
