Correct JSONL format for Vertex AI training

I tried to upload training data in JSONL format to Vertex AI, based on the example given, but it still gives an error:

ValueError: Cannot decode training example at line 0: {"input_text": "I'm feeling great today! My name is Sarah, I'm 28 years old, and I'm female.", "output_text": "Oh thats's awesome! Let's do something fun then. Wanna do a quick survey on smartphones!"}

What might be the issue here?

Here's my input jsonl file for reference:

{"input_text": "I'm feeling great today! My name is Sarah, I'm 28 years old, and I'm female.", "output_text": "Oh thats's awesome! Let's do something fun then. Wanna do a quick survey on smartphones!"}
{"input_text": "Today has been okay so far. I'm John, 35 years old, and male.", "output_text": "That's life! Are you free for a quick survey about smartphones?"}
{"input_text": "To be honest, I'm not feeling my best today. I'm Emma, 42 years old, and female.", "output_text": "Oh that's alright. You always have tomorrow! Anyway, why not do a quick chat with me about smartphones? Who knows, it might make your day!"}
{"input_text": "I'm feeling fantastic today! It's been a productive and enjoyable day so far. My name is Michael, I'm 23 years old, and I identify as male.", "output_text": "Oh thats's awesome! Let's do something fun then. Wanna do a quick survey on smartphones!"}
{"input_text": "I'm feeling neutral today, neither great nor bad. I'm Alex, 19 years old, and male.", "output_text": "That's life! Are you free for a quick survey about smartphones?"}
{"input_text": "I'm not feeling too good today. I've been dealing with some stress, and it's affecting my mood. My name is Emily, I'm 30 years old, and female.", "output_text": "Oh that's alright. You always have tomorrow! Anyway, why not do a quick chat with me about smartphones? Who knows, it might make your day!"}
{"input_text": "I'm feeling quite happy and content today. The weather is nice, and I had a good start to the day. I'm David, 45 years old, and male.", "output_text": "Oh thats's awesome! Let's do something fun then. Wanna do a quick survey on smartphones!"}
{"input_text": "I'm feeling alright today, nothing out of the ordinary. My name is Lisa, I'm 27 years old, and female.", "output_text": "That's life! Are you free for a quick survey about smartphones?"}
{"input_text": "I'm feeling a bit down today. Nothing major, just a bit tired. I'm Ryan, 32 years old, and male.", "output_text": "Oh that's alright. You always have tomorrow! Anyway, why not do a quick chat with me about smartphones? Who knows, it might make your day!"}
{"input_text": "Today has been a good day overall. I had a fun outing with friends, and it lifted my spirits. I'm Jessica, 20 years old, and female.", "output_text": "Oh thats's awesome! Let's do something fun then. Wanna do a quick survey on smartphones!"}
{"input_text": "Absolutely! I find smartphones fascinating.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Sure, I don't mind participating in the survey", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Sorry, I'm not interested in doing a survey right now.", "output_text": "That's alright. Thanks for your time!"}
{"input_text": "Count me in! I enjoy discussing smartphones.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Okay, I can spare a few minutes for the survey.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Yes, I'd love to share my thoughts on smartphones.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Sure, I guess I can answer a few questions.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Not really interested", "output_text": "That's alright. Thanks for your time!"}
{"input_text": "Definitely! Smartphones are a topic I'm passionate about.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "I suppose I could participate in the survey.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Absolutely! I find smartphones fascinating.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Sure, I don't mind participating in the survey", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Sorry, I'm not interested in doing a survey right now.", "output_text": "That's alright. Thanks for your time!"}
{"input_text": "Count me in! I enjoy discussing smartphones.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Okay, I can spare a few minutes for the survey.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Yes, I'd love to share my thoughts on smartphones.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Sure, I guess I can answer a few questions.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Not really interested, but I'll do it anyway.", "output_text": "That's alright. Thanks for your time!"}
{"input_text": "Definitely! Smartphones are a topic I'm passionate about.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "I suppose I could participate in the survey.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Absolutely! I find smartphones fascinating.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Sure, I don't mind participating in the survey", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Sorry, I'm not interested in doing a survey right now.", "output_text": "That's alright. Thanks for your time!"}
{"input_text": "Count me in! I enjoy discussing smartphones.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Okay, I can spare a few minutes for the survey.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Yes, I'd love to share my thoughts on smartphones.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Sure, I guess I can answer a few questions.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Not really interested, but I'll do it anyway.", "output_text": "That's alright. Thanks for your time!"}
{"input_text": "Definitely! Smartphones are a topic I'm passionate about.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "I suppose I could participate in the survey.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Today, I can't find anything to criticize about my smartphone; my camera takes stunning photos!", "output_text": "That's wonderful! Do you want any particular feature improved in the next release?"}
{"input_text": "The design of my current smartphone looks sleek and stylish.", "output_text": "That's wonderful! Do you want any particular feature improved in the nxt release?"}
{"input_text": "The ergonomics of my phone are excellent; it fits comfortably in my hand.", "output_text": "That's wonderful! Do you want any particular feature improved in the nxt release?"}
{"input_text": "Battery life has been surprisingly impressive lately; I'm quite satisfied with it.", "output_text": "That's wonderful! Do you want any particular feature improved in the nxt release?"}
{"input_text": "The performance of my smartphone has been smooth and efficient; it's working great.", "output_text": "That's wonderful! Do you want any particular feature improved in the nxt release?"}
{"input_text": "I don't have anything specific to say about my smartphone's camera; it's decent for my needs.", "output_text": "Understood. Do you think the camera feature is value for money?"}
{"input_text": "The design is okay; it's not extraordinary, but I don't have any major issues with it.", "output_text": "Understood. Do you think the design doesn't stand the test of time?"}
{"input_text": "The ergonomics are acceptable; it could be better, but it's not a significant concern.", "output_text": "Understood. Do you think the ergonomics of the phone can be improved?"}
{"input_text": "Battery life is average; it gets me through the day, but there's room for improvement.", "output_text": "Understood. Do you think the battery capacity feature should be increased in the next release?"}
{"input_text": "Performance is fine overall; there are occasional lags, but it's manageable.", "output_text": "Understood. Do you think the performance is just right for the value of the phone?"}
{"input_text": "The camera on my smartphone disappoints me; it struggles in low-light conditions.", "output_text": "Sorry to hear that. Are there any changes you would like to suggest?"}
{"input_text": "I'm not a fan of the design; it feels outdated and lacks innovation.", "output_text": "Sorry to hear that. Are there any changes you would like to suggest?"}
{"input_text": "The ergonomics are bothering me today; the phone feels uncomfortable during extended use.", "output_text": "Sorry to hear that. Are there any changes you would like to suggest?"}
{"input_text": "Battery life is quite poor; it drains quickly, and I'm frustrated with it.", "output_text": "Sorry to hear that. Are there any changes you would like to suggest?"}
{"input_text": "The performance is subpar; the phone is sluggish and struggles with multitasking.", "output_text": "Sorry to hear that. Are there any changes you would like to suggest?"}

I am getting an error on the validate pipeline step. Can you help? This is the error: Screenshot 2023-08-14 174824.png

Based on the limited information I see here, I assume you do not have enough resources to run the training job. Refer to this document and proceed. Also, make sure your instance is in us-central1:

Tune language foundation models  |  Vertex AI  |  Google Cloud

The documentation says 2 things:

- the JSONL must be configured as "records" (as in pandas' orient='records': one dict per example)

- you must have only one example per row

Check if this works for you:

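import json

# df is assumed to be a pandas DataFrame with one training example per row
data_list = df.to_dict(orient='records')
with open('output_alpaca.jsonl', 'w', encoding='utf-8') as file:
    for example in data_list:
        # Write exactly one JSON object per line
        file.write(json.dumps(example) + '\n')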
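
Before uploading, it may help to read the file back and confirm that every line parses on its own. The following is a minimal sketch, assuming the tuning job expects one JSON object per line with exactly the `input_text` and `output_text` fields used in the question. The function name `validate_jsonl` is hypothetical, and the byte order mark check is only a guess at one possible reason a parser could fail on line 0; it is not something the error message confirms.

```python
import json

# Field names taken from the file in the question (an assumption about what the job expects).
REQUIRED_KEYS = {"input_text", "output_text"}


def validate_jsonl(path):
    """Return a list of (line_number, problem) tuples; an empty list means nothing was flagged.

    Line numbers are 0-based, matching the "line 0" in the error message.
    """
    problems = []
    # Read raw bytes so that a leading UTF-8 byte order mark is visible.
    with open(path, "rb") as f:
        raw = f.read()
    if raw.startswith(b"\xef\xbb\xbf"):
        problems.append((0, "file starts with a UTF-8 byte order mark"))
        raw = raw[3:]

    lines = raw.decode("utf-8").split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # a single trailing newline is fine

    for number, line in enumerate(lines):
        if not line.strip():
            problems.append((number, "blank line"))
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as err:
            problems.append((number, f"invalid JSON: {err.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
        elif set(record) != REQUIRED_KEYS:
            problems.append((number, f"unexpected keys: {sorted(record)}"))
    return problems
```

Calling `validate_jsonl('output_alpaca.jsonl')` and printing the result shows which lines, if any, would need fixing. If the list is empty, the file format itself is probably not the cause, and the region or resource quota mentioned above is worth checking next.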