Re: Correct JSONL format for Vertex AI training

midmurali57 · 08-14-2023 04:48 AM

I tried to upload training data in JSONL format to Vertex AI, following the example given, but it still fails with this error:

ValueError: Cannot decode training example at line 0: {"input_text": "I'm feeling great today! My name is Sarah, I'm 28 years old, and I'm female.", "output_text": "Oh thats's awesome! Let's do something fun then. Wanna do a quick survey on smartphones!"}

What might be the issue here?

Here's my input JSONL file for reference:

{"input_text": "I'm feeling great today! My name is Sarah, I'm 28 years old, and I'm female.", "output_text": "Oh thats's awesome! Let's do something fun then. Wanna do a quick survey on smartphones!"}
{"input_text": "Today has been okay so far. I'm John, 35 years old, and male.", "output_text": "That's life! Are you free for a quick survey about smartphones?"}
{"input_text": "To be honest, I'm not feeling my best today. I'm Emma, 42 years old, and female.", "output_text": "Oh that's alright. You always have tomorrow! Anyway, why not do a quick chat with me about smartphones? Who knows, it might make your day!"}
{"input_text": "I'm feeling fantastic today! It's been a productive and enjoyable day so far. My name is Michael, I'm 23 years old, and I identify as male.", "output_text": "Oh thats's awesome! Let's do something fun then. Wanna do a quick survey on smartphones!"}
{"input_text": "I'm feeling neutral today, neither great nor bad. I'm Alex, 19 years old, and male.", "output_text": "That's life! Are you free for a quick survey about smartphones?"}
{"input_text": "I'm not feeling too good today. I've been dealing with some stress, and it's affecting my mood. My name is Emily, I'm 30 years old, and female.", "output_text": "Oh that's alright. You always have tomorrow! Anyway, why not do a quick chat with me about smartphones? Who knows, it might make your day!"}
{"input_text": "I'm feeling quite happy and content today. The weather is nice, and I had a good start to the day. I'm David, 45 years old, and male.", "output_text": "Oh thats's awesome! Let's do something fun then. Wanna do a quick survey on smartphones!"}
{"input_text": "I'm feeling alright today, nothing out of the ordinary. My name is Lisa, I'm 27 years old, and female.", "output_text": "That's life! Are you free for a quick survey about smartphones?"}
{"input_text": "I'm feeling a bit down today. Nothing major, just a bit tired. I'm Ryan, 32 years old, and male.", "output_text": "Oh that's alright. You always have tomorrow! Anyway, why not do a quick chat with me about smartphones? Who knows, it might make your day!"}
{"input_text": "Today has been a good day overall. I had a fun outing with friends, and it lifted my spirits. I'm Jessica, 20 years old, and female.", "output_text": "Oh thats's awesome! Let's do something fun then. Wanna do a quick survey on smartphones!"}
{"input_text": "Absolutely! I find smartphones fascinating.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Sure, I don't mind participating in the survey", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Sorry, I'm not interested in doing a survey right now.", "output_text": "That's alright. Thanks for your time!"}
{"input_text": "Count me in! I enjoy discussing smartphones.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Okay, I can spare a few minutes for the survey.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Yes, I'd love to share my thoughts on smartphones.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Sure, I guess I can answer a few questions.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Not really interested", "output_text": "That's alright. Thanks for your time!"}
{"input_text": "Definitely! Smartphones are a topic I'm passionate about.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "I suppose I could participate in the survey.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Absolutely! I find smartphones fascinating.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Sure, I don't mind participating in the survey", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Sorry, I'm not interested in doing a survey right now.", "output_text": "That's alright. Thanks for your time!"}
{"input_text": "Count me in! I enjoy discussing smartphones.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Okay, I can spare a few minutes for the survey.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Yes, I'd love to share my thoughts on smartphones.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Sure, I guess I can answer a few questions.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Not really interested, but I'll do it anyway.", "output_text": "That's alright. Thanks for your time!"}
{"input_text": "Definitely! Smartphones are a topic I'm passionate about.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "I suppose I could participate in the survey.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Absolutely! I find smartphones fascinating.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Sure, I don't mind participating in the survey", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Sorry, I'm not interested in doing a survey right now.", "output_text": "That's alright. Thanks for your time!"}
{"input_text": "Count me in! I enjoy discussing smartphones.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Okay, I can spare a few minutes for the survey.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Yes, I'd love to share my thoughts on smartphones.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Sure, I guess I can answer a few questions.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Not really interested, but I'll do it anyway.", "output_text": "That's alright. Thanks for your time!"}
{"input_text": "Definitely! Smartphones are a topic I'm passionate about.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "I suppose I could participate in the survey.", "output_text": "Wonderful. So here is the first question: Which features of a smartphone would you like to talk about today?"}
{"input_text": "Today, I can't find anything to criticize about my smartphone; my camera takes stunning photos!", "output_text": "That's wonderful! Do you want any particular feature improved in the next release?"}
{"input_text": "The design of my current smartphone looks sleek and stylish.", "output_text": "That's wonderful! Do you want any particular feature improved in the nxt release?"}
{"input_text": "The ergonomics of my phone are excellent; it fits comfortably in my hand.", "output_text": "That's wonderful! Do you want any particular feature improved in the nxt release?"}
{"input_text": "Battery life has been surprisingly impressive lately; I'm quite satisfied with it.", "output_text": "That's wonderful! Do you want any particular feature improved in the nxt release?"}
{"input_text": "The performance of my smartphone has been smooth and efficient; it's working great.", "output_text": "That's wonderful! Do you want any particular feature improved in the nxt release?"}
{"input_text": "I don't have anything specific to say about my smartphone's camera; it's decent for my needs.", "output_text": "Understood. Do you think the camera feature is value for money?"}
{"input_text": "The design is okay; it's not extraordinary, but I don't have any major issues with it.", "output_text": "Understood. Do you think the design doesn't stand the test of time?"}
{"input_text": "The ergonomics are acceptable; it could be better, but it's not a significant concern.", "output_text": "Understood. Do you think the ergonomics of the phone can be improved?"}
{"input_text": "Battery life is average; it gets me through the day, but there's room for improvement.", "output_text": "Understood. Do you think the battery capacity feature should be increased in the next release?"}
{"input_text": "Performance is fine overall; there are occasional lags, but it's manageable.", "output_text": "Understood. Do you think the performance is just right for the value of the phone?"}
{"input_text": "The camera on my smartphone disappoints me; it struggles in low-light conditions.", "output_text": "Sorry to hear that. Are there any changes you would like to suggest?"}
{"input_text": "I'm not a fan of the design; it feels outdated and lacks innovation.", "output_text": "Sorry to hear that. Are there any changes you would like to suggest?"}
{"input_text": "The ergonomics are bothering me today; the phone feels uncomfortable during extended use.", "output_text": "Sorry to hear that. Are there any changes you would like to suggest?"}
{"input_text": "Battery life is quite poor; it drains quickly, and I'm frustrated with it.", "output_text": "Sorry to hear that. Are there any changes you would like to suggest?"}
{"input_text": "The performance is subpar; the phone is sluggish and struggles with multitasking.", "output_text": "Sorry to hear that. Are there any changes you would like to suggest?"}

kishantripathi

I am getting an error at the validate pipeline step. Can you help? This is the error:

midmurali57

Based on the limited information I see here, I assume you do not have enough resources to run the training job. Also, make sure your instance is in us-central1. Refer to this document and proceed:

Tune language foundation models | Vertex AI | Google Cloud

rubenszmm

The documentation says two things:

- the JSONL must be written as "records", that is, each example is its own JSON object

- you must have only one example per row

Check whether this works for you:

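import json

# df is assumed to be a pandas DataFrame with input_text and output_text columns
data_list = df.to_dict(orient='records')
with open('output_alpaca.jsonl', 'w', encoding='utf-8') as file:
    for example in data_list:
        # write exactly one JSON object per line
        file.write(json.dumps(example) + '\n')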
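
Before uploading, it may also help to check the file locally against the two rules above. The following is only a rough sketch, not Vertex AI's own validation: the names check_jsonl and REQUIRED_KEYS are made up for illustration, and the byte order mark and encoding checks are my assumption about what could break decoding at line 0, which nothing in this thread confirms. Line numbers start at 0 to match the error message.

```python
import json

# Field names taken from the examples posted in this thread.
REQUIRED_KEYS = ("input_text", "output_text")


def check_jsonl(path):
    """Return a list of (line_number, problem) tuples; an empty list means no problems were found."""
    problems = []
    with open(path, "rb") as fh:
        raw = fh.read()

    # Assumption: a UTF-8 byte order mark at the start of the file could
    # make the very first line undecodable for a strict parser.
    if raw.startswith(b"\xef\xbb\xbf"):
        problems.append((0, "file starts with a UTF-8 byte order mark"))
        raw = raw[3:]

    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return problems + [(0, f"file is not valid UTF-8: {exc}")]

    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # a single trailing newline is fine

    for number, line in enumerate(lines):
        if not line.strip():
            problems.append((number, "blank line"))
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
            continue
        for key in REQUIRED_KEYS:
            if not isinstance(record.get(key), str):
                problems.append((number, f"missing or non-string field {key!r}"))
    return problems
```

For example, calling check_jsonl('output_alpaca.jsonl') and printing each returned tuple shows which rows, if any, need fixing. If the list comes back empty, the file at least satisfies the "one JSON object per row" rule, and the cause is more likely elsewhere.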