I have a Python code that utilizes Vertex AI to generate queries based on my data, and it's functioning well. However, I'd like to fine-tune it for improved performance. Currently, I input the table schema and a natural language request.
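To give a sense of the flow, it's roughly along these lines (the model name, project, schema, and question below are just placeholders; my real prompt is built from my own tables):

```python
import vertexai
from vertexai.language_models import TextGenerationModel

# Project/region, model name, schema, and question are all placeholders here.
vertexai.init(project="my-project", location="us-central1")
model = TextGenerationModel.from_pretrained("text-bison@001")

prompt = (
    "Given this table schema:\n"
    "users(id INT, name STRING, signup_date DATE)\n\n"
    "Write a SQL query for: how many users signed up last week?"
)
response = model.predict(prompt, temperature=0.2, max_output_tokens=256)
print(response.text)
```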
In an attempt to fine-tune the model, I used the Overview > Idioms > Tune > Create new model option. I provided a jsonl file with around 10 examples of natural language input and corresponding SQL output, keeping all the configurations at default. However, during the ENCODE DATA step, I encountered an error: "[ERROR]: Failed with error: [object of type 'NoneType' has no len()]." I'm unsure how to address this issue, as I simply want to train my model for better responses. Any guidance would be appreciated.
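For reference, each line of my jsonl file looks roughly like this (the field names follow the examples I found for text model tuning; the schema and query here are made up):

{"input_text": "Schema: users(id INT, name STRING, signup_date DATE). Question: how many users signed up last week?", "output_text": "SELECT COUNT(*) FROM users WHERE signup_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)"}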
The error message you're seeing, "[object of type 'NoneType' has no len()]", typically occurs when you try to perform a length operation (like len()) on a value that is `None` (i.e., of type `NoneType`). In Python, `None` is a special constant that represents the absence of a value, or a null value.
When you encounter this error, it means that the code is trying to find the length of an object or variable that is expected to have a length (like a string, list, or other sequence), but it's actually `None` instead of the expected type.
To fix this issue:
Check the variable assignment and ensure that the variable causing the error is assigned a value that is not `None` and is of the expected type.
If the error is occurring within a function or method call, verify that the return value is not unexpectedly `None`. It's possible that a function is returning `None` instead of the expected data.
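For illustration, here's the kind of situation that produces this exact message (the variable name is just an example):

```python
data = None  # e.g. a field the encoder expected but that was never populated

# Calling len() directly on None reproduces the error from the pipeline log:
#   TypeError: object of type 'NoneType' has no len()
try:
    print(len(data))
except TypeError as exc:
    print(exc)  # object of type 'NoneType' has no len()

# The fix is to make sure the value is populated (or guarded) before use:
if data is not None:
    print(len(data))
```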
I get the same error when trying to fine-tune a chat model through Vertex AI Studio. I go to "Language" > "Tune and Distill" > "Create Tuned Model", then specify the model details and tuning dataset. A pipeline is created and runs until it reaches the "dataset-encoder" step of the tuning graph, and after a few minutes I get this error.
I have initiated a fine-tuning job using the exact same dataset (from the same gs bucket location) and in the same regions through the Python SDK, specifically the TextGenerationModel.from_pretrained().tune_model() method, without getting this error. Am I missing something in how I set things up in Vertex AI Studio (the GUI seems pretty intuitive), or is there some difference in the way Studio starts a pipeline and the way it's done through the API?
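For reference, the SDK call I'm running looks roughly like this (project, regions, bucket path, and step count are placeholders):

```python
import vertexai
from vertexai.language_models import TextGenerationModel

# Placeholder project, bucket path, and regions; substitute your own values.
vertexai.init(project="my-project", location="us-central1")

model = TextGenerationModel.from_pretrained("text-bison@001")
model.tune_model(
    training_data="gs://my-bucket/tuning_data.jsonl",
    train_steps=100,
    tuning_job_location="europe-west4",
    tuned_model_location="us-central1",
)
```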
I made a mistake in my last post. I get the error described in the original post running a chat-bison tuning job, through either the API or Vertex Studio. I don't get the error running a text-bison job through either method.
So the issue here would seem to have to do with the difference between text and chat models, at least for natural language. Could the chat problem also apply to code chat models? And is there some reason that data that's good for a single-turn model training job would cause an error for a multi-turn model job?
And is there some reason that data that's good for a single-turn model training job would cause an error for a multi-turn model job?
It turns out that the answer to this is "yes". Based on this issue, it looks like chat models expect each line of a jsonl file to be an object containing a "messages" list, in this format:
{"messages": [{"author": "user", "content": "what do you do"}, {"author": "assistant", "content": "I assist"}]}
The "author" value for each element of the "messages" list can be either "user" or "assistant"; I can't tell if something akin to a "system" role is also permitted for, for instance, prompts.
As far as I can tell, this isn't made clear in the documentation on tuning a chat model, although I guess it can kind of be intuited from the documentation on inputs to a foundation chat model. So, going back to this from the original post:
Currently, I input the table schema and a natural language request.
Given that you're getting the same error I was with a non-chat training data structure, and that it sounds like you, like me, assumed you could just use an input/output-style jsonl structure, is it possible you also need to update your data to match the request structure indicated in the basic API documentation for code chat?
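If so, converting an existing single-turn dataset is straightforward. Here's a rough sketch (the file names and the input_text/output_text source keys are assumptions about what your data looks like):

```python
import json

# Rough sketch: convert single-turn input_text/output_text examples into the
# chat-style "messages" structure. File names and source keys are assumptions.
with open("text_tuning_data.jsonl") as src, open("chat_tuning_data.jsonl", "w") as dst:
    for line in src:
        example = json.loads(line)
        chat_example = {
            "messages": [
                {"author": "user", "content": example["input_text"]},
                {"author": "assistant", "content": example["output_text"]},
            ]
        }
        dst.write(json.dumps(chat_example) + "\n")
```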