I can't find any examples online of how an input JSONL file is supposed to look for a batch prediction job. When I tried with this:
{'instances': {'mimeType': 'text/plain', 'content': '0'}}
{'instances': {'mimeType': 'text/plain', 'content': '1'}}
{'instances': {'mimeType': 'text/plain', 'content': '1'}}
{'instances': {'mimeType': 'text/plain', 'content': '2'}}
{'instances': {'mimeType': 'text/plain', 'content': '8'}}
{'instances': {'mimeType': 'text/plain', 'content': 'a'}}
{'instances': {'mimeType': 'text/plain', 'content': 'a'}}
{'instances': {'mimeType': 'text/plain', 'content': 'h'}}
{'instances': {'mimeType': 'text/plain', 'content': 'q'}}
{'instances': {'mimeType': 'text/plain', 'content': 's'}}
{'instances': {'mimeType': 'text/plain', 'content': 'y'}}
{'instances': {'mimeType': 'text/plain', 'content': 'y'}}
I got an error email saying
Error Messages: BatchPrediction could not start because no valid instances
were found in the input file.
Is there some other way this should look for it to work? Maybe like
{
'instances': [
{'mimeType': 'text/plain', 'content': 'a'}
{'mimeType': 'text/plain', 'content': 'bc'}
{'mimeType': 'text/plain', 'content': 'de'}]
}
Hello sangersteel,
It is not possible to use a JSONL file for batch prediction with text classification. Only the CSV file format is accepted for text classification, as indicated in the AutoML Natural Language documentation [1]. Each row of the CSV file should contain exactly one input file. The CSV file and each input file need to be stored in your Cloud Storage bucket.
[1] https://cloud.google.com/natural-language/automl/docs/predict#batch_prediction
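
To illustrate the layout described above, here is a minimal Python sketch that writes each text snippet to its own file and builds a CSV with one Cloud Storage URI per row. The function name build_batch_input, the file names, and the bucket path gs://my-bucket/batch are hypothetical placeholders, and the sketch only prepares the files locally; please check the exact CSV requirements against the documentation in [1] before relying on it.

```python
import csv
from pathlib import Path


def build_batch_input(texts, out_dir, bucket_prefix):
    """Write one .txt file per text and a CSV listing one gs:// URI per row.

    bucket_prefix is an assumed destination such as "gs://my-bucket/batch";
    nothing is uploaded here, the files are only created under out_dir.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    prefix = bucket_prefix.rstrip("/")

    uris = []
    for i, text in enumerate(texts):
        name = f"input_{i:04d}.txt"  # hypothetical naming scheme
        (out_dir / name).write_text(text, encoding="utf-8")
        uris.append(f"{prefix}/{name}")

    csv_path = out_dir / "batch_input.csv"
    with csv_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for uri in uris:
            writer.writerow([uri])  # exactly one input file per row

    return csv_path, uris
```

After running it, the contents of the output directory (the .txt files and batch_input.csv) would still need to be copied to the bucket, for example with gsutil cp, so that the URIs in the CSV point at files that actually exist there.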