
batchPredictionJobs using BigQuery

Hi,

I'm trying to run a batch text prediction job using the text-bison model, following the instructions here:

https://cloud.google.com/vertex-ai/docs/generative-ai/text/batch-prediction-genai

The batch job works fine when the input format is JSONL read from a GCS location; with BigQuery as the source, however, I keep getting the following error:

  • Failed to import data. Not found: Dataset f70b83447d1a80046-tp:llm_bp_tenant_dataset_3263592060199895040 was not found in location US

Apparently these IDs are auto-generated, as they have nothing to do with my source dataset. Any idea why I'm getting this error?

For reference, my Python code is:

import vertexai
from vertexai.preview.language_models import TextGenerationModel

vertexai.init(project="PROJECT_ID", location="us-central1")
text_model = TextGenerationModel.from_pretrained("text-bison")

# Batch prediction reading from and writing to BigQuery tables.
batch_prediction_job = text_model.batch_predict(
    instances_format="bigquery",
    dataset="bq://PROJECT_ID.Dataset_ID.InputTable_ID",
    destination_uri_prefix="bq://PROJECT_ID.Dataset_ID.OutputTable_ID",
    model_parameters={
        "maxOutputTokens": 1000,
        "temperature": 0.0,
        "topP": 0.95,
        "topK": 1,
    },
)

print(batch_prediction_job.display_name)
print(batch_prediction_job.resource_name)
print(batch_prediction_job.state)
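
In case it's relevant, here is a minimal sketch of how the source dataset's location can be checked with the BigQuery client (same PROJECT_ID / Dataset_ID placeholders as above):

from google.cloud import bigquery

# Print where the source dataset lives; a multi-region location
# shows up as "US", a single region as e.g. "us-central1".
client = bigquery.Client(project="PROJECT_ID")
dataset = client.get_dataset("PROJECT_ID.Dataset_ID")
print(dataset.location)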

Solved

2 REPLIES

I have also tried using BatchPredictionJob.create, but ended up with the same error. For some reason it creates a temporary BigQuery dataset and then cannot read it!

Code:

import google.cloud.aiplatform as aip
import vertexai

vertexai.init(project="PROJECT_ID", location="us-central1")

# Same job through the lower-level aiplatform API, pointing at the
# published text-bison model.
aip.BatchPredictionJob.create(
    job_display_name='losing my mind',
    model_name='publishers/google/models/text-bison',
    instances_format='bigquery',
    predictions_format='bigquery',
    bigquery_source='bq://PROJECT_ID.Dataset_ID.InputTable_ID',
    bigquery_destination_prefix='bq://PROJECT_ID.Dataset_ID.OutputTable_ID',
)

It turned out that the dataset was in a multi-region location (US). Creating a new dataset in a specific region resolved the issue.
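
In case it helps anyone else, a minimal sketch of creating the replacement dataset pinned to a single region (assuming us-central1 to match the vertexai.init() location; the dataset name here is just an example):

from google.cloud import bigquery

# Create a dataset in a single region so the Vertex AI batch
# prediction job running in us-central1 can resolve it.
client = bigquery.Client(project="PROJECT_ID")
dataset = bigquery.Dataset("PROJECT_ID.my_regional_dataset")  # hypothetical name
dataset.location = "us-central1"
client.create_dataset(dataset, exists_ok=True)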