Hi there!
I'm running into the following quota issue when running batch predictions with Vertex AI's text models:
Failed to run inference job. Exceeded rate limits: too many concurrent queries that use ML.GENERATE_TEXT table-valued function for this project
import vertexai
from vertexai.preview.language_models import TextGenerationModel

vertexai.init(project=project_id, location="us-central1", credentials=creds)

# Generation parameters for the model.
parameters = {
    "temperature": 0.2,
    "max_output_tokens": 2048,
    "top_p": 0.2,
    "top_k": 8,
}

model = TextGenerationModel.from_pretrained("text-bison")

# Input prompt files on GCS (matched by the wildcard) and the output prefix.
dataset = f"gs://{US_BUCKET_NAME}/metadata/{path}/{file_name}_{file_id}_prompts*"
destination_uri = f"gs://{US_BUCKET_NAME}/metadata/{path}/{file_name}_{file_id}_output"

# Apply the generation parameters to the batch job.
response = model.batch_predict(
    dataset=dataset,
    destination_uri_prefix=destination_uri,
    model_parameters=parameters,
)
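Since this is a concurrency quota, one mitigation is to retry the job with exponential backoff. A minimal sketch, assuming the quota error surfaces as a google.api_core ResourceExhausted (429) exception; the attempt count and delays are placeholders, not recommended values:

import time

from google.api_core import exceptions as gapi_exceptions

def batch_predict_with_backoff(model, dataset, destination_uri, parameters,
                               max_attempts=5, base_delay=30.0):
    # Retry the batch job when the concurrency quota is exhausted,
    # doubling the wait (30s, 60s, 120s, ...) between attempts.
    for attempt in range(max_attempts):
        try:
            return model.batch_predict(
                dataset=dataset,
                destination_uri_prefix=destination_uri,
                model_parameters=parameters,
            )
        except gapi_exceptions.ResourceExhausted:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))

response = batch_predict_with_backoff(model, dataset, destination_uri, parameters)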
Facing this issue as well, but in my case the error comes from BigQuery.
Same here in BigQuery. Does anyone know how to solve this?
Same here in BigQuery...
Job exceeded rate limits: Your project exceeded quota for concurrent queries that use ML.GENERATE_TEXT_EMBEDDING table-valued function
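The BigQuery quota here is specifically on concurrent queries that call the ML.GENERATE_TEXT / ML.GENERATE_TEXT_EMBEDDING table-valued functions, so one workaround is to serialize those queries instead of firing them in parallel. A minimal sketch with the google-cloud-bigquery client; the dataset, table, and model names below are placeholders, not from this thread:

from google.cloud import bigquery

client = bigquery.Client(project=project_id)

# Hypothetical prompt shards; each one becomes a separate ML.GENERATE_TEXT query.
shard_tables = ["my_dataset.prompts_0", "my_dataset.prompts_1"]

for table in shard_tables:
    sql = f"""
        SELECT *
        FROM ML.GENERATE_TEXT(
            MODEL `my_dataset.text_model`,  -- placeholder remote model
            (SELECT prompt FROM `{table}`),
            STRUCT(0.2 AS temperature, 2048 AS max_output_tokens)
        )
    """
    # .result() blocks until the job finishes, so at most one
    # ML.GENERATE_TEXT query is in flight at a time.
    client.query(sql).result()

This doesn't raise the quota, it just keeps you under it; for real throughput you'd presumably need a quota increase.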
Same here, any news?