Hi, I'm using Python with google-cloud-aiplatform==1.39.0.
I am attempting to use TextGenerationModel.batch_predict() with text-bison@002 on JSONL files in GCS (roughly as in the sketch below). Each file has 25k lines and is anywhere from ~27-40 MB.
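For context, here is a minimal sketch of the kind of call I mean. The project, region, bucket paths, and model parameters are placeholders, not my actual values:

```python
import vertexai
from vertexai.language_models import TextGenerationModel

# Placeholder project/region -- substitute your own
vertexai.init(project="my-project", location="us-central1")

model = TextGenerationModel.from_pretrained("text-bison@002")

# Each JSONL line is of the form {"prompt": "..."}
job = model.batch_predict(
    dataset="gs://my-bucket/input/records-000.jsonl",
    destination_uri_prefix="gs://my-bucket/output/",
    # Example parameters only; values here are assumptions
    model_parameters={"temperature": 0.2, "maxOutputTokens": 256},
)
print(job.resource_name)
print(job.state)
```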
Unfortunately, when I ran this on just one file it took about 7 hours to complete, and there doesn't seem to be any way to configure Vertex AI to improve the performance.
I don't fully understand what is happening under the hood, so I then tried passing a list of smaller files instead (via the dataset parameter, as in the variant below), each containing a portion of the 25k records, and that actually took longer (~9 hours).
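The list-of-files variant I tried looked roughly like this (again with placeholder URIs):

```python
# Same call, but dataset is a list of GCS URIs instead of a single file
job = model.batch_predict(
    dataset=[
        "gs://my-bucket/input/records-000.jsonl",
        "gs://my-bucket/input/records-001.jsonl",
        "gs://my-bucket/input/records-002.jsonl",
    ],
    destination_uri_prefix="gs://my-bucket/output/",
    model_parameters={"temperature": 0.2, "maxOutputTokens": 256},
)
```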
Is this performance to be expected? Is there anything I can do? Considering I have millions of records to process (and would like to run iterative experiments on them), runtimes like this make the product prohibitive for my use case.
The only thing I haven't tried yet is using the non-Vertex AI tools for this work, since those appear to let you tune the machine types.
Any suggestions?