
How to truncate input to TextEmbeddingModel?

I got the error google.api_core.exceptions.InvalidArgument: 400 Request is too large: 20000 total tokens allowed in a request, 89731 received, and my code looks like this:

from vertexai.language_models import TextEmbeddingInput, TextEmbeddingModel

model = TextEmbeddingModel.from_pretrained("textembedding-gecko@003")
text_embedding_input = TextEmbeddingInput(
    task_type="CLUSTERING", text=some_long_text)
embeddings = model.get_embeddings([text_embedding_input])

https://cloud.google.com/vertex-ai/generative-ai/docs/embeddings/get-text-embeddings#generative-ai-g...
Here it says "Each input text has a token limit of 3,072. Inputs longer than this length are silently truncated. You can also disable silent truncation by setting autoTruncate to false." I didn't set autoTruncate to false, and I also didn't find a place to explicitly set it to true. Why is my input not truncated automatically? And how should I truncate it to within the limit? Thanks!


According to the documentation you linked, inputs longer than 3,072 tokens should indeed be truncated silently. Note, though, that the error you're seeing cites a different limit: 20,000 tokens per request. A likely explanation is that this request-level limit is enforced before the per-input silent truncation, so a single 89,731-token input is rejected outright rather than truncated.

To work around this, you can truncate the input text yourself before sending it. Keep in mind the limit is measured in tokens, not characters, and the embeddings SDK doesn't expose the tokenizer, so a conservative characters-per-token estimate is a practical approximation. For example:

# The limit is 3,072 tokens, not characters. English text averages
# roughly 4 characters per token, so slice by an estimated character
# budget rather than by token count.
MAX_INPUT_TOKENS = 3072   # per-input token limit from the docs
CHARS_PER_TOKEN = 4       # rough heuristic; lower it to be safer

# Truncate the input text if it exceeds the estimated character budget
truncated_text = some_long_text[:MAX_INPUT_TOKENS * CHARS_PER_TOKEN]

# Create TextEmbeddingInput with the truncated text
text_embedding_input = TextEmbeddingInput(
    task_type="CLUSTERING", text=truncated_text)

# Get embeddings
embeddings = model.get_embeddings([text_embedding_input])

Truncating the input this way should keep the request under both limits and avoid the "Request is too large" error, though because the character count is only an estimate of the token count, you may discard slightly more (or less) text than strictly necessary. Also, depending on your google-cloud-aiplatform version, get_embeddings() accepts an auto_truncate keyword argument (defaulting to True) if you want to set it explicitly.
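If you need embeddings for the whole document rather than only its first ~3,072 tokens, an alternative is to split the text into chunks under the per-input limit and embed each chunk in its own request. A minimal sketch using the same characters-per-token heuristic (chunk_text, CHARS_PER_TOKEN, and some_long_text are illustrative names, not part of the SDK):

```python
MAX_INPUT_TOKENS = 3072   # per-input token limit from the docs
CHARS_PER_TOKEN = 4       # rough heuristic, not the real tokenizer

def chunk_text(text, max_tokens=MAX_INPUT_TOKENS, chars_per_token=CHARS_PER_TOKEN):
    """Split text into pieces of at most max_tokens * chars_per_token characters."""
    max_chars = max_tokens * chars_per_token
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

# Stand-in for the user's long document
some_long_text = "word " * 30000

chunks = chunk_text(some_long_text)

# Embed each chunk in a separate request so no single request
# approaches the 20,000-token request limit (uncomment to run
# against Vertex AI):
# embeddings = [
#     model.get_embeddings(
#         [TextEmbeddingInput(task_type="CLUSTERING", text=chunk)]
#     )[0]
#     for chunk in chunks
# ]
```

Whether you then keep one embedding per chunk or pool them (e.g. averaging) depends on what your clustering step expects.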