How to implement exponential backoff with Gemini? Too many 504s

Hi,

I am encountering too many 504 deadline exceeded messages -- maybe half of my requests.

I would like to implement exponential backoff. There is an @backoff function decorator available for OpenAI. How do I achieve this with Gemini?

Using google-generativeai v 0.62, Gemini 1.5 Pro.

Just something to surround:

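import logging
import traceback

try:
    response = model.generate_content(prompt)
except Exception:
    # print_exc() prints and returns None; format_exc() returns the traceback text
    errormsg = traceback.format_exc()
    logging.error(f"error generating content: {errormsg}")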

`generate_content` takes `request_options` as an argument 

from google.api_core import retry

response = model.generate_content(content, request_options={'retry': retry.Retry()})

I ended up building one myself. It is just a few lines of code anyway.

import time
from google.api_core.exceptions import ResourceExhausted

def generate_with_backoff(self, contents):  # method name is illustrative
    sleep_attempts = 0
    sleep_time = 2  # seconds; doubled after each failed attempt
    while True:
        try:
            response = self.model.generate_content(contents)
            return response.text
        except ResourceExhausted as exc:
            print(f"ResourceExhausted exception occurred while processing property: {exc}")
            sleep_attempts += 1
            if sleep_attempts >= 5:
                print("ResourceExhausted exception occurred 5 times in a row. Exiting.")
                break
            time.sleep(sleep_time)
            sleep_time *= 2

    return "Error occurred while processing the request"
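
For anyone who wants the decorator style asked about in the original question, the same loop can be sketched as a reusable decorator using only the standard library. This is an illustration and not part of the Gemini SDK: the name `with_backoff` and all of its parameters are hypothetical, and it adds random jitter so that many clients do not retry in lockstep.

```python
import functools
import random
import time


def with_backoff(retry_on=(Exception,), max_attempts=5, base_delay=2.0,
                 factor=2.0, max_delay=60.0, sleep=time.sleep):
    """Retry the wrapped function with exponential backoff and jitter."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            delay = base_delay
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except retry_on:
                    if attempt == max_attempts:
                        raise  # out of attempts: surface the last error
                    # Wait a random time up to the current (capped) delay,
                    # then grow the delay for the next round.
                    sleep(random.uniform(0, min(delay, max_delay)))
                    delay *= factor
        return wrapper
    return decorator


# Hypothetical usage with the names from this thread:
# from google.api_core.exceptions import DeadlineExceeded, ResourceExhausted
#
# @with_backoff(retry_on=(DeadlineExceeded, ResourceExhausted))
# def generate(prompt):
#     return model.generate_content(prompt).text
```

Unlike the loop above, this re-raises the last exception once the attempts are used up instead of returning an error string, so the caller can tell a failure apart from a real response.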

Is this issue resolved? I tried both of the snippets above and neither worked for me.
I got my error while using the embedding model. Thanks in advance.

GoogleGenerativeAIError: Error embedding content: 504 Deadline Exceeded

My code block:

def create_subject_vector_db(sub_name):
    chunks_docs = load_and_chunk(f"/content/{sub_name}")
    CHROMA_DB_DIRECTORY = f'/content/CHROMA_DB_DIRECTORY_{sub_name}'

    embedding = GoogleGenerativeAIEmbeddings(model="models/embedding-001", google_api_key=google_api_key, request_options={'retry':retry.Retry()})
    vector_db = Chroma.from_documents(documents=chunks_docs, embedding=embedding, persist_directory=CHROMA_DB_DIRECTORY)
    vector_db.persist()
    retriever = vector_db.as_retriever()

    return retriever

The method

from google.api_core import retry

was described as helping with `generate_content()`. The new 504 indicates a gateway timeout between components in the data center that serve embeddings, so a retry is not going to be helpful in that scenario.

What is the solution to this?
I have tried splitting the load, i.e. the chunks, into smaller batches, tried the retry method with exponential backoff, and also tried running it from another device (assuming it was a network error).

I had the same problem when I passed the credentials as a parameter. When I passed the credentials through an environment variable, my code worked.

Bug:

embedding = GoogleGenerativeAIEmbeddings(model="models/embedding-001", google_api_key=google_api_key) 

This works for me:

from google.colab import userdata
import os
os.environ["GOOGLE_API_KEY"] = userdata.get('GOOGLE_API_KEY')

embedding = GoogleGenerativeAIEmbeddings(model="models/embedding-001")