I am using `models/gemini-1.5-pro-latest`.
I'm trying to send a multimodal prompt made up of a string and a PDF. This is a snippet of my code:
import google.generativeai as genai
model = genai.GenerativeModel(model_name='models/gemini-1.5-pro-latest',
                              generation_config=generation_config,
                              safety_settings=safety_settings)
with open("document.pdf", "rb") as pdf:
    pdf = pdf.read()
prompt_parts = [ context_text , pdf ]
response = model.generate_content(prompt_parts)
Assume all my configs etc. are correct; the example worked fine when the prompt contained only `context_text`.
The error I get is:
---> response = model.generate_content(prompt_parts)
...
/usr/local/lib/python3.10/dist-packages/google/generativeai/types/content_types.py in to_blob(blob)
...
TypeError: Could not create `Blob`, expected `Blob`, `dict` or an `Image` type(`PIL.Image.Image` or `IPython.display.Image`).
Got a: <class 'bytes'>
...
I believe you need to wrap your PDF into a `Part` object.
Something like the following:
import google.generativeai as genai
from vertexai.generative_models import Part
model = genai.GenerativeModel(model_name='models/gemini-1.5-pro-latest',
                              generation_config=generation_config,
                              safety_settings=safety_settings)
# NEW SECTION, LOAD INTO PART OBJECT
pdf_file_uri = "gs://path/to/your/document.pdf"
pdf = Part.from_uri(pdf_file_uri, mime_type="application/pdf")
###
prompt_parts = [ context_text , pdf ]
response = model.generate_content(prompt_parts)
Great suggestion; however, this returned the same error when I stayed on the same SDK (`google.generativeai`).
I can get it to work if I move away from `google.generativeai` and over to `vertexai.generative_models`. I am a bit confused about why there are two nearly identical APIs; I'm guessing I should plan to transition fully to Vertex going forward.
Using the `Part` object from your suggestion, this ended up working (with the `vertexai` Python library):
import vertexai
from vertexai.generative_models import GenerativeModel, Part
from IPython.display import Markdown
# set configs, etc.
# your recommendation:
pdf_file_uri = f"gs://{gcs_bucket_path}/{slides_pdf}"
document = Part.from_uri(
    pdf_file_uri,
    mime_type="application/pdf",
)
# migrated to vertexai
# https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models
vertexai.init(project="my-project-id", location="us-central1")
model = GenerativeModel("gemini-1.5-pro-preview-0409")  # model in preview
response = model.generate_content(
    [document, "summarize this document"],
    generation_config=generation_config,
    safety_settings=safety_settings,
    stream=False,
)
Markdown(response.text)
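For anyone who wants to stay on `google.generativeai`: the `TypeError` in the original question says `to_blob` accepts a `dict` as well as a `Blob`, so one thing to try is wrapping the raw bytes in a dict with `mime_type` and `data` keys instead of passing `bytes` directly. The sketch below only builds that dict. The helper names `make_pdf_part` and `load_pdf_part` are hypothetical (not part of either SDK), and the `%PDF-` header check is an extra sanity check I added. I have not confirmed in this thread that the Gemini API endpoint accepts inline PDF data for this model version, so treat it as something to test rather than a known fix.

```python
from pathlib import Path

PDF_MIME_TYPE = "application/pdf"


def make_pdf_part(pdf_bytes: bytes) -> dict:
    """Wrap raw PDF bytes in the dict form that the TypeError lists as accepted.

    Hypothetical helper: the dict keys follow the blob-dict shape
    (mime_type + data); nothing here calls the API.
    """
    # Extra sanity check (an assumption, not required by the SDK):
    # PDF files start with the "%PDF-" magic header.
    if not pdf_bytes.startswith(b"%PDF-"):
        raise ValueError("data does not look like a PDF (missing %PDF- header)")
    return {"mime_type": PDF_MIME_TYPE, "data": pdf_bytes}


def load_pdf_part(path: str) -> dict:
    """Read a local PDF file and wrap its bytes with make_pdf_part."""
    return make_pdf_part(Path(path).read_bytes())
```

With a configured `model` from `google.generativeai`, the call from the question would then become `model.generate_content([context_text, load_pdf_part("document.pdf")])`. If the endpoint rejects inline PDFs, the `vertexai` approach above with `Part.from_uri` is the route that is confirmed to work in this thread.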