Build a Multi-Modal GenAI Application Task 2 Issue

In Build a Multi-Modal GenAI Application: Challenge Lab, I cannot get the second task's progress check to trigger. The task says:

"Develop a second Python function called analyze_bouquet_image(image_path). This function will take the image path as input along with a text prompt to generate birthday wishes based on the image passed and send it to the gemini-2.0-flash-001 model. To ensure responses can be obtained as and when they are generated, enable streaming on the prompt requests."

I faced several issues with this task. When I first tried from google import genai, Python raised an error saying it couldn't find genai, so I had to install the package. Once that was solved, I checked my progress and was told I needed to wait for the logs to generate, so I added the same logging code used in the previous labs. That didn't work at first either, for the same reason as genai: Python said it couldn't find google.cloud, so I had to install that as well. Installing both with the following command solved those issues at least:

 

/usr/bin/pip3 install --upgrade google-genai google-cloud-logging

 

So I've tried building the function with both genai and with vertexai; both stream their results and both pick up the image and the prompt.

genai version

 

from google import genai
from google.genai.types import HttpOptions, Part, Image

def analyze_bouquet_image(image_path: str):
    client = genai.Client(
        vertexai=True,
        project='PROJECT_ID',
        location='REGION',
        http_options=HttpOptions(api_version="v1")
    )
    
    chat = client.chats.create(model="gemini-2.0-flash-001")
    
    messages = [
        "Generate birthday wishes based on the provided image",
        Part.from_bytes(data=Image.from_file(location=image_path).image_bytes, mime_type="image/jpeg")
    ]

    # Streaming enabled: chunks print as they are generated
    for chunk in chat.send_message_stream(messages):
        print(chunk.text, end="")

analyze_bouquet_image(
    image_path='image.jpeg'
)

 


vertexai version

 

import vertexai
from vertexai.generative_models import GenerativeModel, Part, Image

import logging
from google.cloud import logging as gcp_logging

# ------  Below cloud logging code is for Qwiklab's internal use, do not edit/remove it. --------
# Initialize GCP logging
gcp_logging_client = gcp_logging.Client()
gcp_logging_client.setup_logging()

def analyze_bouquet_image(image_path: str):
    vertexai.init(
        project='PROJECT_ID',
        location='REGION',
    )
    
    multimodal_model = GenerativeModel("gemini-2.0-flash-001")
    
    messages = [
        "Generate birthday wishes based on the provided image",
        Part.from_image(Image.load_from_file(location=image_path))
    ]

    # stream=True yields response chunks as they are generated
    for chunk in multimodal_model.generate_content(contents=messages, stream=True):
        print(chunk.text, end="")

analyze_bouquet_image(
    image_path='image.jpeg'
)

 

So both versions of the function work: the signature matches, they call the right model, and both stream their responses. I think the issue is with the logging, which I omitted above; I use the same Qwiklab's internal GCP logging code from the earlier labs.

 

import logging
from google.cloud import logging as gcp_logging

# ------  Below cloud logging code is for Qwiklab's internal use, do not edit/remove it. --------
# Initialize GCP logging
gcp_logging_client = gcp_logging.Client()
gcp_logging_client.setup_logging()

 

I'm wondering if I'm missing something in the setup. When I try to log the streamed response manually, I get a message along the lines of "failed to save log" in the console; when I don't log anything manually, that error doesn't appear.

Does anyone have any ideas about what the issue might be? It could be staring me in the face, and I'm just missing it. Any help would be appreciated.

1 ACCEPTED SOLUTION

Thank you for trying to help @RAOKS. I finally solved it, even though the challenge prompt states: "To ensure responses can be obtained as and when they are generated, enable streaming on the prompt requests."

What finally solved Task #2 for me was to use multimodal_model.start_chat() and call send_message(stream=False), which, given the prompt, shouldn't be right, since streaming isn't enabled. 🤷‍♂️

If anyone else is having issues, this is what finally worked for me after five or six attempts at this lab.

 

import vertexai
from vertexai.generative_models import GenerativeModel, Part, Image

def analyze_bouquet_image(image_path: str):
    vertexai.init(
        project='PROJECT_ID',
        location='REGION',
    )
    
    multimodal_model = GenerativeModel("gemini-2.0-flash-001")
    
    messages = [
        "Generate a birthday wish based on the following image",
        Part.from_image(Image.load_from_file(location=image_path))
    ]

    chat = multimodal_model.start_chat()

    # Counter-intuitively, the progress check passed with stream=False
    print(chat.send_message(content=messages, stream=False))

analyze_bouquet_image(
    image_path='IMAGE_PATH'
)

 

 

 

