In Build a Multi-Modal GenAI Application: Challenge Lab, I cannot get the second task to trigger. The task says:
"Develop a second Python function called analyze_bouquet_image(image_path). This function will take the image path as input along with a text prompt to generate birthday wishes based on the image passed and send it to the gemini-2.0-flash-001 model. To ensure responses can be obtained as and when they are generated, enable streaming on the prompt requests."
I faced several issues with this task. When I first tried using from google import genai, I got an error saying Python couldn't find genai, so I had to install it. Once that was solved, I checked my progress and was told I needed to wait for the logs to generate, so I added the same logging code used in the previous labs. That didn't work at first either, for the same reason as genai: Python said it couldn't find google.cloud, so I again had to install it. Installing with the following command solved those issues at least:
/usr/bin/pip3 install --upgrade google-genai google-cloud-logging
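As a quick sanity check after installing, you can verify that both packages import cleanly. This is a hypothetical helper I'm sketching for illustration, not part of the lab:

```python
import importlib

def check_imports(modules):
    """Return a dict mapping each module name to True if it imports, False otherwise."""
    results = {}
    for name in modules:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            results[name] = False
    return results

# Both should report True once the pip install above has succeeded.
print(check_imports(["google.genai", "google.cloud.logging"]))
```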
So I've tried building the function with both genai and vertexai; both stream their result and both receive the image and the prompt.
genai version
from google import genai
from google.genai.types import HttpOptions, Part, Image
def analyze_bouquet_image(image_path: str):
    client = genai.Client(
        vertexai=True,
        project='PROJECT_ID',
        location='REGION',
        http_options=HttpOptions(api_version="v1")
    )
    chat = client.chats.create(model="gemini-2.0-flash-001")
    messages = [
        "Generate birthday wishes based on the provided image",
        Part.from_bytes(data=Image.from_file(location=image_path).image_bytes, mime_type="image/jpeg")
    ]
    for chunk in chat.send_message_stream(messages):
        print(chunk.text, end="")

analyze_bouquet_image(
    image_path='image.jpeg'
)
vertexai version
import vertexai
from vertexai.generative_models import GenerativeModel, Part, Image
import logging
from google.cloud import logging as gcp_logging
# ------ Below cloud logging code is for Qwiklab's internal use, do not edit/remove it. --------
# Initialize GCP logging
gcp_logging_client = gcp_logging.Client()
gcp_logging_client.setup_logging()
def analyze_bouquet_image(image_path: str):
    vertexai.init(
        project='PROJECT_ID',
        location='REGION',
    )
    multimodal_model = GenerativeModel("gemini-2.0-flash-001")
    messages = [
        "Generate birthday wishes based on the provided image",
        Part.from_image(Image.load_from_file(location=image_path))
    ]
    for chunk in multimodal_model.generate_content(contents=messages, stream=True):
        print(chunk.text, end="")

analyze_bouquet_image(
    image_path='image.jpeg'
)
So both versions of the function work: the signature matches, they call the right model, and both stream their responses. I think the issue lies with the logging, which I omitted above; I use the same Qwiklabs internal GCP logging code from the earlier labs.
import logging
from google.cloud import logging as gcp_logging
# ------ Below cloud logging code is for Qwiklab's internal use, do not edit/remove it. --------
# Initialize GCP logging
gcp_logging_client = gcp_logging.Client()
gcp_logging_client.setup_logging()
I'm wondering if I'm missing something in the setup. I've tried logging the streamed response, but I get a message similar to "failed to save log" in the console. When I don't try to manually log anything, that error doesn't appear.
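For reference, a sketch of how manual logging could work here, assuming setup_logging() forwards stdlib logging records to Cloud Logging (the helper name is made up for illustration): collect the streamed chunks first, then emit a single record rather than logging each chunk.

```python
import logging

def log_streamed_response(chunks):
    # Join the streamed text chunks into one string, then emit a single
    # log record. Once gcp_logging_client.setup_logging() has been called,
    # stdlib logging.info() records are forwarded to Cloud Logging.
    full_text = "".join(chunks)
    logging.info("Generated birthday wishes: %s", full_text)
    return full_text

# Usage inside analyze_bouquet_image, after streaming finishes:
# text_parts = [chunk.text for chunk in chat.send_message_stream(messages)]
# log_streamed_response(text_parts)
```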
Does anyone have any ideas about what the issue might be? It could be staring me in the eyes, and I'm just missing it. Any help would be appreciated.
Solved! Go to Solution.
Thank you for trying to help, @RAOKS. I finally solved it, even though the challenge prompt states: "To ensure responses can be obtained as and when they are generated, enable streaming on the prompt requests."
What finally solved Task #2 for me was using multimodal_model.start_chat() and calling send_message(stream=False). Given the prompt, that shouldn't be right, since streaming isn't enabled. 🤷‍♂️
If anyone else is having issues, this is what finally worked for me after trying to solve this lab around five or six times.
import argparse
import vertexai
from vertexai.generative_models import GenerativeModel, Part, Image

def analyze_bouquet_image(image_path: str):
    vertexai.init(
        project='PROJECT_ID',
        location='REGION',
    )
    multimodal_model = GenerativeModel("gemini-2.0-flash-001")
    messages = [
        "Generate a birthday wish based on the following image",
        Part.from_image(Image.load_from_file(location=image_path))
    ]
    chat = multimodal_model.start_chat()
    print(chat.send_message(content=messages, stream=False))

analyze_bouquet_image(
    image_path='IMAGE_PATH'
)
Please post the exact, complete code you executed for Task #1 and Task #2, so that we can test it and find a solution.
@RAOKS, for Task #1 I used the following, which created the required image. I saved the file as GenerateImage.py.
import argparse
import vertexai
from vertexai.preview.vision_models import ImageGenerationModel
def generate_bouquet_image(
    prompt: str
) -> vertexai.preview.vision_models.ImageGenerationResponse:
    vertexai.init(
        project='PROJECT_ID',
        location='REGION',
    )
    model = ImageGenerationModel.from_pretrained("imagen-3.0-generate-002")
    images = model.generate_images(
        prompt=prompt,
        # Optional parameters
        number_of_images=1,
        seed=1,
        add_watermark=False,
    )
    images[0].save(location='image.jpeg')
    return images

generate_bouquet_image(
    prompt='Create an image containing a bouquet of 2 sunflowers and 3 roses',
)
I run it using the following command
/usr/bin/python3 /GenerateImage.py
Now for Task #2, I have two versions. I use the same file, saved as GenerateText.py; I just swap the function code instead of creating two files.
genai version
from google import genai
from google.genai.types import HttpOptions, Part, Image
import logging
from google.cloud import logging as gcp_logging
# ------ Below cloud logging code is for Qwiklab's internal use, do not edit/remove it. --------
# Initialize GCP logging
gcp_logging_client = gcp_logging.Client()
gcp_logging_client.setup_logging()
def analyze_bouquet_image(image_path: str):
    client = genai.Client(
        vertexai=True,
        project='PROJECT_ID',
        location='REGION',
        http_options=HttpOptions(api_version="v1")
    )
    chat = client.chats.create(model="gemini-2.0-flash-001")
    messages = [
        "Generate birthday wishes based on the provided image",
        Part.from_bytes(data=Image.from_file(location=image_path).image_bytes, mime_type="image/jpeg")
    ]
    for chunk in chat.send_message_stream(messages):
        print(chunk.text, end="")

analyze_bouquet_image(
    image_path='image.jpeg'
)
And then we have the vertexai version
import vertexai
from vertexai.generative_models import GenerativeModel, Part, Image
import logging
from google.cloud import logging as gcp_logging
# ------ Below cloud logging code is for Qwiklab's internal use, do not edit/remove it. --------
# Initialize GCP logging
gcp_logging_client = gcp_logging.Client()
gcp_logging_client.setup_logging()
def analyze_bouquet_image(image_path: str):
    vertexai.init(
        project='PROJECT_ID',
        location='REGION',
    )
    multimodal_model = GenerativeModel("gemini-2.0-flash-001")
    messages = [
        "Generate birthday wishes based on the provided image",
        Part.from_image(Image.load_from_file(location=image_path))
    ]
    for chunk in multimodal_model.generate_content(contents=messages, stream=True):
        print(chunk.text, end="")

analyze_bouquet_image(
    image_path='image.jpeg'
)
For Task #2, I also had to either install or upgrade google-genai and google-cloud-logging.
/usr/bin/pip3 install --upgrade google-genai google-cloud-logging
Then I run the function like I did with Task #1.
/usr/bin/python3 /GenerateText.py
Again, both versions for Task #2 stream the response as the prompt requires, so I'm a bit lost.
The code below is working for me for Task #2; your Task #1 code is fine.
Please note that since the image generated in Task #1 is image.jpeg, I reference the same file in the code.
============================================
import vertexai
from vertexai.generative_models import GenerativeModel, Part, Image
def analyze_bouquet_image(project_id: str, location: str) -> str:
    vertexai.init(project=project_id, location=location)
    multimodal_model = GenerativeModel("gemini-2.0-flash-001")
    response = multimodal_model.generate_content(
        [
            Part.from_image(Image.load_from_file("image.jpeg")),
            "Generate birthday wishes based on this image?",
        ]
    )
    return response.text

project_id = "PROJECT_ID"
location = "REGION"
response = analyze_bouquet_image(project_id, location)
print(response)
=================================================
Output
=================================
Here are some birthday wishes based on the image of the bouquet:
**General & Sweet:**
* "Wishing you a birthday as bright and beautiful as this bouquet!"
* "Happy Birthday! May your day be filled with joy and sweetness, just like these roses and sunflowers."
* "Sending you sunshine and roses on your birthday! Have a wonderful day."
**Focusing on the Flowers:**
* "May your birthday bloom with happiness, just like these vibrant sunflowers!"
* "Wishing you a birthday filled with the beauty and fragrance of roses and the warmth of sunflowers."
* "Just like this bouquet, may your birthday be a perfect mix of joy, love, and happy moments."
**More Personal/Specific:**
* "Happy Birthday! I hope your day is as radiant as you are and filled with the vibrant colors of this bouquet."
* "Sending you a bunch of happy birthday wishes! May your day be as lovely and cheerful as these sunflowers."
**Simple and Short:**
* "Happy Birthday! Hope your day is beautiful!"
* "Wishing you a flower-filled birthday!"
root@3c8356f0b9ff:/home/student#
================================================
I have modified your code and it produces output, but the progress check for Task #2 is still not passing.
================================================
import vertexai
from vertexai.generative_models import GenerativeModel, Part, Image
#import logging
#from google.cloud import logging as gcp_logging
# ------ Below cloud logging code is for Qwiklab's internal use, do not edit/remove it. --------
# Initialize GCP logging
#gcp_logging_client = gcp_logging.Client()
#gcp_logging_client.setup_logging()
def analyze_bouquet_image(image_path: str):
    vertexai.init(
        project='PROJECT_ID',
        location='REGION',
    )
    multimodal_model = GenerativeModel("gemini-2.0-flash-001")
    messages = [
        "Generate birthday wishes based on the provided image",
        Part.from_image(Image.load_from_file(location=image_path))
    ]
    for chunk in multimodal_model.generate_content(contents=messages, stream=True):
        print(chunk.text, end="")

analyze_bouquet_image(
    image_path='image.jpeg'
)
=================
Output
====================
Here are some birthday wishes inspired by the image of the sunflowers and roses:
**Option 1 (Simple & Bright):**
"Wishing you a birthday as bright and beautiful as these sunflowers and roses! Hope your day is filled with sunshine and joy."
**Option 2 (Elegant & Warm):**
"Happy Birthday! May your day be as lovely and special as this bouquet of roses and sunflowers. Wishing you a year filled with beauty and happiness."
**Option 3 (Focus on Growth & Happiness):**
"Happy Birthday! Just like these sunflowers reaching for the sun, may you always strive for joy and growth. And may the roses bring you all the love and beauty this year!"
**Option 4 (Poetic):**
"To you on your birthday: May your day be as radiant as these sunflowers, as loving as the roses, and as beautiful as you are. Happy Birthday!"
**Option 5 (Personal & Heartfelt):**
"Sending you the warmest birthday wishes! These sunflowers remind me of your radiant personality, and the roses symbolize the love and appreciation I have for you. Have a wonderful day!"
root@3c8356f0b9ff:/home/student#
================================
In your code you are generating messages; the Task #2 check expects a response.
Well tried, keep learning, all the best. You always learn when errors are generated.
Task 2 not working for me
# Task #2 code with explanation
==================================
import vertexai
from vertexai.generative_models import GenerativeModel, Part, Image
def analyze_bouquet_image(project_id: str, location: str) -> str:
    vertexai.init(project=project_id, location=location)
    multimodal_model = GenerativeModel("gemini-2.0-flash-001")
    response = multimodal_model.generate_content(
        [
            Part.from_image(Image.load_from_file("image.jpeg")),
            "Generate birthday wishes based on this image?",
        ]
    )
    return response.text

project_id = "PROJECT_ID"
location = "REGION"
response = analyze_bouquet_image(project_id, location)
print(response)
==========================================================
The Task #2 script uses Google Vertex AI and the Gemini model to generate birthday wishes based on the image file image.jpeg generated in Task #1.
The process is as follows:
1. Initialize Vertex AI with a project ID and region.
2. Load the Gemini AI model for generative tasks.
3. Provide an image (`image.jpeg`) and a text prompt to the model.
4. Generate and return a response based on the image and prompt.
--------------------------------------------------------------------------------
# Code Explanation - Task #2

# 1. Import Required Libraries
#python
----------------------------------
import vertexai
from vertexai.generative_models import GenerativeModel, Part, Image
----------------------------------
- vertexai → Enables interaction with Google Vertex AI.
- GenerativeModel → Loads the Gemini model to process text and images.
- `Part` & `Image` → Help process images and pass them into the model.

# 2. Define the Function
#python
----------------------------------
def analyze_bouquet_image(project_id: str, location: str) -> str:
----------------------------------
- This function analyzes an image and generates a birthday wish.
- It takes `project_id` and `location` as parameters, ensuring it runs within the correct Google Cloud environment.

# 3. Initialize Vertex AI
#python
----------------------------------
vertexai.init(project=project_id, location=location)
----------------------------------
- Connects to Google Cloud Vertex AI using the specified `project_id` and `location`.

# 4. Load the Gemini Model
#python
----------------------------------
multimodal_model = GenerativeModel("gemini-2.0-flash-001")
----------------------------------
- Loads gemini-2.0-flash-001, an AI model designed for fast generative responses.

# 5. Generate Content Using an Image
#python
----------------------------------
response = multimodal_model.generate_content(
    [
        Part.from_image(Image.load_from_file("image.jpeg")),
        "Generate birthday wishes based on this image?",
    ]
)
----------------------------------
- Image.load_from_file("image.jpeg") → Loads the image file.
- Part.from_image(...) → Converts the image into a format that the model understands.
- "Generate birthday wishes based on this image?" → Provides context to the AI model.
- generate_content(...) → Passes both the image and text prompt to Gemini AI for processing.

# 6. Return the Generated Response
#python
----------------------------------
return response.text
----------------------------------
- Returns the AI-generated birthday wish as a text response.

# 7. Define Project ID & Location
#python
----------------------------------
project_id = "PROJECT_ID"
location = "REGION"
----------------------------------
- These should be replaced with actual values from your Google Cloud project.

# 8. Call the Function & Print Output
#python
----------------------------------
response = analyze_bouquet_image(project_id, location)
print(response)
----------------------------------
- Calls the function with the correct project ID and location.
- Prints the AI-generated birthday wish.

# Important
----------------------------------
Replace `"PROJECT_ID"` and `"REGION"` with actual values (otherwise, Vertex AI won't connect).
Check that `image.jpeg` exists in the correct path (to avoid file-not-found errors).
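To guard against that file-not-found case, a small pre-check could be added before calling the model. This is a hypothetical helper sketched for illustration, not part of the lab code:

```python
import os

def ensure_image_exists(image_path):
    """Return True if the image file exists, otherwise raise a clear error."""
    if not os.path.isfile(image_path):
        raise FileNotFoundError(
            f"{image_path} not found - run the Task #1 script first to generate it."
        )
    return True

# Example: call this at the top of analyze_bouquet_image
# ensure_image_exists("image.jpeg")
```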
I was getting an import error for the genai version.
If you want to use Google's genai library, you first need to install it using pip3: open a terminal and run the command below. Keep in mind that while the code above does work, it does not work as a solution to Task #2.
/usr/bin/pip3 install --upgrade google-genai
If you want a fully automated script for the entire lab, I have uploaded it in the post below; please refer to it. Keep learning. All the best.
Build a Multi-Modal GenAI Application: Challenge Lab (bb-ide-genai-004) solution code
Please give me correct, working code for the tasks.
I've marked what worked for me as the solution; give that a look.
Please refer to the post below.
Build a Multi-Modal GenAI Application: Challenge Lab (bb-ide-genai-004) solution code
I posted working code for Task #2, with an explanation, last Saturday (19 Apr 25).
As you want both codes (Task #1 and Task #2) and the procedure to execute them, I have posted that today.
I am facing this problem when running Task #1.
Unsupported region for Vertex AI, select from frozenset({'australia-southeast2', 'europe-west2', 'europe-west9', 'asia-southeast1', 'australia-southeast1', 'me-central2', 'us-east1', 'europe-west1', 'me-west1', 'europe-central2', 'global', 'northamerica-northeast2', 'us-south1', 'us-east4', 'asia-northeast3', 'asia-east2', 'us-central1', 'asia-northeast2', 'europe-southwest1', 'southamerica-west1', 'us-west2', 'europe-west3', 'europe-north1', 'europe-west12', 'us-west4', 'me-central1', 'southamerica-east1', 'asia-southeast2', 'us-west1', 'us-west3', 'asia-northeast1', 'europe-west4', 'us-east5', 'europe-west6', 'asia-east1', 'europe-west8', 'africa-south1', 'asia-south1', 'northamerica-northeast1'})
From the error it is clear you have not replaced PROJECT_ID and REGION with your lab's details. Replace PROJECT_ID with your lab's project ID and REGION with your lab's region. Please do the same for the Task #1 and Task #2 code.
If you want a fully automated script for the entire lab, I have uploaded it in the post below; please refer to it. Keep learning. All the best.
Build a Multi-Modal GenAI Application: Challenge Lab (bb-ide-genai-004) solution code