Solved: Build a Multi-Modal GenAI Application: Challenge L...

RAOKS · 04-21-2025 06:12 AM

#======================================================================
# Build a Multi-Modal GenAI Application: Challenge Lab
#======================================================================
# Working code with details.
# After the Task #1 code has run, open image.jpeg from the file
# explorer in the lab screen to see the generated image.
#----------------------------------------------------------------------
# Run the code below in the lab terminal, e.g. root@581cced7a969:/home/student#
#======================================================================
# Set the PROJECT_ID and REGION variables
#======================================================================

export PROJECT_ID=$(gcloud config get-value project)
export REGION=$(gcloud compute project-info describe --format="value(commonInstanceMetadata.items[google-compute-default-region])")

#======================================================================
# Task #1 code: writes GenerateImage.py. The heredoc delimiter is
# unquoted, so the shell fills in the PROJECT_ID and REGION values.
#======================================================================
cat > GenerateImage.py <<EOF_TASK_ONE
import vertexai
from vertexai.preview.vision_models import (
    ImageGenerationModel,
    ImageGenerationResponse,
)


def generate_bouquet_image(
    project_id: str, location: str, output_file: str, prompt: str
) -> ImageGenerationResponse:
    """Generate an image using a text prompt.

    Args:
      project_id: Google Cloud project ID, used to initialize Vertex AI.
      location: Google Cloud region, used to initialize Vertex AI.
      output_file: Local path to the output image file.
      prompt: The text prompt describing what you want to see.
    """

    vertexai.init(project=project_id, location=location)

    model = ImageGenerationModel.from_pretrained("imagen-3.0-generate-002")

    images = model.generate_images(
        prompt=prompt,
        # Optional parameters
        number_of_images=1,
        seed=1,
        add_watermark=False,
    )

    images[0].save(location=output_file)

    return images

generate_bouquet_image(
    project_id='$PROJECT_ID',
    location='$REGION',
    output_file='image.jpeg',
    prompt='Create an image containing a bouquet of 2 sunflowers and 3 roses',
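)
EOF_TASK_ONE
#======================================================================
# The command below runs GenerateImage.py
#======================================================================
/usr/bin/python3 GenerateImage.py

#======================================================================
# Task #2 code: writes Gentext.py, which reads image.jpeg from Task #1
#======================================================================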
cat > Gentext.py <<EOF_TASK_TWO
import vertexai
from vertexai.generative_models import GenerativeModel, Part, Image

def analyze_bouquet_image(project_id: str, location: str) -> str:
  """Send image.jpeg to Gemini and return the generated birthday wishes."""
  vertexai.init(project=project_id, location=location)
  multimodal_model = GenerativeModel("gemini-2.0-flash-001")
  response = multimodal_model.generate_content(
    [
      # Image written by GenerateImage.py in Task #1
      Part.from_image(Image.load_from_file("image.jpeg")),
      "Generate birthday wishes based on this image?",
    ]
  )
  return response.text

project_id = "$PROJECT_ID"
location = "$REGION"

response = analyze_bouquet_image(project_id, location)
print(response)
EOF_TASK_TWO
#======================================================================
# The command below runs Gentext.py
#======================================================================
/usr/bin/python3 Gentext.py

-------------------------------------------------------------------------------------------------------------

Screenshot of the image (Task #1), the birthday wishes (Task #2), and the 100/100 score.

-------------------------------------------------------------------------------------------------------
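
Not part of the original solution: Gentext.py loads the image.jpeg that GenerateImage.py saves, so a small check like the one below can confirm the file is there before starting Task #2. This is only a sketch. The function name `image_is_ready` is made up for illustration, and the lab itself does not require this step.

```python
from pathlib import Path


def image_is_ready(path: str = "image.jpeg") -> bool:
    """Return True if the file exists and is not empty."""
    p = Path(path)
    return p.is_file() and p.stat().st_size > 0


if __name__ == "__main__":
    if image_is_ready():
        print("image.jpeg found, OK to run Gentext.py")
    else:
        print("image.jpeg is missing or empty, rerun GenerateImage.py first")
```

The check only looks at the file on disk. It does not verify that the content is a valid JPEG.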

Sriyaa

Thank you. It's working

RAOKS

Thank you for confirmation and happy to help you.

mahi2435

it is not working

RAOKS

The code is correct, and I tested it again today. The indentation changes after the code is posted in the community: every line ends up left-aligned, so running it gives an indentation error.

Solution: I have replaced the code above (previously posted as plain text) with the same code uploaded as Python code. Try it now. After running the code, wait a few minutes for Task #1 to complete, since it has to generate the image.
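
Since the indentation can get flattened when code is copied from the forum, one way to check a pasted script before running it is to compile it without executing it. The sketch below uses Python's built-in `compile()`. The helper name `find_syntax_error` is hypothetical and not something from the lab.

```python
from typing import Optional


def find_syntax_error(source: str, filename: str = "<pasted>") -> Optional[str]:
    """Compile the source without running it.

    Returns a short description of the first syntax or indentation
    error, or None if the source compiles cleanly.
    """
    try:
        compile(source, filename, "exec")
    except SyntaxError as err:  # IndentationError is a subclass of SyntaxError
        return f"{type(err).__name__} at line {err.lineno}: {err.msg}"
    return None


# A function whose body lost its indentation, as happens after a bad paste
flattened = "def f():\nreturn 1\n"
print(find_syntax_error(flattened))
```

The same kind of check is available from the terminal with `python3 -m py_compile GenerateImage.py`, which reports the error without running the script.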

mahi2435

this code is not working

Build a Multi-Modal GenAI Application: Challenge Lab (bb-ide-genai-004) solution code