
Error deploying Docker container on Vertex AI

Hi everyone,

I am trying to deploy a custom YOLO model on Vertex AI as a custom container. So far I have successfully:

Built and tested the Docker image locally.
Verified that the API (FastAPI + YOLO) runs correctly in the container.
Successfully deployed and tested the same image on Cloud Run.

However, when deploying on Vertex AI as a custom container model, I am facing issues.

Setup Details:

  • Base Image: pytorch/pytorch:2.2.0-cuda12.1-cudnn8-runtime (for GPU support)
  • Serving Framework: FastAPI + Uvicorn + Gunicorn
  • Hardware Target: n1-standard-4 with an NVIDIA Tesla P4 GPU
  • Docker CMD: The container runs startup.sh which loads models and starts the FastAPI server.
  • Inference Request Format: Expects base64 encoded images in application/json.
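For context, the request body is built roughly like this (the "image" field name is an assumption about my handler; Vertex AI itself only requires the outer "instances" list):

```python
import base64
import json

def build_request(image_bytes: bytes) -> str:
    """Wrap a base64-encoded image in the {"instances": [...]} envelope
    that Vertex AI's predict endpoint expects. The "image" key name is
    an assumption -- it must match whatever the FastAPI handler reads."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return json.dumps({"instances": [{"image": encoded}]})
```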

Deployment Steps:

  1. Built the Docker image locally and tested it with docker run.
  2. Pushed the image to Google Artifact Registry.
  3. Created a Vertex AI Model using:

      - name: Upload Mobile Model
        run: |
          EXISTING_MODEL=$(gcloud ai models list --region=$REGION --filter="displayName=freshpet-mobile" --format="value(name)" --limit=1)
          if [ -z "$EXISTING_MODEL" ]; then
            gcloud ai models upload --region=$REGION --display-name=freshpet-mobile --container-image-uri=$IMAGE_NAME_MOBILE
          else
            echo "Mobile model already exists, skipping upload."
          fi

      - name: Wait for Mobile Model to be Registered
        run: |
            timeout=$TIMEOUT_SECONDS
            start_time=$(date +%s)
            while true; do
              MODEL_MOBILE_NAME=$(gcloud ai models list --region=$REGION --filter="displayName=freshpet-mobile" --format="value(name)" --limit=1)
              if [ -n "$MODEL_MOBILE_NAME" ]; then
                echo "Model Registered: $MODEL_MOBILE_NAME"
                echo "MODEL_MOBILE_NAME=$MODEL_MOBILE_NAME" >> $GITHUB_ENV
                break
              fi
              if [ $(( $(date +%s) - start_time )) -gt $timeout ]; then
                echo "Timeout waiting for model registration." && exit 1
              fi
              sleep 10
            done
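One thing I suspect: the upload step above never tells Vertex AI which port and routes the container serves on. A variant of the step with the container flags might look like this (the port and route values are assumptions that would need to match my FastAPI server, not verified defaults):

```yaml
      - name: Upload Mobile Model
        run: |
          # The --container-* values below are assumptions; they must
          # match what the FastAPI server inside the image listens on.
          gcloud ai models upload \
            --region=$REGION \
            --display-name=freshpet-mobile \
            --container-image-uri=$IMAGE_NAME_MOBILE \
            --container-ports=8080 \
            --container-predict-route=/predict \
            --container-health-route=/health
```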

 

  4. Created an Endpoint and deployed the model.

      - name: Deploy Mobile Model to Endpoint
        run: |
          ENDPOINT_MOBILE_ID=$(gcloud ai endpoints list --region=$REGION --filter="displayName=freshpet-mobile-endpoint" --format="value(name)" --limit=1)
          if [ -z "$ENDPOINT_MOBILE_ID" ]; then
            ENDPOINT_MOBILE_ID=$(gcloud ai endpoints create --region=$REGION --display-name=freshpet-mobile-endpoint --format="value(name)")
          fi
          gcloud ai endpoints deploy-model $ENDPOINT_MOBILE_ID \
            --region=$REGION \
            --model=$MODEL_MOBILE_NAME \
            --display-name=mobile-container-deploy \
            --machine-type=$MACHINE_TYPE \
            --accelerator=count=1,type=$GPU_TYPE \
            --min-replica-count=$MIN_REPLICAS \
            --enable-access-logging \
            --autoscaling-metric-specs=$AUTOSCALING_METRIC

 

Issues Faced:

  • The model fails to load on Vertex AI (while it works perfectly on Cloud Run).
  • No logs appear in the Vertex AI endpoint, making debugging difficult.
  • (Attached screenshot: Screenshot from 2025-03-17 13-59-47.png)
  • When I try to send a request, I get 503 Service Unavailable or Container Failed to Start errors.
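Since the endpoint page shows nothing, I have been trying to pull the model server logs from Cloud Logging directly. Something like the query below should surface them (the resource-type filter is my best guess for prediction containers; adjust as needed):

```shell
# Prediction/model-server logs often land in Cloud Logging even when the
# Vertex AI endpoint UI shows nothing; filter on the Endpoint resource.
gcloud logging read \
  'resource.type="aiplatform.googleapis.com/Endpoint"' \
  --limit=100 \
  --format="value(timestamp, jsonPayload.message)"
```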

Questions:

  1. Does Vertex AI require additional configurations for FastAPI-based custom containers?
  2. How can I enable GPU support correctly inside the Vertex AI container?
  3. Is there a different logging mechanism I should use to debug why the container is failing?
  4. Are there specific health check requirements for Vertex AI containers?
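On question 4, my understanding is that Vertex AI injects AIP_HTTP_PORT, AIP_HEALTH_ROUTE, and AIP_PREDICT_ROUTE into the container and expects the server to honor them. A minimal sketch of reading them follows; the fallback values are placeholders for local docker runs, not Vertex AI's documented defaults:

```python
import os

def serving_config() -> dict:
    """Read the AIP_* variables Vertex AI injects into custom serving
    containers. The fallbacks here are placeholders for local runs only."""
    return {
        "port": int(os.environ.get("AIP_HTTP_PORT", "8080")),
        "health_route": os.environ.get("AIP_HEALTH_ROUTE", "/health"),
        "predict_route": os.environ.get("AIP_PREDICT_ROUTE", "/predict"),
    }
```

Gunicorn/Uvicorn would then need to bind to that port, e.g. `--bind 0.0.0.0:$AIP_HTTP_PORT` in startup.sh.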

Any help would be greatly appreciated! Thanks in advance. 🙏


Error email received from Vertex AI:

Hello Vertex AI Customer,

Due to an error, Vertex AI was unable to deploy model "freshpet-mobile@1".
Additional Details:
Operation State: Failed with errors
Resource Name: 
projects/1061052074258/locations/us-central1/models/5571072585824731136
Error Messages: Model server never became ready. Please validate that your 
model file or container configuration are valid. Model server logs can be 
found at