Solved: Scheduling Vertex AI Pipeline - Error 503

rubenszmm · 05-10-2023 11:19 AM

Hi,

I successfully trained and deployed a pipeline in Vertex AI using Kubeflow for a retrieval model, Two Towers.

Now I want to schedule this pipeline to run every 8 minutes. Here's my code:

from kfp.v2.google.client import AIPlatformClient

api_client = AIPlatformClient(project_id='my-project', region='us-central1')

# Create a Cloud Scheduler job that runs the compiled pipeline spec.
api_client.create_schedule_from_job_spec(
    job_spec_path='vacantes_pipeline.json',
    schedule="*/8 * * * *",  # every 8 minutes
    time_zone='America/Sao_Paulo',
    parameter_values={
        "epochs_": 5,
        "embed_length": 768,
        "maxsplit_": 130
    }
)
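
For reference, Cloud Scheduler reads the schedule as a five-field unix-cron string, and a step is written as `*/8`. The sketch below is a simplified, illustrative expansion of the minute field only. The function name `expand_minute_field` is hypothetical and not part of kfp or Cloud Scheduler; it just shows which minutes a given field would match.

```python
from typing import List


def expand_minute_field(field: str) -> List[int]:
    """Expand a unix-cron minute field into the sorted minutes it matches.

    Simplified sketch: supports "*", single numbers, ranges ("a-b"),
    steps ("*/n", "a-b/n") and comma-separated lists of those.
    """
    minutes = set()
    for part in field.split(","):
        base, slash, step_text = part.partition("/")
        # int("") raises ValueError, so a dangling "/" is rejected.
        step = int(step_text) if slash else 1
        if step < 1:
            raise ValueError(f"step must be positive: {part!r}")
        if base == "*":
            low, high = 0, 59
        elif "-" in base:
            low_text, high_text = base.split("-", 1)
            low, high = int(low_text), int(high_text)
        else:
            # An empty base (as in "/8") fails here with ValueError.
            low = high = int(base)
        if not 0 <= low <= high <= 59:
            raise ValueError(f"minute out of range: {part!r}")
        minutes.update(range(low, high + 1, step))
    return sorted(minutes)
```

With this sketch, `expand_minute_field("*/8")` gives minutes 0, 8, 16, 24, 32, 40, 48 and 56. The job then fires again at minute 0 of the next hour, so the last gap in each hour is 4 minutes rather than 8.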

The JSON is successfully created, but the Scheduler Job fails immediately.

Logging tells me the httpRequest has an error 503 plus:

jsonPayload: {
@type: "type.googleapis.com/google.cloud.scheduler.logging.AttemptFinished"
jobName: "projects/my-project/locations/us-central1/jobs/pipeline_vacantes-pipeline-with-deployment_c7e98a8f_59-14-a-a-a"
status: "UNAVAILABLE"
targetType: "HTTP"
url: "https://us-central1-bogotatrabaja.cloudfunctions.net/templated_http_request-v1"
}

Any ideas on how to solve this issue?

Roderick

Thanks so much for sharing! I am not a Vertex expert, but learning! Hoping others will weigh in, but this sounds like a permissions issue based on some other scenarios I've seen.

Here are some ideas on how to troubleshoot the issue of the Vertex AI Scheduler Job failing immediately with a 503 error:

Check the Cloud Logging logs. They give more detail about the error, such as which request is failing and the error message returned. The Scheduler job here targets a Cloud Function (templated_http_request-v1), so that function's logs are worth checking as well.
Make sure that the Vertex AI API is enabled. To enable it, go to the APIs & Services page in the Google Cloud console, search for Vertex AI, and click Enable.
Make sure that the service account has the correct permissions. The service account used to create and run the Scheduler jobs needs the right roles. To grant them, go to the IAM & Admin page in the console, find the service account in the list of principals, click Edit, and add a Cloud Scheduler role such as Cloud Scheduler Admin.
Check the Vertex AI quota, which may have been exceeded. Go to the Quotas page in the console and search for Vertex AI to see the current usage and the limit for each Vertex AI resource.
If the problem persists after trying all of the above, contact Google Cloud support.
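
To make the logging step concrete, here is a minimal sketch that builds Cloud Logging filter strings for the failing Scheduler job and for the Cloud Function it calls. The helper names are hypothetical. It assumes the Scheduler entries are logged under the `cloud_scheduler_job` resource type and that the function is a 1st gen Cloud Function logged under `cloud_function`; adjust if your project differs.

```python
def job_id_from_name(job_name: str) -> str:
    """Return the last path segment of a full Scheduler job name."""
    return job_name.rsplit("/", 1)[-1]


def scheduler_attempt_filter(job_id: str, status: str = "UNAVAILABLE") -> str:
    """Filter for Scheduler attempts of one job that ended with `status`."""
    return " AND ".join([
        'resource.type="cloud_scheduler_job"',
        f'resource.labels.job_id="{job_id}"',
        f'jsonPayload.status="{status}"',
    ])


def function_error_filter(function_name: str) -> str:
    """Filter for error-level entries from the target Cloud Function.

    Assumes a 1st gen function, logged as resource type cloud_function.
    """
    return " AND ".join([
        'resource.type="cloud_function"',
        f'resource.labels.function_name="{function_name}"',
        "severity>=ERROR",
    ])


if __name__ == "__main__":
    # Values copied from the log entry in the question.
    job_name = (
        "projects/my-project/locations/us-central1/jobs/"
        "pipeline_vacantes-pipeline-with-deployment_c7e98a8f_59-14-a-a-a"
    )
    print(scheduler_attempt_filter(job_id_from_name(job_name)))
    print(function_error_filter("templated_http_request-v1"))
```

Either string can be pasted into the Logs Explorer query box or passed to `gcloud logging read`. The function's own error entries are usually more informative than the Scheduler's UNAVAILABLE status.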

I hope this helps!

rubenszmm

I solved it with Compute Engine and cron jobs.
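
The details of that setup were not posted, so what follows is only a rough sketch of how such a workaround could look, not the actual solution. It assumes a VM with the google-cloud-aiplatform package installed and a service account allowed to run Vertex AI pipelines. It submits the compiled spec with that SDK's `PipelineJob` instead of the `AIPlatformClient` from the question. The display name and the file paths are hypothetical.

```python
import sys

# Values taken from the question.
PROJECT_ID = "my-project"
REGION = "us-central1"
TEMPLATE_PATH = "vacantes_pipeline.json"
PARAMETER_VALUES = {"epochs_": 5, "embed_length": 768, "maxsplit_": 130}


def crontab_line(every_minutes: int, python_path: str,
                 script_path: str, log_path: str) -> str:
    """Build a crontab entry that runs this script with --submit."""
    if not 1 <= every_minutes <= 59:
        raise ValueError("every_minutes must be between 1 and 59")
    return (f"*/{every_minutes} * * * * {python_path} {script_path} "
            f"--submit >> {log_path} 2>&1")


def submit_pipeline() -> None:
    """Submit one pipeline run without waiting for it to finish."""
    # Imported here so the rest of the script works without the SDK.
    from google.cloud import aiplatform

    aiplatform.init(project=PROJECT_ID, location=REGION)
    job = aiplatform.PipelineJob(
        display_name="vacantes-pipeline",  # hypothetical name
        template_path=TEMPLATE_PATH,
        parameter_values=PARAMETER_VALUES,
    )
    job.submit()


if __name__ == "__main__":
    if "--submit" in sys.argv[1:]:
        submit_pipeline()
    else:
        # Print an entry to paste into `crontab -e` (paths are hypothetical).
        print(crontab_line(8, "/usr/bin/python3",
                           "/opt/pipeline/run_pipeline.py",
                           "/var/log/vacantes_pipeline.log"))
```

Run without arguments, the script prints the crontab entry. Cron then calls it with `--submit` on the schedule, and the log file collects any errors from the submission.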