
Deploy to local endpoint Vertex AI - Health check never succeeds

Hello,

I am trying the deploy_to_local_endpoint function, but without success.

The first step is

from google.cloud.aiplatform.prediction import LocalModel
from src_dir.predictor import MyCustomPredictor
import os

local_model = LocalModel.build_cpr_model(
    LOCAL_SOURCE_DIR,
    f"{REGION}-docker.pkg.dev/{PROJECT_ID}/{REPOSITORY}/{IMAGE}",
    predictor=MyCustomPredictor,
    requirements_path=os.path.join(LOCAL_SOURCE_DIR, "requirements.txt"),
)

/opt/conda/lib/python3.10/subprocess.py:955: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
  self.stdin = io.open(p2cwrite, 'wb', bufsize)

This step runs successfully, although it emits the warning above.

Unfortunately, the second step fails. The request runs for about two minutes and then returns "The health check never succeeds", before execution even reaches the predict_response line.

with local_model.deploy_to_local_endpoint(
    artifact_uri="model_artifacts/",  # local path to artifacts
) as local_endpoint:
    predict_response = local_endpoint.predict(
        request_file="instances.json",
        headers={"Content-Type": "application/json"},
    )

    health_check_response = local_endpoint.run_health_check()

The model_artifacts/ folder contains a model.pkl file.
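One quick sanity check I can run outside the container (a sketch, assuming the predictor loads the artifact with pickle and that the health check only passes once load() completes): try to unpickle model.pkl directly, since a failure here would make the container's load() fail the same way.

```python
import pickle


def check_artifact(path):
    """Unpickle the model artifact the same way a pickle-based load() would.

    If this raises, the CPR server's startup would fail identically and the
    health check would never report ready.
    """
    with open(path, "rb") as f:
        return pickle.load(f)
```

In the notebook this would be called as, e.g., check_artifact("model_artifacts/model.pkl").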

Would appreciate your help!
