Hi,
I am trying to deploy a simple model on an endpoint in order to start making predictions.
I followed these steps:
1. Create the model, build a Docker image, and push it to Artifact Registry
2. Upload the model to Vertex AI:
! gcloud ai models upload --container-image-uri=<REGION>-docker.pkg.dev/<PROJECT>/<REPOS>/<MODEL>:latest --region=<REGION> --display-name=my-model
3. Create an endpoint on Vertex AI:
! gcloud ai endpoints create --display-name=my-model-endpoint --region=<REGION>
4. Deploy the model to the endpoint:
! gcloud ai endpoints deploy-model <ENDPOINT_ID> --model <MODEL_ID> --display-name my-model --traffic-split=0=100 --region <REGION> --machine-type=n1-standard-8 --enable-access-logging
Here's the message:
Model server terminated: model server container terminated: exit_code: 0 reason: "Completed" started_at { seconds: 1684309447 } finished_at { seconds: 1684309449 } . Model server logs can be found at xxxxx
When I check the logs, I see no errors.
What I tried:
- increased the machine type
- deployed with a python script
- deployed manually using the portal
- changed my model to a simple linear regression to test
Good day @zied_gobji,
Welcome to Google Cloud Community!
Several things can cause this issue: the model may not have been deployed correctly, or the endpoint may not have been created. You can try the following:
1. Create a new endpoint and try deploying the model there. Also verify that the model exists.
2. Please note that when you create a model resource, the size of the artifacts that you've specified in the artifactUri must be 10 GB or less. You can check this link for more information: https://cloud.google.com/vertex-ai/docs/training/exporting-model-artifacts#maximum_model_size
3. Also make sure that your container meets the custom container requirements. You can check this link for more information: https://cloud.google.com/vertex-ai/docs/predictions/custom-container-requirements#variables
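On point 3, one thing worth checking: an exit code of 0 with reason "Completed" about two seconds after start (1684309447 to 1684309449) suggests that the container's main process simply ran to the end and exited, rather than staying up as an HTTP server. The container requirements expect a long-running server that listens on the port given in AIP_HTTP_PORT and answers on the routes given in AIP_HEALTH_ROUTE and AIP_PREDICT_ROUTE. Below is a minimal sketch of such a server using only the Python standard library. It is not your actual serving code: the predict function is a hypothetical stand-in for a model, the names make_server and serve are made up for illustration, and the fallback values for the port and routes are assumptions meant for local testing only.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Vertex AI sets these variables inside the container. The fallback
# values are assumptions for local testing only.
PORT = int(os.environ.get("AIP_HTTP_PORT", "8080"))
HEALTH_ROUTE = os.environ.get("AIP_HEALTH_ROUTE", "/health")
PREDICT_ROUTE = os.environ.get("AIP_PREDICT_ROUTE", "/predict")


def predict(instances):
    """Hypothetical stand-in for a real model: y = 2x + 1 per instance."""
    return [2 * x + 1 for x in instances]


class Handler(BaseHTTPRequestHandler):
    def _send_json(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # Health check: answer 200 once the server can take requests.
        if self.path == HEALTH_ROUTE:
            self._send_json(200, {"status": "ok"})
        else:
            self._send_json(404, {"error": "not found"})

    def do_POST(self):
        if self.path != PREDICT_ROUTE:
            self._send_json(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(length) or b"{}")
        # Request body: {"instances": [...]}; response: {"predictions": [...]}
        predictions = predict(request.get("instances", []))
        self._send_json(200, {"predictions": predictions})


def make_server(port=PORT):
    # Bind to all interfaces so the platform can reach the server.
    return ThreadingHTTPServer(("0.0.0.0", port), Handler)


def serve():
    # Blocks forever; the container's entrypoint should end up here
    # so that the process does not exit with code 0.
    make_server().serve_forever()
```

The container's entrypoint would need to call serve() (or start an equivalent web server) as its last step. If the image's command only trains or loads the model and then returns, the container finishes cleanly and the deployment reports "Completed" with no errors in the logs, which matches what you are seeing.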
I also strongly suggest reaching out to Google Cloud Support for further investigation of your case: https://cloud.google.com/support
Hope this helps!