We have a Vertex AI public endpoint, and our custom model deployed to it successfully. But when we tried to deploy the same model to a private endpoint, it failed with the following error in the log.
"OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like meta-llama/Meta-Llama-3-8B is not the path to a directory containing a file named config.json."
"Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'."
It looks like the compute node has no internet access when the private endpoint is used. I used the following command to create the private endpoint.
gcloud beta ai endpoints create \
--display-name=ENDPOINT_DISPLAY_NAME \
--network=FULLY_QUALIFIED_NETWORK_NAME \
--region=REGION
Has anyone faced a similar issue, and is there a solution for it?
Hi @Brian_oozou,
Welcome to Google Cloud Community!
The error message indicates that the compute instances behind the Vertex AI private endpoint lack internet access, which prevents them from downloading the necessary model files from Hugging Face (meta-llama/Meta-Llama-3-8B). This is a common issue when deploying models to private environments where external access is restricted for security reasons. Here are some things you can consider to address the issue:
Check Network Configuration: Ensure that your private endpoint has the necessary network configurations to access the internet. You might need to configure Private Service Connect or VPC Network Peering.
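Pre-download and Upload Model Weights: Download the model files from Hugging Face in an environment that has internet access, upload them to a GCS bucket, and have the serving container load the model from that copy instead of from huggingface.co.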
Ensure that the service account used by your Vertex AI endpoint has the necessary permissions to read from the GCS bucket.
If you face any issues with permissions, you might need to adjust IAM roles or bucket permissions in the Google Cloud Console.
Use a Model Registry: Instead of downloading the model every time you deploy, consider using the Vertex AI Model Registry.
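As a rough illustration of the pre-download approach, here is a minimal sketch of how the serving code could load the model from local files only. It assumes the model files have already been copied from your GCS bucket into a directory inside the container. The `MODEL_DIR` environment variable, the `/models/Meta-Llama-3-8B` default path, and the helper names are hypothetical and only there for illustration; `HF_HUB_OFFLINE`, `TRANSFORMERS_OFFLINE`, and `local_files_only=True` are the Hugging Face offline-mode settings.

```python
import os
from pathlib import Path

# Hypothetical default location inside the serving container where the
# model files were copied (for example from a GCS bucket at startup).
DEFAULT_MODEL_DIR = "/models/Meta-Llama-3-8B"


def resolve_local_model_dir(model_dir=None):
    """Return a local directory that holds the model files, or fail early."""
    # MODEL_DIR is an assumed variable name, not something Vertex AI sets.
    path = Path(model_dir or os.environ.get("MODEL_DIR", DEFAULT_MODEL_DIR))
    if not (path / "config.json").is_file():
        raise FileNotFoundError(
            f"{path} has no config.json; copy the model files there first"
        )
    return str(path)


def enable_offline_mode():
    """Tell the Hugging Face libraries not to attempt network calls."""
    os.environ["HF_HUB_OFFLINE"] = "1"
    os.environ["TRANSFORMERS_OFFLINE"] = "1"


def load_model(model_dir=None):
    """Load the tokenizer and model from local files only."""
    enable_offline_mode()
    local_dir = resolve_local_model_dir(model_dir)
    # Imported here so the offline variables are set before transformers loads.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(local_dir, local_files_only=True)
    model = AutoModelForCausalLM.from_pretrained(local_dir, local_files_only=True)
    return tokenizer, model
```

Passing a local directory instead of the `meta-llama/Meta-Llama-3-8B` repo id means the library never needs to reach huggingface.co, and the early `config.json` check gives a clearer error than the `OSError` above if the copy step was skipped.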
I hope the above information is helpful.