We have a Vertex AI public endpoint, and our custom model deploys to it successfully. But when we tried to deploy the same model to a private endpoint, it failed with the following error in the log:
"OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like meta-llama/Meta-Llama-3-8B is not the path to a directory containing a file named config.json."
"Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'."
It looks like the compute node has no internet access when a private endpoint is used. I used the following command to create the private endpoint:
gcloud beta ai endpoints create \
--display-name=ENDPOINT_DISPLAY_NAME \
--network=FULLY_QUALIFIED_NETWORK_NAME \
--region=REGION
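As far as I can tell, --network peers the endpoint into our VPC through private services access, so the model server only gets private connectivity and has no route to the public internet, which would explain why the download from huggingface.co fails. If baking the weights into the image makes it too large, I assume they could instead be staged in Cloud Storage and copied in at container startup over Google's private network, roughly like this (the bucket name is a placeholder):

# Stage the weights once from a machine that does have internet access.
gsutil -m cp -r ./Meta-Llama-3-8B gs://MY_BUCKET/models/Meta-Llama-3-8B

# In the container entrypoint, pull them down before starting the server.
gsutil -m cp -r gs://MY_BUCKET/models/Meta-Llama-3-8B /model

I haven't verified that this works from inside a private endpoint deployment, though.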
Has anyone faced a similar issue, and is there a solution for this?