I'm using model adapters, where each query might require one of thousands of adapters. I don't want to copy all the adapters to my endpoint, but rather download them at inference time. I'm looking for a faster storage solution than a Cloud Storage bucket, and was wondering if I could mount Filestore to a Vertex AI online prediction endpoint. Thanks!
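For context, my current per-request download looks roughly like this; the bucket name, adapter paths, and cache layout are all illustrative, not my actual setup:

```python
import os
from google.cloud import storage

ADAPTER_BUCKET = "my-adapter-bucket"  # illustrative bucket name
CACHE_DIR = "/tmp/adapter_cache"      # local cache so repeat queries skip the download

_client = storage.Client()

def fetch_adapter(adapter_id: str) -> str:
    """Download an adapter from Cloud Storage on first use and cache it locally."""
    local_path = os.path.join(CACHE_DIR, f"{adapter_id}.bin")
    if not os.path.exists(local_path):
        os.makedirs(CACHE_DIR, exist_ok=True)
        blob = _client.bucket(ADAPTER_BUCKET).blob(f"adapters/{adapter_id}.bin")
        blob.download_to_filename(local_path)
    return local_path
```

The cold-path download latency here is what I'm hoping a mounted Filestore share would eliminate.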
Hi, I am working on a similar use case. There is a working example of mounting an NFS share for a Vertex AI custom training job here (https://cloud.google.com/vertex-ai/docs/training/train-nfs-share); it's a shame we don't have an equivalent example for endpoints.
Having said that, I managed to create a Cloud Filestore instance with "Allow all" access and tried mounting it as part of a startup script in a custom container image, which I deployed to a private Vertex AI endpoint. So far I have not managed to get it working. The Filestore mount connects successfully from a VM, but not from the endpoint, and the logging doesn't help much.
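For anyone curious, the startup mount I attempted is roughly the sketch below. The Filestore IP, share name, and server command are placeholders, and the mount step is exactly the part that fails from the endpoint:

```python
import subprocess

FILESTORE_IP = "10.0.0.2"   # placeholder: Filestore instance IP
SHARE = "/adapters"         # placeholder: NFS share name
MOUNT_POINT = "/mnt/adapters"

def mount_filestore() -> None:
    """Mount the Filestore NFS share before the model server starts.
    Needs nfs-common in the image and sufficient container privileges,
    which may be what the endpoint environment does not allow."""
    subprocess.run(["mkdir", "-p", MOUNT_POINT], check=True)
    subprocess.run(
        ["mount", "-t", "nfs", "-o", "ro,nolock",
         f"{FILESTORE_IP}:{SHARE}", MOUNT_POINT],
        check=True,
    )

if __name__ == "__main__":
    mount_filestore()
    # then launch the actual prediction server, e.g.:
    # subprocess.run(["python", "server.py"], check=True)
```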
Based on the console and the API (https://cloud.google.com/sdk/gcloud/reference/ai/endpoints/create), you cannot attach Filestore to an endpoint directly. Depending on how you trained the model, though, if you used a custom prediction container and that container mounted Filestore, you could utilize it that way. I'm not sure it would give you the performance gain you're looking for.
- Use a custom container for prediction | Vertex AI | Google Cloud
- Custom container requirements for prediction | Vertex AI | Google Cloud
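For completeness, deploying a custom prediction container with the Vertex AI Python SDK looks roughly like this (project, image URI, routes, and machine type are placeholders); whether that container can then reach a Filestore share is the open question above:

```python
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # placeholders

# Upload a model backed by a custom serving container.
model = aiplatform.Model.upload(
    display_name="adapter-serving-model",
    serving_container_image_uri="us-central1-docker.pkg.dev/my-project/repo/server:latest",
    serving_container_predict_route="/predict",
    serving_container_health_route="/health",
    serving_container_ports=[8080],
)

# Deploy to an endpoint; note there is no Filestore/NFS option exposed here.
endpoint = model.deploy(machine_type="n1-standard-4")
```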