Hi,
I have a Docker image that I am able to run using Batch, and it reads the resources it needs from a bucket mounted as a Cloud Storage volume, as in the example here: https://cloud.google.com/batch/docs/create-run-job-storage#use-bucket
The issue I have is the very high latency when reading and generating the intermediate files in the bucket: it literally takes more than 10 hours, whereas locally it takes ~2 hours to produce these files on the same machine type. I see that an option is to use persistent disks to reduce latency, but I am not sure how to connect/bind this new pd-disk while still being able to use the resources from the bucket. My intuition is to copy the resources needed from the bucket to the pd-disk, generate the intermediate files there, and finally copy the output back to a bucket?
Thanks in advance for any help!
Diego
Hi @dmontielg,
Welcome to Google Cloud Community!
Based on the documentation you provided, the Cloud Storage bucket is automatically mounted to your VM using Cloud Storage FUSE. One of its documented disadvantages is higher latency compared to a local file system, which likely explains the slowdown you are seeing.
If you prefer a persistent disk, you need to attach it to your VM first. You may also want to check the restrictions that apply to all persistent disks.
There are also other options like local SSD and network file storage that you may use for storage volumes.
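As a sketch of the copy-process-copy pattern you described, a Batch job definition along these lines attaches a new persistent disk, mounts it into the task, and stages data with `gcloud storage cp`. The bucket name (`MY_BUCKET`), disk size/type, and the `./pipeline` command are placeholders for your own values:

```json
{
  "allocationPolicy": {
    "instances": [{
      "policy": {
        "disks": [{
          "deviceName": "scratch",
          "newDisk": { "type": "pd-ssd", "sizeGb": 200 }
        }]
      }
    }]
  },
  "taskGroups": [{
    "taskSpec": {
      "volumes": [{
        "deviceName": "scratch",
        "mountPath": "/mnt/disks/scratch"
      }],
      "runnables": [{
        "script": {
          "text": "gcloud storage cp -r gs://MY_BUCKET/inputs /mnt/disks/scratch && ./pipeline --in /mnt/disks/scratch/inputs --out /mnt/disks/scratch/out && gcloud storage cp -r /mnt/disks/scratch/out gs://MY_BUCKET/outputs"
        }
      }]
    }
  }]
}
```

With this layout the intermediate files are written to the persistent disk rather than through Cloud Storage FUSE, and only the bulk copies at the start and end touch the bucket.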
Hope these help.