I have the following volume mounting setup for my Batch jobs.
- host machine (my development server, used to submit the Batch job request): bucket_name -> mount_path (via gcsfuse)
- guest machine (the VM provisioned by Batch to actually run the task): bucket_name -> mount_path (via gcsfuse)
- Docker container (runs the task on the guest machine): mount_path -> mount_path (bind mount)
The reason for this setup is that the same command, echo "hello" > /mount_path/hello.txt (in reality the command is much more complicated and the use of a container much more justified; this is just an example), behaves identically whether I run it locally on the host machine or in Batch, and the file ends up written to GCS either way.
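For concreteness, here is a minimal sketch of what this amounts to on the host side (bucket_name, mount_path, and image_uri are the same placeholders as above; this assumes gcsfuse and Docker are installed):

    # mount the bucket on the host with gcsfuse
    mkdir -p /mount_path
    gcsfuse bucket_name /mount_path

    # run the same command in the container, bind-mounting the
    # host mount point at the identical path inside the container
    docker run -v /mount_path:/mount_path image_uri \
        bash -c 'echo "hello" > /mount_path/hello.txt'

Because the path is identical at every layer, the redirect writes to the same place no matter where the command runs.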
However, this creates a lot of overhead, because:
1. gcsfuse has to be run on every guest machine, and
2. the Docker container image has to be downloaded to the guest machine every time.
Is there a way to achieve the same effect while reducing this overhead? For example, could I keep a machine image where the bucket is already mounted via gcsfuse and the Docker image is already pulled (see my sketch after the config below)? My job config is as follows. Thanks!
"taskSpec": {
"runnables": [
{
"container": {
"imageUri": image_uri,
"commands": echo hello > mount_path/hello.txt,
"volumes": ["mount_path:mount_path"]
},
}
],
"volumes": [
{
"mountPath": mount_path,
"gcs": {"remotePath": bucket_name}
}
]
}
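To clarify the machine-image idea: here is roughly what I imagine it would look like, with a custom boot image specified under the job's allocationPolicy (the project and image names are placeholders; my understanding from the docs is that a custom image with gcsfuse set up and the Docker image pre-pulled would need to be derived from a Batch-supported base image):

    "allocationPolicy": {
        "instances": [
            {
                "policy": {
                    "bootDisk": {
                        "image": "projects/my-project/global/images/my-prebaked-image"
                    }
                }
            }
        ]
    }

The job itself would then be submitted as usual, e.g. with gcloud batch jobs submit my-job --location=us-central1 --config=job.json.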