I have the following volume mounting setup for my Batch jobs.
- host machine (my development server, used to submit the Batch job request): bucket_name -> mount_path (via gcsfuse)
- guest machine (the VM provisioned by Batch to actually run the task): bucket_name -> mount_path (via gcsfuse)
- Docker container (runs the task on the guest machine): mount_path -> mount_path (bind mount)
The reason for this setup is that the same command, echo "hello" > /mount_path/hello.txt (in reality the command is much more complicated and the use of a container much more justified; this is just an example), behaves identically whether I run it locally on the host machine or in Batch, and the file ends up written to GCS either way.
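For concreteness, here is a minimal sketch of what this amounts to on the host side (bucket_name, mount_path, and image_uri are the same placeholders as above; this assumes gcsfuse and Docker are installed):

    # mount the bucket on the host with gcsfuse
    mkdir -p /mount_path
    gcsfuse bucket_name /mount_path

    # run the same command in the container, bind-mounting the
    # host mount point at the identical path inside the container
    docker run -v /mount_path:/mount_path image_uri \
        bash -c 'echo "hello" > /mount_path/hello.txt'

Because the path is identical at every layer, the redirect writes to the same place no matter where the command runs.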
However, this creates a lot of overhead, because:
1. gcsfuse has to be run on every guest machine, and
2. the Docker container image has to be downloaded to the guest machine every time.
Is there a way to achieve the same effect while reducing this overhead? For example, could I keep a machine image where the bucket is already mounted via gcsfuse and the Docker image is already pulled (see my sketch after the config below)? My job config is as follows. Thanks!
"taskSpec": {
"runnables": [
{
"container": {
"imageUri": image_uri,
"commands": echo hello > mount_path/hello.txt,
"volumes": ["mount_path:mount_path"]
},
}
],
"volumes": [
{
"mountPath": mount_path,
"gcs": {"remotePath": bucket_name}
}
]
}
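To clarify the machine-image idea: here is roughly what I imagine it would look like, with a custom boot image specified under the job's allocationPolicy (the project and image names are placeholders; my understanding from the docs is that a custom image with gcsfuse set up and the Docker image pre-pulled would need to be derived from a Batch-supported base image):

    "allocationPolicy": {
        "instances": [
            {
                "policy": {
                    "bootDisk": {
                        "image": "projects/my-project/global/images/my-prebaked-image"
                    }
                }
            }
        ]
    }

The job itself would then be submitted as usual, e.g. with gcloud batch jobs submit my-job --location=us-central1 --config=job.json.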