
Seeing new error mounting GCS bucket on Google Cloud Batch

I've been using Google Cloud Batch for a couple of weeks and have had no real problems adopting it for a number of workloads. However, starting today, I have been getting errors like this when mounting a GCS bucket:

Command error:
mkdir: cannot create directory '/mnt/MY-BUCKET': Read-only file system

I looked at log entries from the same types of jobs yesterday and did not see the same error, despite the same mount command being in the logs. Has anyone else seen a change in behavior? Any suggestions?

Added detail: this error appears to affect only container jobs, not script jobs, at least based on some quick testing.
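For context, the mount configuration I'm using looks roughly like this; the image and bucket name here are just placeholders, not my actual job:

{
    "taskGroups": [
        {
            "taskSpec": {
                "runnables": [
                    {
                        "container": {
                            "imageUri": "ubuntu",
                            "entrypoint": "/bin/sh",
                            "commands": ["-c", "ls /mnt/MY-BUCKET"]
                        }
                    }
                ],
                "volumes": [
                    {
                        "gcs": {
                            "remotePath": "MY-BUCKET"
                        },
                        "mountPath": "/mnt/MY-BUCKET"
                    }
                ]
            }
        }
    ]
}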

ACCEPTED SOLUTION

There was a recent change to Batch in which Container-Optimized OS (COS) is used for container-only jobs. We are in the process of updating the documentation to reflect this change. In the meantime, the workaround is to mount the GCS bucket under a writable path in COS, e.g. "/mnt/disks/share" instead of "/mnt/share". We'll reply to this thread once the documentation is updated.
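For example, the volumes entry in the taskSpec would look roughly like this (the bucket name is a placeholder):

    "volumes": [
        {
            "gcs": {
                "remotePath": "MY-BUCKET"
            },
            "mountPath": "/mnt/disks/MY-BUCKET"
        }
    ]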


REPLIES

Hi @seandavi - can you send details on your job and the job UID to gcp-batch-preview@google.com? We can look into this to gather more details.


Hey, Shamel. Is `/mnt/disks/share` still required when mounting a disk for a GCP Batch run?

Hi,

I can't access my storage bucket data using that writable path; I get "ls: cannot access '/mnt/disks': No such file or directory". I am running a task with a GPU; could this be a conflict with the volumes used by the container?

{
    "taskGroups": [
        {
            "taskSpec": {
                "computeResource": {
                    "cpuMilli": "500",
                    "memoryMib": "500"
                },
                "runnables": [
                    {
                        "container": {
                            "imageUri": "tensorflow/tensorflow:2.11.0",
                            "commands": ["-c", "echo $(ls /mnt/disks)"],
                            "entrypoint": "/bin/sh",
                            "volumes": [
                                "/var/lib/nvidia/lib64:/usr/local/nvidia/lib64",
                                "/var/lib/nvidia/bin:/usr/local/nvidia/bin"
                            ],
                            "options": "--privileged"
                        }
                    }
                ],
                "volumes": [
                    {
                        "gcs": {
                            "remotePath": "MYBUCKET"
                        },
                        "mountPath": "/mnt/disks/MYBUCKET"
                    }
                ]
            },
            "taskCount": 1
        }
    ],
    "allocationPolicy": {
        "instances": [
            {
                "installGpuDrivers": true,
                "policy": {
                    "machineType": "n1-standard-2",
                    "accelerators": [
                        {
                            "type": "nvidia-tesla-t4",
                            "count": 1
                        }
                    ]
                }
            }
        ]
    },
    "logsPolicy": {
        "destination": "CLOUD_LOGGING"
    }
}

I think I found the reason: the bucket is mounted at the VM level, so the container has no access to it. Adding it as a volume to the container solves the problem.

@pabjusae Mounting the path from the VM into the container will solve the issue. Note that Batch can automatically mount the bucket into the container at the same path as on the host VM, but it only does that if the container's "volumes" field is empty.
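For example, with the job config above, adding the bucket's host mount path to the container's "volumes" list (alongside the NVIDIA paths) should make it visible inside the container; the bucket name below is a placeholder:

    "container": {
        "imageUri": "tensorflow/tensorflow:2.11.0",
        "commands": ["-c", "echo $(ls /mnt/disks)"],
        "entrypoint": "/bin/sh",
        "volumes": [
            "/var/lib/nvidia/lib64:/usr/local/nvidia/lib64",
            "/var/lib/nvidia/bin:/usr/local/nvidia/bin",
            "/mnt/disks/MYBUCKET:/mnt/disks/MYBUCKET"
        ],
        "options": "--privileged"
    }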