CODE_VOLUME_INVALID_ARGUMENT: Batch job creation error while running with instance template

Hello Team,

I am trying to run a Cloud Batch job (container runnable) with the batch_v1 Python API and an instance template, and I am getting the error below. It seems to me that Batch cannot find the persistent disk, even though the disk is visible under Compute Engine disks.

As per the Volume class, the device name is a string, and I am passing it as below:

from google.cloud import batch_v1

resources = batch_v1.ComputeResource()
# The milliCPU count.
# cpuMilli defines the amount of CPU resources per task in milliCPU units.
# For example, 1000 corresponds to 1 vCPU per task.
resources.cpu_milli = 4000
resources.memory_mib = 16000
volume = batch_v1.Volume()
volume.device_name = "google-xxxxxxxx-gke-batch-dev-08ab6cad-7"
volume.mount_path = "/mnt/disks/batch"
volume.mount_options = ["rw", "async"]

Can someone help me debug this issue, or point out what I am missing here?

Error Description:

Job gets non-retryable information Batch Error: code - CODE_VOLUME_INVALID_ARGUMENT, description - when mounting device, the job abc-orch-202-580b1037-5e67-47330 in project 999999999999 cannot find valid PD info from vmSchedulingInfo map[group0:vm_schedulings:{vm:{machine_type:"e2-standard-4" cpu_milli:4000 memory_mib:16384 boot_disk:{new_disk:{type:"pd-standard" size_gb:200 disk_interface:"SCSI" image:"projects/xxxxxxxxx-xxxx-xxxx/global/images/xxxx-golden-rhel-8-2023-08-11t07-12-33z"} device_name:"abc-gke-batch-dev-08ab6cad-7"} network:{network_interfaces:{network:"https://www.googleapis.com/compute/v1/projects/xxxxxxxxx/global/networks/vpc-xxxx" subnetwork:"https://www.googleapis.com/compute/v1/projects/xxxxxxxxxregions/us-central1/subnetworks/xxxxxxx" no_external_ip_address:true}} instance_template:"abc-gke-batch-dev-08ab6cad-7"} task_pack:1}], which should not happen.

Thanks!

G

2 ACCEPTED SOLUTIONS

I think I figured out this issue. I removed the device name and specified the mount type, which is GCS, and that got rid of the error.
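As a rough illustration of that fix, here is a minimal sketch of a volume entry that names a Cloud Storage bucket as its source instead of a device name. It is written in the job's JSON form (camelCase field names from the Batch REST API); the Python client uses the snake_case equivalents (`volume.gcs.remote_path`, `volume.mount_path`). The helper `gcs_volume` and the bucket name `my-batch-bucket` are hypothetical and not taken from the original post.

```python
import json


def gcs_volume(bucket_name: str, mount_path: str) -> dict:
    """Return a Batch volume entry whose source is a Cloud Storage bucket.

    No deviceName is set: the bucket is the volume's only source.
    """
    return {
        "gcs": {"remotePath": bucket_name},
        "mountPath": mount_path,
    }


# "my-batch-bucket" is a placeholder; the mount path is the one from the post.
volume = gcs_volume("my-batch-bucket", "/mnt/disks/batch")

# The entry would go under taskGroups[].taskSpec.volumes in the job config.
print(json.dumps({"volumes": [volume]}, indent=2))
```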

Hi @gkadam2011,

You can also follow https://cloud.google.com/batch/docs/create-run-job-storage#use-persistent-disk if you want to mount with PD.

For example, the doc has a paragraph that says:

```For a job that uses persistent disks, this instance template must define and attach the persistent disks that you want the job to use. For this example, the template must define and attach a new persistent disk named NEW_PERSISTENT_DISK_NAME and attach an existing persistent disk named EXISTING_PERSISTENT_DISK_NAME.```

If you mount a persistent disk this way, the device name you use in your volume needs to match the device name you use in your instance template.
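The matching rule can be checked before submitting the job. The sketch below is a hypothetical helper (not part of the Batch API) that compares the device names used in the job's volumes against the device names defined in the instance template. The two sample values are the ones that appear in this thread; how you collect the template's device names is left as an assumption.

```python
def find_unattached_devices(volume_device_names, template_device_names):
    """Return the volume device names that the instance template does not define."""
    attached = set(template_device_names)
    return [name for name in volume_device_names if name not in attached]


# Device name set on the volume in the original post.
volume_devices = ["google-xxxxxxxx-gke-batch-dev-08ab6cad-7"]
# Device name reported for the instance template in the error message.
template_devices = ["abc-gke-batch-dev-08ab6cad-7"]

missing = find_unattached_devices(volume_devices, template_devices)
if missing:
    print("Not defined in the instance template:", missing)
```

An empty result means every device name in the volumes is also defined in the template.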

Thanks!

4 REPLIES

Hi @gkadam2011, another thing you can check is the volume device_name. It seems the configuration uses "google-xxxxxxxx-gke-batch-dev-08ab6cad-7", while the PD is "abc-gke-batch-dev-08ab6cad-7".

Thanks for using Batch!

Thanks @wenyhu

Yes, I figured it out. The "device_name" was not the culprit. The issue was that the Volume needs a mount type, GCS or NFS, which I was missing in my code.

Thanks!