H100 (a3-highgpu) instances and LocalSSD

Are there any known bugs with mounting local SSDs on H100 (a3-highgpu family) instances? I create my Batch jobs using the Python SDK, and typically I create a batch_v1.AllocationPolicy.Disk() object configured with type_="local-ssd" and size_gb set to the total size of the local SSD disks. On the a3-highgpu family, local SSDs are provisioned automatically, so I set size_gb to whatever is automatically provisioned when the instance spins up.
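To make the setup concrete, here is a minimal sketch of the job body in plain Python, written as the JSON shape of a Batch job (the camelCase counterparts of the SDK fields) so it runs without the client library. The helper names, the device name "local-ssd-0", the mount path "/mnt/disks/ssd", and the 6000 GB size are assumptions for illustration, not values from my actual job. The second helper checks that every volume's deviceName matches an attached disk, since that name is what links a disk to its mount point.

```python
def build_local_ssd_job(machine_type, size_gb, device_name, mount_path):
    """Build a Batch job body with one local-ssd disk and one volume.

    Hypothetical helper: device_name and mount_path are illustrative.
    """
    return {
        "taskGroups": [{
            "taskCount": 1,
            "taskSpec": {
                # Print the mount so the task log shows whether it exists.
                "runnables": [{"script": {"text": f"df -h {mount_path}"}}],
                # The volume refers to the disk by deviceName.
                "volumes": [{"deviceName": device_name,
                             "mountPath": mount_path}],
            },
        }],
        "allocationPolicy": {
            "instances": [{
                "policy": {
                    "machineType": machine_type,
                    "disks": [{
                        "newDisk": {"type": "local-ssd", "sizeGb": size_gb},
                        "deviceName": device_name,
                    }],
                },
            }],
        },
        "logsPolicy": {"destination": "CLOUD_LOGGING"},
    }


def unmatched_volumes(job):
    """Return volume device names that no attached disk declares."""
    attached = {
        disk["deviceName"]
        for inst in job["allocationPolicy"]["instances"]
        for disk in inst["policy"].get("disks", [])
    }
    return [
        vol["deviceName"]
        for group in job["taskGroups"]
        for vol in group["taskSpec"].get("volumes", [])
        if vol.get("deviceName") not in attached
    ]


# Assumed size: adjust to whatever the instance actually provisions.
job = build_local_ssd_job("a3-highgpu-8g", 6000,
                          "local-ssd-0", "/mnt/disks/ssd")
print(unmatched_volumes(job))
```

An empty list here only rules out a naming mismatch in the job definition; it says nothing about what happens on the VM itself.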

My problem is that on a3-highgpu specifically, the disks are attached as expected but are not mounted at my chosen mount point. On a similar machine family, a2-ultragpu (which also attaches local SSDs automatically), simply creating the AllocationPolicy.Disk() object above is enough: the instances automatically RAID the local SSD drives and mount them at the expected mount point.

Is this something you've run into before? I've been comparing the output of some test runs, and the only difference I can see is that on a3-highgpu the boot disk is automatically allocated as /dev/nvme0n1, whereas on a2-ultragpu it is allocated as /dev/sda. Maybe that is the problem: local SSDs are all provisioned as nvmeXn1, so perhaps the boot disk sharing that naming scheme is causing some small problem in the background?
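For anyone who wants to run the same comparison, here is a small sketch that summarizes `lsblk -J -o NAME,TYPE,MOUNTPOINT` output into the boot disk, the mounted disks, and the disks with no mount point at all. The function names and the SAMPLE JSON are hypothetical (the sample imitates the a3-highgpu layout described above and is not real output from my instances).

```python
import json


def _mountpoints(dev):
    """Collect mount points of a device and all of its children."""
    points = []
    if dev.get("mountpoint"):
        points.append(dev["mountpoint"])
    for child in dev.get("children", []):
        points.extend(_mountpoints(child))
    return points


def summarize_disks(lsblk_json):
    """Classify whole disks as boot, mounted, or unmounted."""
    tree = json.loads(lsblk_json)
    summary = {"boot": None, "mounted": {}, "unmounted": []}
    for dev in tree.get("blockdevices", []):
        if dev.get("type") != "disk":
            continue
        points = _mountpoints(dev)
        if "/" in points:
            # The disk holding the root filesystem is the boot disk.
            summary["boot"] = dev["name"]
        elif points:
            summary["mounted"][dev["name"]] = points
        else:
            summary["unmounted"].append(dev["name"])
    return summary


# Hypothetical sample: an NVMe boot disk plus two unmounted local SSDs.
SAMPLE = json.dumps({
    "blockdevices": [
        {"name": "nvme0n1", "type": "disk", "mountpoint": None,
         "children": [
             {"name": "nvme0n1p1", "type": "part", "mountpoint": "/"},
             {"name": "nvme0n1p15", "type": "part",
              "mountpoint": "/boot/efi"},
         ]},
        {"name": "nvme1n1", "type": "disk", "mountpoint": None},
        {"name": "nvme2n1", "type": "disk", "mountpoint": None},
    ]
})

print(summarize_disks(SAMPLE))
```

On a real VM, the input would come from running `lsblk -J -o NAME,TYPE,MOUNTPOINT` inside the task and passing its output to `summarize_disks`. A RAID array built from the local SSDs would show up as a child device, so those disks would land under "mounted" rather than "unmounted".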

Hi @thessjacob,

Welcome to Google Cloud Community!

According to the Class Disk (0.17.32) reference, local SSDs are available through both "SCSI" and "NVMe" interfaces. If the disk_interface field is not set, "NVMe" is the default for local SSDs. The field is ignored for persistent disks, because their interface is chosen automatically. See interface types.

Based on the Machine series comparison, the A3 series supports only NVMe disks, whereas the A2 series supports both SCSI and NVMe disks. For reference, you can also view the NVMe device names.

If you have any more concerns or questions, you may reach out to Google Cloud Support for further assistance. 

Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.