Store in SSD

Hello everyone!

I would like to know how to store a dataset on an SSD when using a GCP VM, and how to access that dataset from my code (a Python program). Also, how do I download a file onto a VM using the terminal? I am new to this, so any help would be immensely appreciated.

Also, the documentation says that data stored on a Local SSD is lost once the VM is stopped or deleted, so where exactly is my data (programs and all) saved?

Thank you,

Pavithra

Solved
1 ACCEPTED SOLUTION

Hi there,

Depending on the performance characteristics you are looking for, as well as the data-persistence guarantees, there are a few options. "Local SSD" is one, but there are others.

Have a look at this page on disks: https://cloud.google.com/compute/docs/disks

These are the disks that provide block storage attached to your VM - you will already be using one to run the VM, and you can attach others. For SSD specifically you have a few options. First, there are "Persistent Disks" (PD). These, as the name suggests, are persistent: you can stop and start a VM and the data will survive. You have several SSD options for PD:

"Balanced" persistent disks provide, as the name suggests, an SSD backed storage option that has a balance of price and performance.

"SSD" persistent disks take it a step up and provide better performance but that comes at a higher cost.

There is "Extreme" persistent disks, also SSD backed, but these are really for specific use cases and require some specific VM configurations too.

If, however, the use of the storage is temporary or ephemeral - for example, a high-speed cache or similar - you could also look at the "Local SSD" option you mention. This provides very high-speed, locally attached SSD storage, but your application needs to be prepared for the fact that if the VM is stopped or terminated for any reason, all the data will be lost. This is useful for very specific application scenarios; if you need the data to persist, you will need to take steps to safeguard it, and perhaps look at Persistent Disks instead.

In terms of accessing the files: these disks are all attached as local drives to the VM, so in Python you can access data stored on them like any other file in the file system.
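As a quick sketch of that point: the mount path below is an assumption (attached disks are often mounted somewhere like /mnt/disks/data - use whatever path you chose when you mounted yours), and the example falls back to a temporary directory so it runs anywhere. Reading a dataset from an attached disk is ordinary file I/O:

```python
import os
import tempfile
from pathlib import Path

# Hypothetical mount point for an attached disk; set DATA_DIR in the
# environment to your real mount path (e.g. /mnt/disks/data). We fall
# back to a temporary directory so the sketch runs anywhere.
DATA_DIR = Path(os.environ.get("DATA_DIR", tempfile.mkdtemp()))

# Write a tiny sample dataset so the example is self-contained.
sample = DATA_DIR / "dataset.csv"
sample.write_text("id,value\n1,10\n2,20\n")

# Reading it back is ordinary file I/O - no special API is needed just
# because the bytes happen to live on an attached SSD.
lines = sample.read_text().strip().splitlines()
header, records = lines[0], lines[1:]
print(header)        # id,value
print(len(records))  # 2
```

The same applies to libraries such as pandas or TensorFlow: point them at a path under the mount point and they read it like any local file.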

If you want to download a file from the terminal - assuming you are using a Linux-based OS rather than Windows - the easiest thing would be "scp". Look here for some guidance on using "scp" as part of our gcloud CLI tool: https://cloud.google.com/sdk/gcloud/reference/compute/scp

This third-party article also provides some more background and guidance on "scp" and its use, so it is worth a read: https://www.freecodecamp.org/news/scp-linux-command-example-how-to-ssh-file-transfer-from-remote-to-...
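As a rough sketch of what that can look like in practice - the instance name, zone, and file paths below are placeholders, not values from this thread:

```shell
# Copy a local file up to the VM's home directory.
gcloud compute scp ~/localtest.txt my-instance:~/ --zone=us-central1-a

# Copy (download) a file from the VM back to the local machine.
gcloud compute scp my-instance:~/results.txt ~/Downloads/ --zone=us-central1-a
```

Note that gcloud wraps scp for you, handling SSH keys and the instance's external IP, which is why it is usually easier than raw scp here.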

Hope that helps,

Alex


3 REPLIES

Thank you, this has given me a very good understanding.

I would like to clarify another thing.

I tried copying a file from my local (Ubuntu) machine to a virtual machine using the command:

gcloud compute scp ~/localtest.txt ~/localtest2.txt example-instance:~/narnia

But I am getting the following error:

Command 'gcloud' not found.

I tried to install it using

sudo snap install google-cloud-cli

And I got the following message:

error: This revision of snap "google-cloud-cli" was published using classic confinement and thus
may perform arbitrary system changes outside of the security sandbox that snaps are usually
confined to, which may put your system at risk.

If you understand and want to proceed repeat the command including --classic.

So, can I proceed with the installation using the command:

sudo snap install google-cloud-cli --classic

Thank you,

Pavithra

Hi,

The best reference for installing the CLI tooling is this one: https://cloud.google.com/sdk/docs/install

Hope that helps,

Alex