I am new to Vertex AI and wanted to try it out for a Kaggle competition. I was able to get a GPU machine up and running, as well as download the data to it. The download script was generated automatically when I uploaded my notebook to Vertex AI. I ran the script, and five hours later all of the data had downloaded successfully to the boot disk (a 1000 GB standard persistent disk). I then ran a first iteration of my model and everything worked great. When I was done, I went back to GCP and stopped my VM, assuming all of my data would be saved. It was not!
I then started over, and once the data was on the machine I took a snapshot so I wouldn't have to redownload the data a third time. I then made some edits to my model and ran it again. Afterwards, I again stopped my VM so as not to leave it running. All of the data was lost again, though less surprisingly this time.
I thought a snapshot could be used as a backup of the original machine's disk, but the documentation makes it seem like it is only for creating a new VM from the boot disk. I did create a new machine from it, but I cannot figure out how to use it. I also looked for a way to create a new Vertex AI notebook from the disk snapshot, but that did not appear to be possible.
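As an aside, one way to check whether stopping the VM should actually wipe the disk is to inspect the instance's attached disks and their `autoDelete` flags (a sketch; `INSTANCE_NAME` and `ZONE` are placeholders for your own values):

```shell
# Show the disks attached to the instance, including each disk's
# autoDelete flag. autoDelete=true means the disk is removed when the
# *instance is deleted*; merely stopping a VM should not erase a
# persistent disk, so data loss on stop suggests the notebook backend
# recreated the instance rather than stopping it.
gcloud compute instances describe INSTANCE_NAME \
    --zone ZONE \
    --format="yaml(disks)"
```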
Questions:
# Snapshot the disk so the data can be restored later
gcloud compute snapshots create SNAPSHOT_NAME \
    --source-disk SOURCE_DISK \
    --source-disk-zone SOURCE_DISK_ZONE

# Once the snapshot exists, detach and delete the original data disk
gcloud compute instances detach-disk $INSTANCE_NAME --disk $DATA_DISK_NAME --zone $ZONE
gcloud compute disks delete $DATA_DISK_NAME --zone $ZONE
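To get the data back onto a VM later, a sketch of the restore path, reusing the variables above and a placeholder disk name `NEW_DATA_DISK`:

```shell
# Recreate a persistent disk from the snapshot
gcloud compute disks create NEW_DATA_DISK \
    --source-snapshot SNAPSHOT_NAME \
    --zone $ZONE

# Attach it to the (new or existing) VM as a secondary disk
gcloud compute instances attach-disk $INSTANCE_NAME \
    --disk NEW_DATA_DISK \
    --zone $ZONE

# Then, inside the VM, locate the device and mount it
# (the device name, e.g. /dev/sdb, may differ on your machine):
#   lsblk
#   sudo mkdir -p /mnt/data
#   sudo mount /dev/sdb /mnt/data
```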
Thanks, that is helpful, but I still do not understand why the data was deleted in the first place. It says it is a persistent disk.