Hello GCP community, I have a question. I am training on Vertex AI with a custom container while porting pipelines from Kubeflow to Vertex, and I am using the following code to train:
from google.cloud import aiplatform

job = aiplatform.CustomContainerTrainingJob(
    display_name="training-job",
    container_uri=container_uri,
)

# define training code arguments
training_args = ["--num-epochs", "2"]

model = job.run(
    replica_count=1,
    machine_type="n1-standard-8",
    accelerator_type="NVIDIA_TESLA_V100",
    accelerator_count=1,
    args=training_args,
    sync=False,
)
It looks OK, but here is my question: is there any way to run this training on a Spot machine to reduce my training costs?
Thanks!
Hi David
No, unfortunately Vertex AI does not support Spot / preemptible instances.
Is there a way to attach persistent disks when submitting a custom training job using Vertex AI?