Hey!
Context: I followed this official GCP tutorial to run a custom model. Following its steps, I containerized my Python code (Artifact Registry) and was then able to run it as a custom training job (Vertex AI > Training > Custom Jobs).
Problem: Once this custom job finished running, I could not find any way to re-trigger it. It seems that the only way is to recreate it, which is not efficient. I have also tried using Pipelines, but it seems that a one-off pipeline can only be run once, and I am not looking for a scheduled pipeline.
General question: I am only looking for a way to train custom ML models on a specified amount of compute (with GPUs) and to trigger the training through the GCP interface whenever I want. What is the recommended way to do so? Ideally using as few tools as possible.
Thanks a lot for your help!