Hi all,
Basically I'd like to know the best way to run a containerized data-processing task on GCP. So far I've tried both Cloud Run Services and Cloud Run Jobs, but both have their issues: with Services, the task's run time exceeds how long the container stays up, and Jobs don't take arguments when they run.
The scripts in question may run from 30 minutes to an hour, or even longer. It's mostly video processing, so they need as much compute power as they can get. I haven't found a good way to scale the number of containers up and down as needed while still being able to trigger them from an API. Is there a serverless way to do this, or any other good approach?
@JuliaFaltech
For high-compute processing we usually go with Compute Engine, which provides that kind of power through powerful CPUs and GPUs. Take a look at GCP Batch => https://cloud.google.com/batch/docs/create-run-job#api_2.
https://cloud.google.com/blog/products/compute/new-batch-service-processes-batch-jobs-on-google-clou...
Here you can specify the type of compute instance and a bash script that runs your code, for example:
git clone <your repo>
python main.py
...
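To make that concrete, here is a minimal sketch of submitting such a Batch job, assuming the gcloud CLI is set up; the repo URL, job name, region, and machine type are placeholders you would replace with your own values:

# job.json - Batch job config: pick a machine type and a script to run on it
cat > job.json <<'EOF'
{
  "taskGroups": [
    {
      "taskSpec": {
        "runnables": [
          { "script": { "text": "git clone https://example.com/your-repo.git && cd your-repo && python main.py" } }
        ]
      },
      "taskCount": 1
    }
  ],
  "allocationPolicy": {
    "instances": [
      { "policy": { "machineType": "e2-standard-8" } }
    ]
  },
  "logsPolicy": { "destination": "CLOUD_LOGGING" }
}
EOF

# Submit the job (job name and region are placeholders)
gcloud batch jobs submit video-processing-job --location=us-central1 --config=job.json

Batch also exposes this through its REST API and client libraries, so you could submit jobs on demand from your own API instead of the CLI.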
Scaling can be a problem here, since you have to predefine the number of instances.
This seems like it would work, thank you. I think it should be alright as long as I can scale to 0. I will be accepting this as a solution, thanks again.