
Best way to run a data processing task on GCP, with API

Hi all,

Basically, I would like to know the best way to run a containerized data-processing task on GCP. So far I've tried both Cloud Run Services and Cloud Run Jobs, but both have issues: with Services, the task's run time exceeds the time the container stays up, and Jobs don't take arguments at run time.

The scripts in question may run from 30 minutes to an hour, or even longer. It's mostly video processing, so they need as much compute power as they can get. I haven't found a good way to scale the number of containers up and down as needed while still being able to access them from an API. Is there a serverless way to do this? Or any other good methods?

1 ACCEPTED SOLUTION

RC1
Bronze 4

@JuliaFaltech 
For compute-heavy processing, we usually go with Compute Engine, which provides the needed compute power through powerful CPUs and GPUs. Could you take a look at GCP Batch? => https://cloud.google.com/batch/docs/create-run-job#api_2

https://cloud.google.com/blog/products/compute/new-batch-service-processes-batch-jobs-on-google-clou...
Here you can specify the type of compute instance and a bash script that runs your code, for example:

git clone your code
python main.py

...

etc.

Scaling can be a problem here, since you have to predefine the number of instances.
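
As a rough sketch of what this could look like, the Python below builds the JSON body for a Batch job whose script clones a repository and runs main.py with per-run arguments, which is the piece that was missing with Cloud Run Jobs. The field names follow the Batch REST API as described in the linked docs, as far as I can tell. The repository URL, bucket path, project, region, job ID, machine type, and the helper names build_batch_job and create_job_url are all placeholders or assumptions, and it assumes git and python are available on the VM image. It only builds the request; it does not send it.

```python
import json
import shlex

# Base URL of the Batch REST API.
BATCH_ENDPOINT = "https://batch.googleapis.com/v1"


def build_batch_job(repo_url, args, machine_type="e2-standard-4",
                    task_count=1, max_run_seconds=7200):
    """Build a Batch job body that clones a repo and runs main.py with args.

    All parameter names and defaults here are illustrative assumptions.
    """
    # Quote each argument so spaces and shell metacharacters stay intact.
    quoted_args = " ".join(shlex.quote(a) for a in args)
    script = "\n".join([
        "set -e",
        f"git clone {shlex.quote(repo_url)} app",
        "cd app",
        f"python main.py {quoted_args}",
    ])
    return {
        "taskGroups": [{
            "taskSpec": {
                "runnables": [{"script": {"text": script}}],
                # Upper bound on how long one task may run.
                "maxRunDuration": f"{max_run_seconds}s",
            },
            # Number of tasks is fixed when the job is created.
            "taskCount": task_count,
        }],
        "allocationPolicy": {
            "instances": [{"policy": {"machineType": machine_type}}],
        },
        "logsPolicy": {"destination": "CLOUD_LOGGING"},
    }


def create_job_url(project_id, location, job_id):
    """URL for the jobs.create call; the body above is POSTed to it."""
    return (f"{BATCH_ENDPOINT}/projects/{project_id}"
            f"/locations/{location}/jobs?job_id={job_id}")


if __name__ == "__main__":
    # Placeholder values; replace with your own repo, input, and project.
    job = build_batch_job("https://example.com/your/repo.git",
                          ["--input", "gs://your-bucket/video.mp4"])
    print(create_job_url("your-project", "us-central1", "video-job-1"))
    print(json.dumps(job, indent=2))
```

The body would then be sent as an authenticated POST to that URL. Since the arguments are baked into the script at submission time, each API call can start a job with different inputs.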

This seems like it would work, thank you. I think it should be alright as long as I can scale to 0. I will be accepting this as a solution, thanks again.