
Big issue/bug with the calculation of parallelTasksPerVM in GCP Batch

The documentation page https://cloud.google.com/batch/docs/create-run-job outlines how parallelTasksPerVM is determined. However, a big issue is that it is determined ONLY from CPU resources, WITHOUT taking memory into account. As a result, many of my tasks are killed with exit code 137 (out of memory). It took me a long while to figure out that this is the root cause of my task failures. Could the team help resolve this? Thanks in advance!



Hi @gradientopt,

Thanks for your feedback! We are tracking the request to factor memory into the parallelTasksPerVM calculation. We'll let you know once we have made progress.

Thanks!

Wenyan

Hi @gradientopt,

Several factors could go into the parallelTasksPerVM calculation, but at the moment we only use CPU. Sorry for the inconvenience. As Wenyan said, we will keep you posted on our progress in taking memory into consideration.

A short-term workaround is to specify taskCountPerNode explicitly in the job specification.
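As a rough sketch of how one might pick that value, the Python snippet below takes the smaller of the CPU-based and memory-based limits and places it in a task group fragment. The VM size (4 vCPUs, 16 GiB), the per-task requests, the 1 GiB memory reserve, and the helper names tasks_per_node and task_group are all hypothetical, chosen only for illustration. This is not Batch's internal formula, so adjust the numbers to your own machine type and workload.

```python
import json


def tasks_per_node(vm_cpu_milli, vm_memory_mib,
                   task_cpu_milli, task_memory_mib,
                   reserved_memory_mib=0):
    """Hypothetical helper: tasks per VM limited by both CPU and memory."""
    by_cpu = vm_cpu_milli // task_cpu_milli
    by_memory = (vm_memory_mib - reserved_memory_mib) // task_memory_mib
    # Never go below one task per VM, even if a single task barely fits.
    return max(1, min(by_cpu, by_memory))


def task_group(task_cpu_milli, task_memory_mib, task_count, per_node):
    """Build a task group fragment that sets taskCountPerNode explicitly."""
    return {
        "taskCount": task_count,
        "taskCountPerNode": per_node,
        "taskSpec": {
            "computeResource": {
                "cpuMilli": task_cpu_milli,
                "memoryMib": task_memory_mib,
            },
            # Placeholder runnable; replace with your own workload.
            "runnables": [{"script": {"text": "echo hello"}}],
        },
    }


# Assumed example: 4 vCPU / 16 GiB VM, each task needs 1 vCPU and 6 GiB,
# with 1 GiB held back for the OS and agent (an assumption, not a Batch rule).
per_node = tasks_per_node(4000, 16384, 1000, 6144, reserved_memory_mib=1024)
print(json.dumps(task_group(1000, 6144, 10, per_node), indent=2))
```

With these assumed numbers, a CPU-only calculation would allow 4 tasks per VM, while memory only leaves room for 2, which is the value the sketch writes into taskCountPerNode.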

Thanks!