
Batch does not seem to scale down idle instances while the last few tasks are running

I'm running a Batch job on workloads that each process for a few hours. There are around 50 such tasks, and each is independent. When I run these tasks in parallel, I find that at the tail end of the workload some instances are left doing nothing.

 

For example, I have 6 tasks running and 44 completed, yet I have 8 instances running (my quota maximum). It seems to me that the extra instances should be shut down, as there are no more tasks to assign to them. But this does not seem to be the case.
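
To make the numbers concrete, here is a minimal Python sketch of the idle-instance count. It assumes each instance runs one task at a time, which is not stated above; the function name idle_instances and its parameters are hypothetical and not part of any Batch API.

```python
import math


def idle_instances(running_instances, running_tasks, tasks_per_instance=1):
    """Estimate how many instances have no task assigned.

    Assumption (not from the Batch API): tasks are packed onto instances
    with at most `tasks_per_instance` tasks each.
    """
    if tasks_per_instance < 1:
        raise ValueError("tasks_per_instance must be at least 1")
    # Instances needed to host the tasks that are still running.
    busy = math.ceil(running_tasks / tasks_per_instance)
    # Never report a negative number of idle instances.
    return max(0, running_instances - busy)


# 8 instances are up, but only 6 tasks are still running.
print(idle_instances(running_instances=8, running_tasks=6))  # prints 2
```

Under that assumption, 2 of the 8 instances have nothing left to do.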


Hi @LinearParadox1,

Welcome to Google Cloud Community! 

I recommend using autoscaling on the Managed Instance Group.

Autoscaling in GCP Batch is primarily managed through the associated Managed Instance Group (MIG). You need to define scaling policies that respond to the workload. The core strategy is to scale down when CPU utilization is low, which indicates idle VMs. See the gcloud CLI reference for the gcloud compute instance-groups managed commands.
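
As a rough illustration of the scale-down idea above, here is a minimal Python sketch of how a CPU-utilization target can translate into a smaller group size. It is a simplified model, not the actual Compute Engine autoscaler algorithm; the function name recommended_size, its parameters, and the utilization figures are all made up for illustration.

```python
import math


def recommended_size(cpu_utilizations, target_utilization, min_size=0):
    """Suggest a group size that brings average CPU use up to the target.

    cpu_utilizations: per-VM CPU utilization as fractions between 0 and 1.
    target_utilization: desired average utilization, e.g. 0.75.
    This is a simplified model for illustration only.
    """
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    # Total work currently being done, measured in fully busy VMs.
    total_load = sum(cpu_utilizations)
    # Smallest number of VMs that can carry that load at the target level.
    needed = math.ceil(total_load / target_utilization)
    return max(min_size, needed)


# Hypothetical figures: 6 VMs busy at 75% CPU and 2 VMs completely idle.
vms = [0.75] * 6 + [0.0] * 2
print(recommended_size(vms, target_utilization=0.75))  # prints 6
```

In this model the two idle VMs pull the group average below the target, so the suggested size drops from 8 to 6. How closely a real MIG follows this depends on the autoscaling policy that is actually configured.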

For additional guidance, explore the following documentation:

  • Batch Overview
  • Create and run a basic Batch job - while this doesn't directly cover auto-scaling, it provides context for setting up Batch jobs and environments.
  • Cloud Monitoring Overview - If you want to explore custom metrics

    If you need further assistance or have any questions, please reach out to our Google Cloud Support team.


    Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.

If you are using GCP Batch, could you provide the job UID for us to take a look?