
MIG does not scale based on CPU utilization

Over the past few days, our MIG has stopped scaling, leaving instances overloaded on CPU at peak times and underutilized during low-traffic periods.

The issue only occurs when `Instances as predicted` goes to 0

We are seeing the same issue in both our staging and production environments.

Is there an ongoing issue on the GCP side?

Production:

KhooHaoYit_0-1745919167577.png

Staging:

KhooHaoYit_0-1745921143832.png

 


Hi @KhooHaoYit,

Welcome to Google Cloud Community!

Here are some basic troubleshooting steps you can follow:

  1. Adjust Predictive Autoscaling Settings:

The quickest and simplest fix is to disable Predictive Autoscaling in the instance group settings for both the staging and production environments via the Google Cloud Console. When the predicted instance count drops to zero, the autoscaler may scale the group in too far and fail to anticipate sudden traffic spikes, which can overload the instances that remain. To avoid this, review and adjust your predictive autoscaling settings so they better match your application's actual traffic patterns.
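If you would rather script the change than click through the Console, here is a minimal sketch of the request body that an autoscaler patch call would carry. It assumes the Compute Engine autoscaler resource exposes the setting as `autoscalingPolicy.cpuUtilization.predictiveMethod` with `NONE` meaning "off"; the helper name is hypothetical, and the block only builds and prints the JSON, it does not call any API. Please verify the field names against the current API reference before using it.

```python
import json


def build_disable_predictive_patch(utilization_target=None):
    """Build a patch body that switches predictive autoscaling off.

    Hypothetical helper: field names are assumed from the Compute Engine
    autoscaler resource and should be checked against the API reference.
    """
    cpu_policy = {"predictiveMethod": "NONE"}  # assumed value for "disabled"
    if utilization_target is not None:
        # Target CPU utilization is expressed as a fraction, e.g. 0.6 for 60%.
        if not 0 < utilization_target <= 1:
            raise ValueError("utilization_target must be in the range (0, 1]")
        cpu_policy["utilizationTarget"] = utilization_target
    return {"autoscalingPolicy": {"cpuUtilization": cpu_policy}}


if __name__ == "__main__":
    # Print the body for review before sending it with your tool of choice.
    print(json.dumps(build_disable_predictive_patch(0.6), indent=2))
```

From the command line, the equivalent is expected to be `gcloud compute instance-groups managed update-autoscaling MIG_NAME --cpu-utilization-predictive-method none`, where `MIG_NAME` is a placeholder; confirm the flag with `--help` for your SDK version.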

  2. Check the Network and Firewall Rules:

Ensure your firewall rules permit traffic to reach your instances and that health checks aren't being blocked, as misconfigurations can disrupt proper functionality. Additionally, verify that your instances are correctly set up to accept traffic from your load balancer or other ingress sources.
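As a rough way to sanity-check a firewall rule, the sketch below tests whether a rule's source ranges cover the health check probe ranges. It assumes the probes come from `35.191.0.0/16` and `130.211.0.0/22`, which are the ranges documented for most load balancer and MIG health checks; your load balancer type may use others, so treat the list as an assumption. The function name is hypothetical and the check runs entirely offline on the ranges you pass in.

```python
import ipaddress

# Assumed probe source ranges; confirm them for your load balancer type.
HEALTH_CHECK_RANGES = ["35.191.0.0/16", "130.211.0.0/22"]


def missing_health_check_ranges(allowed_source_ranges):
    """Return the probe ranges that no single allowed range fully covers.

    Note: a probe range covered only by the union of several smaller
    allowed ranges is still reported as missing by this simple check.
    """
    allowed = [ipaddress.ip_network(r) for r in allowed_source_ranges]
    missing = []
    for probe in HEALTH_CHECK_RANGES:
        probe_net = ipaddress.ip_network(probe)
        covered = any(
            a.version == probe_net.version and probe_net.subnet_of(a)
            for a in allowed
        )
        if not covered:
            missing.append(probe)
    return missing


if __name__ == "__main__":
    # Example: an ingress rule that only allows one of the two probe ranges.
    print(missing_health_check_ranges(["35.191.0.0/16", "10.0.0.0/8"]))
```

An empty result means every assumed probe range is allowed by the rule you checked; it says nothing about rule priority, target tags, or ports, which still need to be reviewed separately.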

  3. Monitor and Adjust Initialization Period:

Ensure that your initialization period accounts for your application's startup time, as an incorrect setting can trigger premature scaling. After disabling Predictive Autoscaling, monitor CPU utilization to confirm that CPU-based autoscaling is functioning as expected. If scaling isn’t responsive enough, consider adjusting parameters like target CPU utilization or cooldown period. You can also revisit Predictive Autoscaling later if needed, but ensure it’s properly configured and closely monitored.
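To reason about what target CPU utilization to pick, it can help to estimate the group size that a given target implies. The sketch below is a simplified model, not the autoscaler's actual algorithm: it assumes the recommended size is the total CPU load divided by the target, rounded up and clamped to the group's limits, and it ignores the initialization period, stabilization, and predictive signals. All names are hypothetical.

```python
import math


def estimate_group_size(cpu_per_instance, target_utilization,
                        min_replicas, max_replicas):
    """Estimate the instance count needed to bring average CPU to the target.

    cpu_per_instance: current utilization of each running instance (0.0-1.0).
    target_utilization: desired average utilization, e.g. 0.6 for 60%.
    """
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in the range (0, 1]")
    if not cpu_per_instance:
        return min_replicas  # nothing running, fall back to the minimum
    total_load = sum(cpu_per_instance)
    needed = math.ceil(total_load / target_utilization)
    # Clamp to the configured minimum and maximum number of instances.
    return max(min_replicas, min(max_replicas, needed))


if __name__ == "__main__":
    # Three instances at 90% CPU with a 60% target suggests growing to 5.
    print(estimate_group_size([0.9, 0.9, 0.9], 0.6, 1, 10))
```

Comparing this estimate with the size the MIG actually settles on after you disable Predictive Autoscaling gives a quick indication of whether CPU-based scaling is responding the way you expect.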

The issue appears to stem from Predictive Autoscaling forecasting zero instances, which may override standard CPU-based scaling and lead to the behavior you observed. This does not appear to be caused by an ongoing GCP outage; it looks like a configuration issue within your control, especially since it affects both your staging and production environments. While it's still wise to check the Google Cloud Status Page, the root cause is almost certainly tied to your current autoscaling setup and its reliance on CPU utilization alone.

Additionally, consider contacting Google Cloud Support for help adjusting your autoscaling configuration and mitigating the scaling issues you're facing.

Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.