
Cloud Run Auto Scaling behaviour

I have a scraping app that uses 2 Cloud Run services: a stateless frontend CR service built in React, and a backend NodeJS CR service that is private (accessible only by the frontend service). The backend service connects to a Cloud SQL instance via private IP and also uses a Redis instance. A static egress IP is configured for the backend CR service so that it can be whitelisted by a 3rd party API provider.

Cloud Run backend configuration

  1. Cloud Run 2nd Gen
  2. 1 vCPU and 512 MB RAM
  3. CPU is only allocated during request processing
  4. Min 0 and Max 5 instances

What I am seeing is a bunch of 500 and 429 errors for the Cloud Run backend service. There is no issue with the frontend service. According to the CR metrics I do not see any cold starts (0 -> 1 scaling). Max concurrent requests is also very low (around 10).

I am looking to take my app to production, and I see that for only around 60 requests the CR service scales to 4 instances, so I want to understand CR's autoscaling behaviour. CPU utilisation is around 64% (from what I understand, 60% is the default out-of-the-box CPU utilisation target) and memory utilisation is around 15%. Is the autoscaling happening because CPU utilisation is > 60%, and is it possible to change this setting? The real question is why CR scales to 4-5 instances for just 60 requests with low CPU and memory usage (I say low because CPU is around 60% and memory around 15%). I don't see any other reason that explains the autoscaling behaviour, for example Cloud SQL not keeping up with CR autoscaling, or SIGTERM crashes that might cause CR to spin up a new instance.

Below are a few metrics 

[Attached charts: Max_Concurrent_Requests.png, Container_Memory_Utilization.png, Container_CPU_Utilization.png]

2 REPLIES

Hello,

Cloud Run indeed scales up when instance CPU utilization reaches 60%; this is not configurable. Cloud Run prioritizes getting your requests served quickly, so it scales up rather than risk not being able to serve a request. We are looking at ways to make scaling more configurable, but today this is how it works.
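To make the arithmetic concrete, here is a minimal sketch of a generic target-tracking estimate (the style of formula used by autoscalers such as the Kubernetes HPA). It is an assumption for illustration only, not Cloud Run's actual algorithm, which also takes request concurrency into account. The function name `estimateInstances` and its parameters are hypothetical; the 60% target and the max of 5 instances come from this thread.

```typescript
// Illustrative target-tracking estimate. This is NOT Cloud Run's real
// autoscaler; it only shows why utilisation slightly above the target
// can already push the instance count up.
function estimateInstances(
  currentInstances: number,
  cpuUtilizationPct: number,
  targetPct: number = 60, // CPU target mentioned in this thread
  maxInstances: number = 5, // max instances from the question
): number {
  if (currentInstances < 1 || targetPct <= 0) {
    throw new RangeError("currentInstances must be >= 1 and targetPct > 0");
  }
  // Scale the fleet so that average utilisation would fall back to the target.
  const desired = Math.ceil((currentInstances * cpuUtilizationPct) / targetPct);
  // Never go below one instance or above the configured maximum.
  return Math.min(Math.max(desired, 1), maxInstances);
}

// 4 instances averaging 64% CPU against a 60% target.
console.log(estimateInstances(4, 64)); // 5
```

Under this simplified model, 64% average CPU on 4 instances is enough to ask for a fifth, which is also the configured maximum.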

Do the errors go away if you raise the max instances setting?

Thanks @knet. Yes, the errors go away once I increase the max instances setting, but they recur at a later point. As I said, I am seeing 429 and 500 errors; they are intermittent (roughly once every 3 days), with around 15-20 occurrences of both errors in the past 2 weeks.
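Since the 429 and 500 errors are intermittent, one common mitigation is for the caller (here, the frontend service) to retry with exponential backoff and jitter. The following is a minimal sketch under stated assumptions: the names `isRetryable`, `backoffDelayMs` and `callWithRetry`, the default delays, and the choice to treat both 429 and 500 as retryable are all hypothetical, and retrying a 500 is only safe for idempotent requests.

```typescript
type StatusResponse = { status: number };

// Assumption: 429 and 500 (the two codes seen in this thread) are retried.
function isRetryable(
  status: number,
  retryable: readonly number[] = [429, 500],
): boolean {
  return retryable.includes(status);
}

// Exponential backoff with full jitter: a random delay in [0, ceiling],
// where the ceiling doubles on each attempt up to capMs.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 200,
  capMs: number = 5000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Calls `send` up to maxAttempts times, waiting between retryable failures.
// `send` is any function that performs the request and resolves to an
// object with an HTTP status (for example, a wrapper around fetch).
async function callWithRetry<T extends StatusResponse>(
  send: () => Promise<T>,
  maxAttempts: number = 4,
): Promise<T> {
  let response = await send();
  for (
    let attempt = 0;
    attempt < maxAttempts - 1 && isRetryable(response.status);
    attempt++
  ) {
    const delay = backoffDelayMs(attempt);
    await new Promise<void>((resolve) => setTimeout(resolve, delay));
    response = await send();
  }
  return response; // last response, whether it succeeded or not
}
```

Retries only smooth over short bursts. If the errors persist, the max instances setting is likely still too low for the traffic peaks.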

 
