
Gunicorn (Flask) app deployed on Cloud Run - What are the Optimization techniques?

I deployed a Python/Flask app on Cloud Run with gunicorn as the web server. I am seeing errors on the gunicorn side, and my guess is that some optimization needs to be done. Cloud Run by itself has several fine-tuning parameters, like concurrency and number of workers. Add to that the extra gunicorn configuration (workers, threads, type of events, etc.) and it becomes complicated. What are good optimization techniques for the Cloud Run + Gunicorn combo?

[Attached screenshot: Screenshot 2024-08-06 at 5.15.55 PM.png]

4 REPLIES

A few recommendations on optimization (basically, start with low parameter values and increase them gradually to find the best settings for your app):

1) Set Cloud Run concurrency to a low value, let's say 10.
2) Do the same for CPU and RAM: start with 1 CPU and 0.5 GiB of RAM.
3) For Gunicorn, use the "2 * CPU + 1" formula for the number of workers, which gives 3 when starting with 1 core. Start with 2 Gunicorn threads per CPU core and set the timeout to 30 seconds.
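
To make point 3 concrete, here is a minimal sketch of a `gunicorn.conf.py` that applies those starting values. It is only an illustration, not a tuned configuration. `suggested_settings` and the `GUNICORN_CPUS` environment variable are hypothetical names made up for this example. `bind`, `workers`, `threads` and `timeout` are standard Gunicorn settings, and `PORT` is the variable Cloud Run injects into the container. "2 threads per CPU core" is read here as `threads = 2 * cpus`, which is one possible interpretation.

```python
# gunicorn.conf.py (sketch; values are starting points, not recommendations)
import os


def suggested_settings(cpus: int, threads_per_cpu: int = 2, timeout: int = 30) -> dict:
    """Hypothetical helper: derive starting Gunicorn values from a CPU count."""
    return {
        "workers": 2 * cpus + 1,            # the "2 * CPU + 1" formula
        "threads": threads_per_cpu * cpus,  # Gunicorn applies this per worker
        "timeout": timeout,                 # seconds
    }


# GUNICORN_CPUS is an assumed variable for this sketch; set it to match the
# CPU value configured on the Cloud Run service.
_settings = suggested_settings(int(os.environ.get("GUNICORN_CPUS", "1")))

# Cloud Run tells the container which port to listen on via PORT.
bind = f"0.0.0.0:{os.environ.get('PORT', '8080')}"
workers = _settings["workers"]
threads = _settings["threads"]
timeout = _settings["timeout"]
```

Since `threads` is a per-worker setting, 1 CPU gives 3 workers with 2 threads each, so about 6 requests are handled at once per instance. With Cloud Run concurrency at 10, the remaining requests would wait inside the container, so it is worth comparing the two numbers while tuning.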

Thanks @yrhsk. How does a vCPU map to cores? My CR instance has 2 vCPU and 2 GiB of memory.

Please check this doc: https://cloud.google.com/run/docs/container-contract#cpu . A vCPU is described there as the "approximate equivalent CPU time of a single hardware hyper-thread on variable CPU platforms". On a CPU with hyper-threading enabled, each physical core is exposed as two hardware threads (vCPUs in our case), so one vCPU is roughly half a physical core.
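
As a rough worked example for the 2 vCPU instance, the sketch below applies the "2 * CPU + 1" formula under both readings of "CPU". The helper names are hypothetical and the two-threads-per-core ratio is the assumption from this reply, not a guarantee about the underlying hardware.

```python
def physical_core_equivalent(vcpus: float, threads_per_core: int = 2) -> float:
    """Assumes each physical core exposes `threads_per_core` hyper-threads."""
    return vcpus / threads_per_core


def workers_for(cpu_count: float) -> int:
    """The '2 * CPU + 1' formula, truncated to a whole number of workers."""
    return int(2 * cpu_count + 1)


vcpus = 2
# Reading 1: "CPU" means physical cores, so 2 vCPU is about 1 core.
workers_by_core = workers_for(physical_core_equivalent(vcpus))
# Reading 2: "CPU" means vCPUs as configured on the Cloud Run service.
workers_by_vcpu = workers_for(vcpus)
print(workers_by_core, workers_by_vcpu)
```

The two readings give 3 and 5 workers for 2 vCPU. Either can serve as a starting point; memory use and latency under load should decide which one to keep.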

This is the deadliest combination in my opinion (CR + gunicorn). Sometimes too much abstraction is not a good thing; you need to know the internals.