nginx proxy -- sidecar or serverless vpc connector?

Hello -- I asked this on the GCP Slack but didn't hear much, so thought I'd try here.

I’m trying to understand the best way to deploy nginx as a reverse proxy in front of a Python app server (gunicorn) on Cloud Run. It seems like the most recommended option is a sidecar.

I read through the article on this exact method here (https://cloud.google.com/run/docs/internet-proxy-nginx-sidecar). My hang-up with this approach is scaling. I have this application set up in an existing Kubernetes cluster, so I have some idea of how it scales. My nginx pods never scale past 2 because nginx is so efficient…even when the gunicorn pods scale up to 100 or more.

In this scenario, would the Cloud Run sidecar option also scale nginx up to 100 instances when gunicorn does? That feels wasteful if so, as I’d have a lot of underutilized nginx instances (unless I’ve misunderstood something). Curious whether I have that right, or if there's anything else I should consider here.

It seems like the other option is to deploy nginx and gunicorn as two separate Cloud Run services and use a serverless VPC connector, as outlined here (https://chuntezuka.medium.com/connect-2-cloud-run-services-internally-with-serverless-vpc-connector-...). This would keep nginx’s and gunicorn’s scaling independent. Are there any drawbacks or issues with doing things this way?

Thanks! 

 

 

1 ACCEPTED SOLUTION

Hello @ianfitz ,

Welcome to the Google Cloud Community!

In the sidecar pattern, you're correct: NGINX and your application (Gunicorn) are bundled into the same Cloud Run service, sharing the instance's CPU and memory and scaling in tandem -- if the service scales to 100 instances, you get 100 NGINX sidecars alongside them. Each request is processed by both NGINX and Gunicorn within the same container instance, and inter-container traffic stays on localhost, which keeps latency low.
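For reference, here's a minimal sketch of what the sidecar deployment looks like as a Cloud Run service YAML. The service name, image paths, and the gunicorn port are placeholders (assumptions, not taken from your setup); the key points are that only the NGINX container declares a port, gunicorn is reached over localhost, and the container-dependencies annotation starts the app before the proxy:

```yaml
# Hypothetical multicontainer (sidecar) Cloud Run service.
# Deploy with: gcloud run services replace service.yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: app-with-nginx   # placeholder service name
spec:
  template:
    metadata:
      annotations:
        # Start the gunicorn container ("app") before the nginx sidecar.
        run.googleapis.com/container-dependencies: '{"nginx": ["app"]}'
    spec:
      containers:
      - name: nginx
        image: us-docker.pkg.dev/PROJECT/repo/nginx:latest  # placeholder image
        ports:
        - containerPort: 8080  # only the ingress container declares a port
      - name: app
        image: us-docker.pkg.dev/PROJECT/repo/app:latest    # placeholder image
        # Assumes gunicorn listens on localhost:8000; the nginx config would
        # then use: proxy_pass http://127.0.0.1:8000;
```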

Deploying NGINX and Gunicorn as separate Cloud Run services, linked via a Serverless VPC Connector, enables them to scale independently. NGINX would manage incoming requests, forwarding them to Gunicorn, which can then adjust its scale based on its specific load.
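As a rough sketch of that two-service setup (service names, region, images, and connector name below are placeholders; adjust to your project), the Gunicorn service is locked to internal ingress, and the NGINX service routes its egress through the connector so it can reach it:

```sh
# Hypothetical two-service deployment; names and images are placeholders.

# Backend: gunicorn, reachable only from inside the VPC.
gcloud run deploy app-backend \
  --image us-docker.pkg.dev/PROJECT/repo/app:latest \
  --region us-central1 \
  --ingress internal

# Frontend: nginx, public-facing, with egress routed through the VPC connector.
# --vpc-egress all-traffic is needed so calls to the backend's run.app URL
# travel through the VPC and satisfy the backend's internal ingress setting.
gcloud run deploy nginx-frontend \
  --image us-docker.pkg.dev/PROJECT/repo/nginx:latest \
  --region us-central1 \
  --allow-unauthenticated \
  --vpc-connector my-connector \
  --vpc-egress all-traffic
```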

Additionally, Cloud Run’s built-in autoscaling applies per service, so Gunicorn and, if deployed separately, NGINX each adjust their instance counts automatically in response to incoming requests.
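If it helps, you can also set scaling bounds per service, so each one caps independently (the values below are just examples, not recommendations):

```sh
# Example scaling bounds; tune per service based on observed load.
gcloud run services update nginx-frontend --region us-central1 \
  --min-instances 1 --max-instances 5
gcloud run services update app-backend --region us-central1 \
  --min-instances 0 --max-instances 100
```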


Okay @juliadeanne, that all makes sense. I'll try the two-service approach so they can scale independently, connecting them via a serverless VPC connector.

Thanks much -- it was very helpful to validate that this is a normal way of doing things!