
Maximum concurrent requests per instance on Cloud Run

I am deploying an R Shiny app to Cloud Run. My question is about the "maximum concurrent requests per instance" option. What does a request mean here?
In my app, I use the REST API to access Firestore documents, so a single user makes multiple HTTP requests while using the app. Are these counted as requests on Cloud Run? Or do only app sessions (the app opened in a new tab or a new browser) count as "requests" on Cloud Run?


Hi @udurrani,

Welcome to Google Cloud Community!

Based on the documentation, a "request" in the context of Cloud Run's "maximum concurrent requests per instance" setting refers to a single HTTP request to your application.

Therefore, each individual HTTP request that reaches your Shiny app's container counts against the maximum concurrency limit of that Cloud Run instance, not just each app session (new tab or browser). Note that the limit applies to inbound requests handled by the instance: calls your app makes out to the Firestore REST API are outbound, so they do not take up additional concurrency slots, although the inbound request that triggered them stays in flight until it completes.

The documentation states that the setting controls the maximum number of requests that a single instance can process simultaneously. If a user initiates several requests within a short period, they all count toward that limit. Once the limit is reached, additional requests are queued or handled by other Cloud Run instances as needed.
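To make the counting concrete, here is a minimal sketch of a per-instance concurrency limit. It is a toy model, not Cloud Run's actual scheduler: the `Instance` class, the `route` function, and the limit of 80 are all assumptions chosen for illustration, and every request is treated as still in flight.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Toy stand-in for one container instance with a fixed concurrency limit."""
    max_concurrency: int
    in_flight: int = 0

    def has_capacity(self) -> bool:
        return self.in_flight < self.max_concurrency


def route(instances: list[Instance], max_concurrency: int) -> int:
    """Assign one inbound request to the first instance with a free slot.

    Starts a new instance when every existing one is full, and returns the
    index of the instance that received the request.
    """
    for i, inst in enumerate(instances):
        if inst.has_capacity():
            inst.in_flight += 1
            return i
    instances.append(Instance(max_concurrency, in_flight=1))
    return len(instances) - 1


instances: list[Instance] = []
# 84 simultaneous inbound requests against an assumed limit of 80 per instance.
placements = [route(instances, max_concurrency=80) for _ in range(84)]
print(placements.count(0), placements.count(1))  # prints: 80 4
```

In this model the first 80 requests fill the first instance and the remaining 4 spill over to a second one, regardless of which user sent them.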

I hope the above information is helpful.

Thanks a lot for your detailed response. I have a follow-up question. Suppose 19 users are active and making multiple HTTP requests to Firestore. Initially, they would all be on a single instance. What if those 19 users have already made 78 concurrent requests and a 20th user now makes 6 requests in a short time? If a new container instance is started at that point, which instance will the 20th user be on, the first or the second? R Shiny apps have a unique session for each browser tab the app is opened in, so I suspect the 20th user will see some weird behavior or errors in the app, but I am not sure whether that's correct. Any insights?

Hello udurrani,

You'll probably want to use the session affinity feature here: https://cloud.google.com/run/docs/configuring/session-affinity

If you turn on session affinity, Cloud Run will do its best to send requests from the same session to the same instance, but it's not a 100% guarantee. See the doc for details.
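As a rough illustration of what "best effort" can mean, here is a small sketch of sticky routing using the numbers from the question above. This is a toy model and not how Cloud Run implements session affinity: the `StickyRouter` class, the session IDs, and the per-instance limit of 80 are assumptions, and all requests are treated as still in flight.

```python
class StickyRouter:
    """Toy best-effort sticky routing; not Cloud Run's real implementation."""

    def __init__(self, max_concurrency: int) -> None:
        self.max_concurrency = max_concurrency
        self.in_flight: list[int] = []      # in-flight requests per instance
        self.affinity: dict[str, int] = {}  # session id -> preferred instance

    def _first_free(self) -> int:
        """Return the first instance with a free slot, starting one if needed."""
        for i, load in enumerate(self.in_flight):
            if load < self.max_concurrency:
                return i
        self.in_flight.append(0)
        return len(self.in_flight) - 1

    def route(self, session_id: str) -> int:
        """Prefer the session's previous instance; fall back when it is full."""
        preferred = self.affinity.get(session_id)
        if preferred is not None and self.in_flight[preferred] < self.max_concurrency:
            target = preferred
        else:
            target = self._first_free()
            self.affinity[session_id] = target
        self.in_flight[target] += 1
        return target


router = StickyRouter(max_concurrency=80)  # assumed limit
# 19 users have 78 requests in flight between them.
for n in range(78):
    router.route(f"user-{n % 19 + 1}")
# The 20th user now sends 6 requests in quick succession.
placements = [router.route("user-20") for _ in range(6)]
print(placements)  # prints: [0, 0, 1, 1, 1, 1]
```

In this model the 20th user's first two requests land on the first instance and the rest move to the second once the first is full, so an app that keeps per-session state in memory should be prepared for a session to change instances.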

Thanks, @knet. Session affinity seems very useful for my use case.