Shared Session Management for ADK Agents on Cloud Run – PostgreSQL vs. Vertex AI Session Service

fnavarro94

I recently followed this tutorial:

https://codelabs.developers.google.com/deploy-manage-observe-adk-cloud-run#1

I was able to successfully deploy an agent on Cloud Run that scales horizontally while maintaining consistent session management by using an external PostgreSQL database to store sessions across instances.

Now, I’d like to achieve the same behavior using the Vertex AI Session Service instead of PostgreSQL. Is this possible? Has anyone done this before?

I tried replacing the PostgreSQL URI with the Agent Engine service URI, but it didn’t work as expected. Any guidance or examples would be greatly appreciated.

Thanks!

ruthseki

Hi fnavarro94,

Welcome to Google Cloud Community!

Replacing the PostgreSQL URI with the Agent Engine URI won’t work because the session service expects a structured object, not a URI string. You need to instantiate the correct session service class and configure it with your project and location.

And yes, migrating from a self-managed PostgreSQL instance to the managed Vertex AI Session Service for ADK session state management is the recommended approach for production deployments. This transition shifts the responsibility for session state persistence from a self-managed database to a highly-available, scalable, and fully-managed Google Cloud service, thereby reducing operational overhead and eliminating a potential performance bottleneck.
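As a minimal sketch of "instantiate the session service class and configure it with your project and location", something like the following could work. It assumes the `google-adk` package is installed and that `VertexAiSessionService` accepts `project` and `location` keyword arguments. The `SessionConfig` class, the `load_config` helper, and the environment variable names (in particular `AGENT_ENGINE_ID`) are hypothetical names chosen for illustration, not part of the tutorial.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionConfig:
    """Hypothetical settings holder; the field names are illustrative."""

    project: str
    location: str
    agent_engine_id: str

    @property
    def app_name(self) -> str:
        # Full resource name of the Agent Engine instance that owns the sessions.
        return (
            f"projects/{self.project}/locations/{self.location}"
            f"/reasoningEngines/{self.agent_engine_id}"
        )


def load_config(env=os.environ) -> SessionConfig:
    # The variable names below are assumptions; use whatever your
    # Cloud Run service actually defines.
    return SessionConfig(
        project=env["GOOGLE_CLOUD_PROJECT"],
        location=env.get("GOOGLE_CLOUD_LOCATION", "us-central1"),
        agent_engine_id=env["AGENT_ENGINE_ID"],
    )


def make_session_service(cfg: SessionConfig):
    # Imported lazily so this module still loads where google-adk is absent.
    from google.adk.sessions import VertexAiSessionService

    return VertexAiSessionService(project=cfg.project, location=cfg.location)
```

The resulting service would then be passed to the ADK runner in place of the PostgreSQL-backed one, with `cfg.app_name` (or the Agent Engine ID) used as the app name when creating and fetching sessions. Treat that last detail as an assumption to verify against the ADK session documentation.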

Here is some documentation and an example that may provide you with the necessary implementation guidance:

Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.

fnavarro94

Hi ruthseki,

Thank you for your reply and for sharing the helpful links. I’d like to provide more context on my use case and the issues I’m encountering.

I’m building an app that needs to handle thousands—or potentially tens of thousands—of concurrent chats with an agent. I was able to successfully deploy the agent using Vertex AI Agent Builder (Agent Engine), but during stress testing I noticed it doesn’t handle concurrent calls well. For example, when I send 20 concurrent requests, the responses arrive sequentially, with the last one taking nearly a minute to respond. This behavior seems to align with the documented rate limit of ~60 RPM (i.e., ~1 RPS), which is insufficient for my expected load.

I then deployed the same agent on Cloud Run, following this tutorial, and observed much better performance. All 20 concurrent requests completed in parallel, each with an average latency of about 3 seconds. This solution seems scalable and promising.
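For reference, the kind of concurrency test described above can be sketched roughly as follows. This is an illustrative harness, not the exact script used in the tests: `fake_send` is a stand-in that only sleeps, and it would need to be replaced with a real request to the deployed agent. The function names are hypothetical.

```python
import asyncio
import time


async def timed_call(send, payload):
    # Measure how long a single request takes from start to finish.
    start = time.perf_counter()
    await send(payload)
    return time.perf_counter() - start


async def stress(send, n=20):
    # Fire n requests at once and wait for all of them to complete.
    start = time.perf_counter()
    latencies = await asyncio.gather(*(timed_call(send, i) for i in range(n)))
    wall = time.perf_counter() - start
    return latencies, wall


async def fake_send(payload):
    # Stand-in for the real HTTP call to the agent endpoint (assumption).
    await asyncio.sleep(0.05)


if __name__ == "__main__":
    latencies, wall = asyncio.run(stress(fake_send))
    print(f"max latency: {max(latencies):.2f}s, wall time: {wall:.2f}s")
```

If the backend processes requests in parallel, the wall time stays close to the latency of a single request. If it serializes them, the wall time approaches the sum of the individual processing times, which matches the sequential behavior seen with 20 concurrent requests.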

However, I’m unclear on how to configure this Cloud Run setup with a session store other than PostgreSQL. I have two questions I am struggling to find answers to in online documentation and tutorials:

1. Was I using Vertex AI Agent Engine incorrectly?
   Is it designed to handle thousands of concurrent requests? Based on my testing and the rate limits, it seems not, but I’d love to be corrected if I missed something.
2. Can Cloud Run be configured to use Firestore or Vertex AI Session Service for shared session management?
   I would like all Cloud Run instances to share the same session data, just as the tutorial demonstrates using PostgreSQL. This setup involves a set of horizontally scalable ADK Cloud Run instances that all need to connect to the same node that handles session management, so that sessions stay consistent when a user’s requests are routed to a different ADK instance during the conversation. The tutorials and documentation don’t show how to do this with anything other than PostgreSQL, but PostgreSQL is itself not the ideal database for this use case.

Any guidance or clarification you can provide on these points would be greatly appreciated.

Thanks again,

Felipe
