
How can I stream word by word from an ADK agent deployed in Vertex AI Agent Engine?

I want to stream the response from an ADK agent deployed using Vertex AI Agent Engine.

I’ve tried the following code (and some variations of it):

 

from datetime import datetime
from zoneinfo import ZoneInfo

# Setup for the deployed agent (project, location, and engine ID are placeholders)
import vertexai
from vertexai import agent_engines
from google.adk.agents.run_config import StreamingMode

vertexai.init(project="PROJECT_ID", location="LOCATION")
remote_app = agent_engines.get(
    "projects/PROJECT_ID/locations/LOCATION/reasoningEngines/ENGINE_ID"
)

# 1. Set the timezone to Ecuador
ecuador_tz = ZoneInfo("America/Guayaquil")

# 2. Get the current time
ecuador_time = datetime.now(ecuador_tz)

# 3. Format it as a string
date_today_str = ecuador_time.strftime("%A, %Y-%m-%d %H:%M:%S")
print(f"Session date and time: {date_today_str}")

# Include the time in the initial state
init_state = {
    "nombre_contacto_ws": "Rick",
    "numero_de_telefono": "+593999163479",
    "fecha_y_hora": date_today_str
}

user_id = "u_007"

remote_session = remote_app.create_session(user_id=user_id, state=init_state)
session_id = remote_session["id"]
print(f"Session created with ID: {session_id}")

print("\n--- Starting SSE event stream ---")

try:
    for event in remote_app.stream_query(
        user_id=user_id,
        session_id=session_id,
        message="Hola",
        streaming_mode="sse"  # I also tried StreamingMode.SSE here
    ):
        print(event)
except Exception as e:
    print(f"An error occurred during streaming: {e}")

print("--- Event stream ended ---")

 

I also tried using StreamingMode.SSE, but it didn't work either. I'm unable to get a word-by-word (i.e., token-level) streaming response; instead, the output arrives in whole reasoning chunks, just like a normal stream_query call.
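
For reference, this is roughly what I mean by token-level streaming. When I run an agent locally with the ADK Runner and RunConfig(streaming_mode=StreamingMode.SSE), I get partial events with incremental text. This is only a sketch (the agent name, model, and app name below are placeholders), not the Agent Engine call I'm asking about:

import asyncio

from google.adk.agents import Agent
from google.adk.agents.run_config import RunConfig, StreamingMode
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types

# Placeholder agent purely for illustration
agent = Agent(name="streaming_agent", model="gemini-2.0-flash", instruction="Answer briefly.")
session_service = InMemorySessionService()
runner = Runner(agent=agent, app_name="streaming_app", session_service=session_service)

async def main() -> None:
    session = await session_service.create_session(app_name="streaming_app", user_id="u_007")

    # SSE streaming mode asks the runner to emit partial (token-level) events
    run_config = RunConfig(streaming_mode=StreamingMode.SSE)

    async for event in runner.run_async(
        user_id="u_007",
        session_id=session.id,
        new_message=types.Content(role="user", parts=[types.Part(text="Hola")]),
        run_config=run_config,
    ):
        # Partial events carry incremental text; the final event repeats the full turn
        if event.partial and event.content and event.content.parts:
            print(event.content.parts[0].text or "", end="", flush=True)

asyncio.run(main())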

I’m starting to suspect that true streaming might not be supported when deploying an ADK agent via Vertex AI Agent Engine.

Can anyone confirm whether this is the case? Or even better, share an example of how to achieve word-by-word streaming if it is indeed possible?

Thanks in advance!




7 REPLIES

Have you found the solution yet? If yes, please let me know
I'm also stuck on this

Hi @fnavarro94,

Welcome to Google Cloud Community!

Since ADK agents on Vertex AI Agent Engine are still in preview, it is possible that token-level streaming isn't fully supported yet or only has limited support.

I recommend submitting an issue report regarding this so that our Engineering Team can look into it. Before filing, please take note of what to expect when opening an issue.

Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.

I'm having the same issue.

I created the following issue:
https://issuetracker.google.com/issues/429459570

I was able to do this by hosting the ADK agent on Cloud Run rather than through Vertex AI Agent Engine.

i.e., like this:


https://google.github.io/adk-docs/deploy/cloud-run/#minimal-command
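
For what it's worth, here's roughly how I consume the stream once the agent is on Cloud Run. It assumes the ADK API server that the Cloud Run deployment exposes, with its session and /run_sse endpoints; the base URL, app name, and IDs are placeholders, and you may need an auth header if the service isn't public:

import json
import requests

BASE_URL = "https://YOUR-SERVICE.run.app"  # placeholder Cloud Run URL
APP_NAME = "my_agent"                      # placeholder: your agent folder name
USER_ID, SESSION_ID = "u_007", "s_001"

# Create (or reuse) a session, optionally seeding initial state
requests.post(
    f"{BASE_URL}/apps/{APP_NAME}/users/{USER_ID}/sessions/{SESSION_ID}",
    json={"state": {"nombre_contacto_ws": "Rick"}},
    timeout=30,
)

# "streaming": True asks the server for partial (token-level) SSE events
with requests.post(
    f"{BASE_URL}/run_sse",
    json={
        "app_name": APP_NAME,
        "user_id": USER_ID,
        "session_id": SESSION_ID,
        "new_message": {"role": "user", "parts": [{"text": "Hola"}]},
        "streaming": True,
    },
    stream=True,
    timeout=300,
) as resp:
    for line in resp.iter_lines():
        if line.startswith(b"data: "):
            event = json.loads(line[len(b"data: "):])
            content = event.get("content") or {}
            for part in content.get("parts") or []:
                if part.get("text"):
                    print(part["text"], end="", flush=True)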

 

 

Thanks for your reply!
I also have it successfully working on Cloud Run.
I'm still looking forward to having it work on Agent Engine, since I would like to leverage the persistent session management. Right now my sessions are lost every time Cloud Run creates a new instance, as I didn't set up persistent storage myself; I'm waiting for the feature to come out. 🙂

You can do that with Cloud Run as well, following this as an example:
https://codelabs.developers.google.com/deploy-manage-observe-adk-cloud-run?hl=es-419#1

All Cloud Run instances connect to the same database that manages sessions, so it is scalable.
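
The codelab wires the shared database up through the deploy command; in code, a rough sketch of the equivalent is a DatabaseSessionService pointing every instance at the same database (the connection string, model, and names are placeholders for your own setup):

from google.adk.agents import Agent
from google.adk.runners import Runner
from google.adk.sessions import DatabaseSessionService

# Every Cloud Run instance points at the same database, so session state
# survives instance restarts and is shared across replicas
session_service = DatabaseSessionService(
    db_url="postgresql+pg8000://USER:PASSWORD@HOST:5432/sessions"  # placeholder connection string
)

agent = Agent(name="my_agent", model="gemini-2.0-flash", instruction="Answer briefly.")

runner = Runner(
    agent=agent,
    app_name="my_agent",
    session_service=session_service,  # persistent, shared sessions
)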

I did this because I found that Vertex AI did not handle concurrent requests well. This Cloud Run deployment handles them very well.