
FastAPI StreamingResponse on Cloud Run

Hello all,

I'm trying to return a streaming response on Google Cloud Run using FastAPI's StreamingResponse class.

This is the endpoint, which takes around 5-15 seconds to execute:

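from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

@app.get("/places/{place_id}/all-stream")
async def get_all_stream(place_id: str):
    # getAllStream (defined elsewhere) yields the three strings as they are processed
    return StreamingResponse(getAllStream(place_id), media_type="text/event-stream")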

The endpoint transmits three strings. When I run it locally, it correctly streams each string as soon as it is processed. But when I deploy it, it appears to buffer the response and send it only once processing is complete.
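
For reference, a generator of the kind passed to StreamingResponse might look like the minimal sketch below. The original getAllStream is not shown, so the name get_all_stream_sketch, the step labels, and the delay parameter are hypothetical stand-ins. Each chunk is formatted as a server-sent event (a "data:" line followed by a blank line) to match the text/event-stream media type.

```python
import asyncio
from typing import AsyncIterator, List


async def get_all_stream_sketch(place_id: str, delay: float = 0.0) -> AsyncIterator[str]:
    """Yield three server-sent events, one per (hypothetical) processing step."""
    for step in ("details", "reviews", "summary"):  # hypothetical step names
        # Stand-in for the real work, which takes a few seconds per step.
        await asyncio.sleep(delay)
        # An SSE event is a "data:" line terminated by a blank line.
        yield f"data: {step} for {place_id}\n\n"


async def collect_chunks(place_id: str) -> List[str]:
    """Drain the generator into a list, which is handy for a quick local check."""
    return [chunk async for chunk in get_all_stream_sketch(place_id)]


if __name__ == "__main__":
    for chunk in asyncio.run(collect_chunks("abc")):
        print(repr(chunk))
```

Because each chunk is yielded as soon as its step finishes, any delay before the first chunk reaches the client would have to come from a layer after the generator.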

I can't understand why this happens. I'm available for further clarification.

Thank you in advance.

Best,
Tommaso
2 REPLIES

Hi @tommyiaqmenumal,

Welcome to the Google Cloud community!

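Buffering like this usually comes from a layer between your application code and the client rather than from StreamingResponse itself. Note that HTTP/2 does not require buffering: it delivers a response body incrementally, and by default Cloud Run forwards requests to the container as HTTP/1.1 unless end-to-end HTTP/2 is enabled.

Here are some things to check, based on the Google Cloud documentation:

  • HTTP/2 and Response Buffering: Neither HTTP/1.1 chunked transfer encoding nor HTTP/2 requires the entire response to be prepared before it is sent, and Cloud Run supports streaming responses. If the chunks still arrive together, look for a layer that may be buffering them, for example compression middleware in the app or a proxy in front of the service.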
  • Increase Cloud Run Timeout: If a request takes longer than the configured timeout, increase the request timeout for your Cloud Run service. The default timeout is 5 minutes (300 seconds), and you can extend it up to 60 minutes.
  • FastAPI Configuration: Make sure you're using the latest version of FastAPI to ensure proper async handling. (Please use this link with caution, since it is not maintained by Google and could be inaccurate or outdated.)
  • Configure Cloud Run Resources: Check your Cloud Run resource configuration (memory and CPU allocation) to ensure that your service has enough resources to handle the streaming request effectively.

Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.

Thanks for your answer:

  • HTTP/2 and Response Buffering: So the problem is the buffering, can I disable it?
  • Increase Cloud Run Timeout: The timeout should not be the issue. The function takes around 15 seconds and sends a chunk every 5. The problem is that I need to add more chunks, so waiting for the whole response would be too slow. I need streaming, and it is supported on AWS.
  • FastAPI Configuration: I'm using the latest FastAPI version and it works properly locally, so I guess it is configured correctly.
  • Configure Cloud Run Resources: It is not heavy on memory, and I have 2 GB of RAM and 1 CPU, which is more than enough for 1 instance, but it still does not work.

Is there any workaround? Or should I switch to some other cloud service?
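
One way to narrow down where the buffering happens is to time when each chunk reaches the client. The sketch below is only a diagnostic aid under stated assumptions: time_chunks and looks_buffered are hypothetical helper names, and the 0.5-second threshold is an arbitrary choice. It records the arrival time of each chunk from any iterable and flags the case where everything arrived at nearly the same moment.

```python
import time
from typing import Iterable, List, Tuple


def time_chunks(chunks: Iterable[bytes]) -> List[Tuple[float, bytes]]:
    """Record (seconds since start, chunk) for each chunk as it arrives."""
    start = time.monotonic()
    arrivals = []
    for chunk in chunks:
        arrivals.append((time.monotonic() - start, chunk))
    return arrivals


def looks_buffered(arrivals: List[Tuple[float, bytes]], min_gap: float = 0.5) -> bool:
    """Return True when the first and last chunks arrived within min_gap seconds."""
    if len(arrivals) < 2:
        return False  # not enough chunks to compare arrival times
    return arrivals[-1][0] - arrivals[0][0] < min_gap
```

For example, passing the response object from urllib.request.urlopen(url) to time_chunks would time each line of the event stream, where url is the deployed endpoint. If the same check reports streaming locally but buffering against Cloud Run, the delay is introduced after the application code.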