Hey guys
So I’m currently dealing with a persistent integration issue between a Google Cloud Function (Node.js, running Express, orchestrating Puppeteer for web scraping) and Airflow running on Cloud Composer. Here’s my setup and the issue in brief:
Setup:
Cloud Function: Node.js 18 with Express. It accepts a POST with a JSON body, downloads and dynamically imports a JS file from Cloud Storage, runs Puppeteer to scrape from a platform and save files, processes the data (converts CSV to JSON), and uploads the data to BigQuery.
Airflow (Cloud Composer): Uses Python’s google.auth.transport.requests.AuthorizedSession with ID token auth to invoke the Cloud Function synchronously (POST, with a 1-hour timeout).
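For reference, the calling side follows roughly the pattern below. This is a minimal sketch, not the actual DAG code: the helper names `build_session` and `invoke_function`, the URL, and the payload are placeholders, and it assumes the ID token credentials come from `google.oauth2.id_token.fetch_id_token_credentials` with the function URL as the audience.

```python
from typing import Any, Dict


def build_session(audience: str):
    """Return an AuthorizedSession that attaches an ID token for `audience`.

    Assumption: credentials are resolved from the environment (e.g. the
    Composer service account); `audience` is the Cloud Function URL.
    """
    # Imported lazily so this module can be loaded without google-auth installed.
    from google.auth.transport.requests import AuthorizedSession, Request
    from google.oauth2 import id_token

    credentials = id_token.fetch_id_token_credentials(audience, request=Request())
    return AuthorizedSession(credentials)


def invoke_function(
    session: Any,
    url: str,
    payload: Dict[str, Any],
    timeout_s: int = 3600,
) -> Any:
    """POST `payload` as JSON and block until the function answers or times out."""
    # timeout_s mirrors the 1-hour client-side timeout described above.
    response = session.post(url, json=payload, timeout=timeout_s)
    response.raise_for_status()
    return response.json()
```

In a task this would be called as `invoke_function(build_session(url), url, {...})`, where `url` is the function's HTTPS endpoint. The session is passed in separately so the call can be exercised with a stub session outside GCP.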
Issue:
For small jobs (a few iterations of Puppeteer), HTTP 200 is sent back to Airflow and it proceeds normally.
For large jobs, the logs show everything completing as expected, and res.status(200).send(...) or .json(...) is called at the end, but Airflow never receives the HTTP response and simply waits until its own timeout. The Cloud Function has finished, and its logs confirm the response was sent.
No errors are thrown in either Airflow or Cloud Function, and all Node.js file handles and promises appear resolved at the end. Printing process._getActiveHandles() and process._getActiveRequests() shows only normal items (sockets, short-lived FSReqCallback).
What I've tried:
Awaited all async operations.
Checked with why-is-node-running, which reported nothing left open; running under a test harness leaked no more with the patch than without it.
Forced Content-Length on the response. Broke the processing into smaller files and shorter Puppeteer loops.
Increased memory and CPU for both the Cloud Function and Composer. Upgraded the Python libraries (requests, google-auth, etc.) to newer versions.
Switched between res.json, res.send, and plain HTTP responses.
The problem occurs only when running on GCP, only with heavy jobs, and only in Cloud Functions. The Cloud Function is not bailing out early (all logs are written through to completion). What else can cause a Cloud Function (Node.js/Express) to appear to finish and send its HTTP response, while the caller (Airflow/Composer) never receives that response?
Is there any known bug or limitation in GCF or Cloud Composer/AuthorizedSession that might cause this behavior for heavy/long-running synchronous requests?