Cloud Composer monitoring pod connection timeout

Hello,

I've created a few Composer environments through Terraform (same config, different projects). For the most part, a very smooth process, but I have one particular environment that's stuck with the following error in the "airflow-monitoring" GKE pod:

"[WARNING] connectionpool.py:812 Retrying (Retry(total=2, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x7f94903c2850>, 'Connection to <cluster external endpoint IP address redacted> timed out. (connect timeout=None)')': /api/v1/namespaces/composer-system/pods/airflow-monitoring-<random numbers>"

I tried upgrading the Composer/Airflow image and forcing a restart by changing environment parameters, but I have not been able to get Composer to report metrics or show a healthy web server or database in this project. The environment itself and the DAGs are all working perfectly fine, but I'd like the monitor to work.

Since the monitoring process is trying to reach the cluster's external IP, this is likely a networking issue, but I'm not sure why it works everywhere except in this specific project.

I would really appreciate any troubleshooting guidance.

Thanks!

Accepted solution

I figured it out. For some reason, the "Control plane authorized networks" setting differed across projects. All of the clusters have our VPN gateway IP as the only authorized network, but "Access through Google Cloud public IP addresses" was disabled on the cluster with the failing pod.
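
For anyone who wants to spot this kind of drift without clicking through the console, here is a rough sketch of how the comparison could be scripted. It assumes you have already saved each cluster's description as JSON (for example from `gcloud container clusters describe ... --format=json`) and that the setting appears under `masterAuthorizedNetworksConfig` with the fields `enabled`, `cidrBlocks` and `gcpPublicCidrsAccessEnabled`. Those field names are my understanding of the GKE API rather than something confirmed in this thread, and the function names and sample data are made up for illustration.

```python
import json

# Assumed location of the setting in the cluster description JSON.
# Verify against your own `describe` output before relying on it.
MAN_KEY = "masterAuthorizedNetworksConfig"


def authorized_networks_summary(cluster: dict) -> dict:
    """Pull out the control plane access settings from one cluster description."""
    cfg = cluster.get(MAN_KEY, {})
    return {
        "enabled": cfg.get("enabled"),
        "cidr_blocks": sorted(
            block.get("cidrBlock", "") for block in cfg.get("cidrBlocks", [])
        ),
        # Assumed to correspond to the console option
        # "Access through Google Cloud public IP addresses".
        "gcp_public_cidrs_access_enabled": cfg.get("gcpPublicCidrsAccessEnabled"),
    }


def diff_against_reference(reference: dict, other: dict) -> dict:
    """Return {setting: (reference_value, other_value)} for settings that differ."""
    ref = authorized_networks_summary(reference)
    oth = authorized_networks_summary(other)
    return {key: (ref[key], oth[key]) for key in ref if ref[key] != oth[key]}


if __name__ == "__main__":
    # Hypothetical data: same authorized network, different public IP access flag.
    healthy = {
        MAN_KEY: {
            "enabled": True,
            "cidrBlocks": [{"displayName": "vpn", "cidrBlock": "203.0.113.10/32"}],
            "gcpPublicCidrsAccessEnabled": True,
        }
    }
    failing = {
        MAN_KEY: {
            "enabled": True,
            "cidrBlocks": [{"displayName": "vpn", "cidrBlock": "203.0.113.10/32"}],
            "gcpPublicCidrsAccessEnabled": False,
        }
    }
    print(json.dumps(diff_against_reference(healthy, failing), indent=2))
```

Comparing every cluster against one known-good environment keeps the output small: an empty result means the settings match, and anything else names the setting to look at.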
