
Cloud Run to Cloud SQL connection randomly fails to set up the socket connection

On March 1st at 2 am (GMT), I started seeing error reports for failed database socket connections from Cloud Run Jobs to Cloud SQL (Postgres): the socket was missing.

Some investigation led me to the following notice (not an error) in the logs, starting at the same date and time.

```
failed to refresh the ephemeral certificate for XXX: Post "https://sqladmin.googleapis.com/sql/v1beta4/projects/xxx/instances/xxx:generateEphemeralCert?alt=jso...": oauth2/google: can't get a token from the metadata service; not running on GCE
```

The jobs are deployed with the `gcloud run jobs replace` command using Knative templates that include the `run.googleapis.com/cloudsql-instances` annotation.

This error affects both of my projects and all jobs. It was working before, and I haven't changed anything.

This does not happen for every job execution, but it recurs every few minutes. It seems random, as I couldn't see a pattern.


Hi @robinboening,

Welcome to Google Cloud Community!

Can you provide your logs so that we can get an idea of why this is happening?

Also, can you check your IAM permissions for Cloud SQL? The Cloud SQL Client role should be granted.
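
As a rough sketch of one way to do that check: the snippet below inspects a project IAM policy that has been exported as JSON (for example with `gcloud projects get-iam-policy PROJECT_ID --format=json`) and reports whether a given service account is bound to `roles/cloudsql.client`. The function name, the sample policy, and the service account email are all hypothetical placeholders, and the check only looks for that exact role, so an account that gets the permission through a broader role would still show up as `False`.

```python
import json

# Role ID for the predefined "Cloud SQL Client" role.
CLOUD_SQL_CLIENT_ROLE = "roles/cloudsql.client"


def has_cloud_sql_client(policy: dict, service_account_email: str) -> bool:
    """Return True if the service account is directly bound to Cloud SQL Client.

    `policy` is an IAM policy in its JSON form: a dict with a "bindings" list,
    where each binding has a "role" and a list of "members".
    Note: this does not detect access granted through broader roles.
    """
    member = f"serviceAccount:{service_account_email}"
    for binding in policy.get("bindings", []):
        if binding.get("role") != CLOUD_SQL_CLIENT_ROLE:
            continue
        if member in binding.get("members", []):
            return True
    return False


# Hypothetical policy for illustration only; replace with your exported policy.
SAMPLE_POLICY = json.loads(
    """
    {
      "bindings": [
        {
          "role": "roles/cloudsql.client",
          "members": ["serviceAccount:jobs-runner@example-project.iam.gserviceaccount.com"]
        },
        {
          "role": "roles/run.invoker",
          "members": ["serviceAccount:other@example-project.iam.gserviceaccount.com"]
        }
      ]
    }
    """
)

if __name__ == "__main__":
    email = "jobs-runner@example-project.iam.gserviceaccount.com"
    print(has_cloud_sql_client(SAMPLE_POLICY, email))
```

If this returns `False` for the service account your Cloud Run job runs as, the role binding is worth a closer look before digging further into the logs.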

You can check this documentation on connecting to Cloud SQL from Cloud Run.

In addition, you may file this as an issue if the suggestions above do not work.

Hi @robinboening, I also noticed something similar on my Cloud Run revision yesterday. Randomly, a few instances started failing to connect to the Cloud SQL Admin API with a timeout > 10s. The issue persisted for a few hours, only sometimes and only on some instances, until I deployed a new revision (with identical parameters). Did you ever figure out what was causing your problem?

Hi @robertcarlos, I'm experiencing the same issue as well. Our Cloud Run service occasionally loses DB connections, and I see that it failed to refresh the ephemeral certificate before the connection could not be set up. We have the `Cloud SQL Admin API` enabled as well. Any idea why we are still facing this issue?

Any new info on this question? I'm seeing something similar happen.

I'm experiencing the same issue:

Cloud SQL connection failed. Please see https://cloud.google.com/sql/docs/mysql/connect-run for additional details: connection to Cloud SQL instance at 34.xxx.xxx.145:3307 failed: timed out after 10s


My setup is a Cloud Run Node.js app connected to a Cloud SQL Postgres instance. The Cloud Run service had been running fine with the same setup. Without any change or new revision deployment, it started failing.

I managed to hotfix it by deploying a new revision to a new region.

Here's a bug report from today: https://issuetracker.google.com/issues/416099577
