Hey y'all
I'm trying to configure Datastream to stream data from Cloud SQL to BigQuery.
Our Cloud SQL instance is in Project-A, and Datastream is in Project-B.
Project-A has VPC-A, which is peered with VPC-B in Project-B.
VPC-A runs cloudsql_auth_proxy, and from VPC-B (Project-B) I can ping it and log in to the database through it using psql, but I can't connect to cloudsql_auth_proxy from Datastream. I need to connect Datastream (Project-B) to Cloud SQL (Project-A) over a private connection.
Thank you!
Hi @realsharip,
Welcome to Google Cloud Community!
Actually, Datastream is a serverless product, so it isn't physically inside VPC-B in Project-B; it only reaches VPC-B through its own peering.
And because VPC peering isn't transitive, the chain Datastream > VPC-B > VPC-A > Cloud SQL doesn't work. To get around this, you'll need to either:
a) set up an additional reverse proxy in VPC-B to forward the traffic to the Auth Proxy in VPC-A (which will then connect to Cloud SQL)
or
b) set up a Shared VPC, peer Datastream to it, and put the reverse proxy there (I think in this case you don't need the additional Auth Proxy and can just point the reverse proxy at the Cloud SQL database)
In either case, make sure that Datastream is pointing to the proxy's IP, not the database's.
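To make option (a) a bit more concrete, here's a rough sketch of what the reverse proxy does: it listens on a port inside VPC-B and relays raw TCP bytes to the Auth Proxy in VPC-A. In practice you'd more likely use an off-the-shelf TCP proxy on a small VM; this Python version is only an illustration. The function names, the upstream IP `10.0.0.5`, and port `5432` are all made-up placeholders, not values from your setup.

```python
import asyncio

# Hypothetical placeholders: swap in the Auth Proxy's private IP and port in VPC-A.
UPSTREAM_HOST = "10.0.0.5"
UPSTREAM_PORT = 5432
LISTEN_PORT = 5432


async def pipe(reader, writer):
    """Copy bytes from reader to writer until EOF, then close the writer."""
    try:
        while data := await reader.read(65536):
            writer.write(data)
            await writer.drain()
    except ConnectionError:
        pass
    finally:
        writer.close()


def make_handler(upstream_host, upstream_port):
    """Build a per-connection handler that relays traffic to the upstream."""
    async def handle(client_reader, client_writer):
        try:
            up_reader, up_writer = await asyncio.open_connection(
                upstream_host, upstream_port
            )
        except OSError:
            # Upstream unreachable: drop the client connection.
            client_writer.close()
            return
        # Relay both directions concurrently.
        await asyncio.gather(
            pipe(client_reader, up_writer),
            pipe(up_reader, client_writer),
        )
    return handle


async def start_forwarder(listen_host, listen_port, upstream_host, upstream_port):
    """Start listening and return the asyncio server object."""
    return await asyncio.start_server(
        make_handler(upstream_host, upstream_port), listen_host, listen_port
    )


async def main():
    server = await start_forwarder("0.0.0.0", LISTEN_PORT, UPSTREAM_HOST, UPSTREAM_PORT)
    async with server:
        await server.serve_forever()

# To run on the proxy VM: asyncio.run(main())
```

Datastream's connection profile would then use the private IP of the VM running this forwarder as its hostname, which is the "point at the proxy's IP" part above.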
(DISCLAIMER: I'm not a networking expert, so I might be missing something)