Share the DB connection pool across all Dataflow workers

We have 50 Google Cloud SQL instances, each running PostgreSQL. Each of these instances contains 100 individual databases.

Each Cloud SQL instance has a maximum connection limit of 1000 connections.

We are using Dataflow with a worker limit of 1000, and each Dataflow worker is configured with a connection pool using HikariCP. Specifically, each worker has a connection pool of size 10 for every database within the 50 Cloud SQL instances.

When Dataflow scales up, each new worker machine creates its own connection pools for the same databases, adding yet more connections to the Cloud SQL instances.
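For context, a quick worst-case calculation (a sketch using only the figures quoted above, not measured values) shows why per-worker pools cannot fit within the per-instance limit:

```python
# Worst-case connection count for the setup described above.
# All figures are taken from the question; none are measured.
dbs_per_instance = 100       # databases per Cloud SQL instance
pool_size = 10               # HikariCP pool size per database, per worker
max_workers = 1000           # Dataflow worker limit
instance_conn_limit = 1000   # max connections per Cloud SQL instance

# If every worker holds a full pool to every database on one instance:
worst_case_per_instance = max_workers * dbs_per_instance * pool_size
print(worst_case_per_instance)  # 1000000 potential connections to one instance
print(worst_case_per_instance // instance_conn_limit)  # 1000x over the limit
```

Even if pools open connections lazily, the configured ceiling is three orders of magnitude above what a single Cloud SQL instance allows, which is why scaling up workers is a problem here.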



I have the following queries:

1) Can we share this connection pool along with its state across all the workers of Dataflow?

2) Is there any option in Dataflow to share objects across the workers of Dataflow?

3) Can we use side inputs to share the object between workers?
