Hello everybody,
I'm working on a project to migrate data from PostgreSQL databases to BigQuery. I'm using Dataflow to make the connection and Composer to orchestrate the whole flow of creating Dataflow jobs, but I'm having a problem with execution cost: even though no single job runs for more than 10 minutes, I need to perform a very large number of executions. For comparison purposes, here are the numbers.
I have to migrate data from 2,000 databases, and each database has 7 tables, which gives approximately 14,000 jobs/day. Even if each job costs me only $0.07, at the end of a day we are talking about $980, and at the end of a month about $29,400 of cost in Dataflow.
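As a quick sanity check of those numbers, here is a back-of-the-envelope calculation (this assumes a flat $0.07 per job and a 30-day month; real Dataflow billing depends on vCPU, memory, and storage time, so this is only indicative):

```python
# Rough cost estimate for one Dataflow job per (database, table) pair.
# Assumes a flat $0.07 per job and a 30-day month (illustrative only).
databases = 2000
tables_per_database = 7
cost_per_job_usd = 0.07

jobs_per_day = databases * tables_per_database      # 14,000 jobs/day
daily_cost = jobs_per_day * cost_per_job_usd        # ~$980/day
monthly_cost = daily_cost * 30                      # ~$29,400/month

print(f"{jobs_per_day} jobs/day -> ${daily_cost:,.0f}/day -> ${monthly_cost:,.0f}/month")
```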
Hence my request for help: is there any other tool in the GCP stack with which I can perform this type of work at a lower cost? Here are some points that are important for this decision:
- The 2,000 databases have the same 7 tables, only the data content changes in each database, so the customer's request is to consolidate these 2,000 databases into a single one with only 7 tables in BigQuery (see the sketch after this list).
- The tables in each database have a small data volume, so a very powerful machine is not needed to process them.
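As a minimal sketch of what the consolidated target could look like, the 7 BigQuery tables could each hold the rows from all 2,000 source databases, tagged with their origin. The `source_database` column and the dataset/table/field names below are hypothetical, not part of my current setup:

```python
# Minimal sketch of one consolidated BigQuery table, assuming an extra
# "source_database" column is acceptable so rows from all 2,000 source
# databases can coexist in a single table. Names are illustrative only.
from google.cloud import bigquery

client = bigquery.Client()

schema = [
    bigquery.SchemaField("source_database", "STRING", mode="REQUIRED",
                         description="Identifier of the originating PostgreSQL database"),
    bigquery.SchemaField("id", "INT64", mode="REQUIRED"),
    bigquery.SchemaField("payload", "STRING"),
    bigquery.SchemaField("updated_at", "TIMESTAMP"),
]

table = bigquery.Table("my-project.consolidated.table_1", schema=schema)
# Cluster by source_database so per-database queries stay cheap.
table.clustering_fields = ["source_database"]
client.create_table(table, exists_ok=True)
```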
There are different ways to migrate your data from PostgreSQL to BigQuery, such as the following: