Well, the idea is to create an ETL in Dataflow that uses a CSV file as the input for a job that brings in a table from SQL Server. How could I do that? As I understand it, I first have to create a bucket in Cloud Storage and load the CSV there so Dataflow can read it and pass the data on to BigQuery. But what command should I use in Cloud Shell to connect to SQL Server, create the CSV in Cloud Storage from the table I choose, and then continue with the process in Dataflow? What would be the best practice to achieve this?
If you are using Cloud SQL, you can run the following command in Cloud Shell to export your table to a CSV file inside a GCS bucket:
gcloud sql export csv INSTANCE URI --database=DATABASE --query=QUERY [--async] [--escape=ESCAPE] [--fields-terminated-by=FIELDS_TERMINATED_BY] [--lines-terminated-by=LINES_TERMINATED_BY] [--quote=QUOTE] [GCLOUD_WIDE_FLAG …]
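For example, a minimal sketch with placeholder names (my-instance, my-bucket, my-db, and dbo.customers are all assumptions to replace with your own instance, bucket, database, and table, and this assumes CSV export is supported for your Cloud SQL engine); the URI argument is simply the gs:// path where the CSV will be written:
gcloud sql export csv my-instance gs://my-bucket/customers.csv --database=my-db --query="SELECT * FROM dbo.customers"
Once the file lands in the bucket, the Dataflow job (or a direct BigQuery load) can read it from that gs:// path.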
If you are instead working with a SQL Server instance on your own machine through SQL Server Management Studio, follow these steps:
1.- Run PowerShell as administrator.
2.- Run the following command in the console:
Invoke-Sqlcmd -Query "SELECT * FROM <database_name>.<schema_name>.<table_name>;" -ServerInstance "<server_instance>" | Export-Csv -Path "<file_destination_path>" -NoTypeInformation
The destination path would be something like D:\store.csv.
3.- Verify that the CSV file was created in the path you chose.
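At this point the CSV only exists on your local machine, so to continue the process in Google Cloud you still need to copy it into your Cloud Storage bucket. A minimal sketch, assuming a bucket named my-bucket and the local path from the example above (both placeholders):
gsutil cp D:\store.csv gs://my-bucket/store.csv
From there the file can be consumed by the same Dataflow or BigQuery load step you would use after a Cloud SQL export.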
Hello, is there any way to send the data from Cloud SQL (SQL Server) to BigQuery with Dataflow?
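Once the CSV is in Cloud Storage, one option is the Google-provided "Text Files on Cloud Storage to BigQuery" Dataflow template. The sketch below is only an example under assumptions: my-bucket, my-project:my_dataset.my_table, schema.json (the BigQuery table schema) and transform.js (a small JavaScript UDF named transform that maps each CSV line to a JSON record) are all placeholders, and the parameter names should be checked against the current template documentation:
gcloud dataflow jobs run csv-to-bq \
    --region=us-central1 \
    --gcs-location=gs://dataflow-templates/latest/GCS_Text_to_BigQuery \
    --parameters=inputFilePattern=gs://my-bucket/store.csv,JSONPath=gs://my-bucket/schema.json,javascriptTextTransformGcsPath=gs://my-bucket/transform.js,javascriptTextTransformFunctionName=transform,outputTable=my-project:my_dataset.my_table,bigQueryLoadingTemporaryDirectory=gs://my-bucket/tmp
If no transform step is needed, loading the CSV directly with bq load --source_format=CSV is a simpler alternative to running a Dataflow job.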
Thank you @josegutierrez . 😊