Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

In GCP BigQuery Datafrom, how to perform insert and update operations on same incrementat table?

pnr
Bronze 3
Bronze 3

Hi Team,

I have source table which contains the change stream, I'm trying to replication the table in BigQuery using Dataform. For that I want to read the records from change stream table and insert or update records in the destination table.

When I tried creating two output files one containing insert and other update publish function I get the error

Duplicate action name detected. Names within a schema must be unique across tables, declarations, assertions, and operations 

Insert.js ->

publish("car_table", {type: "incremental"})
.query(ctx => {
    `SELECT src.id, src.brand, src.count FROM ${ctx.ref("car_table_changelog")} AS src WHERE src._metadata_spanner_mod_type="INSERT"`
});

Update.js ->

publish("car_table", {type: "incremental"})
.query(ctx => `SELECT src.id, src.brand, src.count FROM ${ctx.ref("car_table_changelog")} AS src WHERE src._metadata_spanner_mod_type="UPDATE"`);

How can I achieve it? Can I combine both the SQL statements into one publish function?

Solved Solved
0 9 2,689
2 ACCEPTED SOLUTIONS

Below are some resources and example projects that can help you understand and work with Dataform:

  1. Dataform's official documentation and guides: This is your best starting point as it provides comprehensive details, tutorials, and best practices.

  2. Dataform's GitHub repository: This contains sample projects, integrations, and more. It's a useful resource to see real-world setups and understand potential advanced configurations.

  3. Example Projects:

  4. Community Resources:

    • Dataform's Community Slack: This is a great place to ask questions, share experiences, and learn from other Dataform users.

    • Dataform Blog: Contains a variety of articles, from best practices to in-depth explanations of specific features.

View solution in original post

 

A sample script to run the MERGE statement in BigQuery can be written in either SQL or a programming language such as JavaScript.

Here is a sample SQL script:

MERGE INTO car_table AS target
USING car_table_changelog AS source
ON target.id = source.id
WHEN MATCHED AND source._metadata_spanner_mod_type = 'UPDATE' THEN
UPDATE SET
target.brand = source.brand,
target.count = source.count
WHEN NOT MATCHED AND source._metadata_spanner_mod_type = 'INSERT' THEN
INSERT (id, brand, count)
VALUES (source.id, source.brand, source.count);

This script can be run in the BigQuery console or using the BigQuery API.

Here is a sample JavaScript script to run the MERGE statement in BigQuery:

const { BigQuery } = require('@google-cloud/bigquery');

const bigquery = new BigQuery();

const query = `
MERGE INTO car_table AS target
USING car_table_changelog AS source
ON target.id = source.id
WHEN MATCHED AND source._metadata_spanner_mod_type = 'UPDATE' THEN
UPDATE SET
target.brand = source.brand,
target.count = source.count
WHEN NOT MATCHED AND source._metadata_spanner_mod_type = 'INSERT' THEN
INSERT (id, brand, count)
VALUES (source.id, source.brand, source.count);
`;

async function runQuery() {
const job = await bigquery.createQueryJob({
query,
location: 'US',
});

const [result] = await job.getQueryResults();

console.log(result);
}

runQuery();

This script can be run using Node.js.

Which type of script you choose will depend on your specific needs and preferences.

 

View solution in original post

9 REPLIES 9