Hi All,
I've the following target: minimize the Mbytes used by Bigquery to load an incremental table.
This is the case:
Source table is a view which select only new added data from a source table:
create or replace SOURCE_VIEW
as
select *
from SOURCE_TABLE
WHERE ingestionTime >= maxIngestionTime
I've partitioned the target table on a date field which should be a good indicator if a record could be modified or not
config {
name : "whmovementTEST",
schema : "ptestds1_bronze",
type : "incremental",
uniqueKey : ["key1", "key2"],
bigquery: {
partitionBy: "TIMESTAMP_TRUNC(compDate, MONTH)"
}
}
I'd like that dataform generates in incremental mode a merge code like following:
MERGE into ${self()} AS target
USING (
select *
from ${ref("SOURCE_VIEW")}
) as source
ON target.key1= source.key1
AND target.key2= source.key2
${when(incremental(),
`AND target.comDate> (SELECT TIMESTAMP_ADD(MAX(compDate), -1, MONTH) FROM ${self()})`)}
}
WHEN MATCHED UPDATE ....
WHEN NOT MATCHED INSER ....
to scan only some of the partions on target table
How to customize the "ON" condition ? There could be an other way to reach the same purpose?
Thanks
have you tried the updatePartitionFilter option in the config block ?