Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

Dataform incremental tables

I think I am surely doing something wrong here. I shall be grateful if someone can guide me.

Please see this pseudo code of Dataform pipeline

 

config {
type: "incremental",
description: "Information from equipments",

}




WITH equipments_table AS (

SELECT
EPS.equip_timestamp AS equip_seen,
EPS.session_id,
EPS.num AS equip_no,

V.my_date,
V.etc_Date,
V.force_abc,
V.force_region,
V.xyz_status
FROM ${ref("EPS_LOGIC_GATES")} EPS
JOIN ${ref('XXX_monthly_report')} V ON EPS.num = V.num
AND DATE(EPS.equip_timestamp) >= DATE_ADD( V.etc_Date , INTERVAL -7 DAY )
AND (DATE(EPS.equip_timestamp) < V.my_date OR my_date IS NULL)
WHERE
TIMESTAMP_TRUNC(EPS.equip_timestamp, DAY) >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(),INTERVAL 7 DAY)

)

SELECT DISTINCT * FROM equipments_table

 


As you can see above:

1) In the config, I have set the type as incremental

 

config {
type: "incremental",
description: "Information from equipments",

}

 

This (below) is what I want to achieve:

1) My table: EPS_LOGIC_GATES is being populated with new data every hour.
2) Every hour I am running a Dataform pipeline (after EPS_LOGIC_GATES is populated).
3) The above is one example of my Dataform table.
4) Above I want the transformation in Dataform to be done ONLY ON RECORDS WHERE TIMESTAMP_TRUNC(EPS.equip_timestamp, DAY) >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(),INTERVAL 7 DAY)
5) All the records where the condition 4 is not satisfied, it means the transformation is already carried out in the previous pipeline run (1 hour ago) and I do not want to touch those records
6) I am redoing/overwriting the transformations in some records as a precautionary measure (I am happy with that) within the 7 day interval

IS the above pseudo code achieving my purpose?

@ms4446 , I shall be grateful if you can help

0 2 1,698
2 REPLIES 2