Announcements
This site is in read only until July 22 as we migrate to a new platform; refer to this community post for more details.
Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

ref() when generating multiple tables in Dataform

I want to set up a step in my pipeline, where I 

  • export all tables in my "source" layer dataset to GCS (as a daily snapshot)
  • load the most recent file in GCS for each table, to my "staging" layer dataset

I've written this operation to achieve this:

config { type: "operations" }

js {
const now = new Date()
now.setHours(now.getHours() + 9)
const year = `${now.getFullYear()}`
const month = `${String(now.getMonth() + 1).padStart(2, '0')}`
const day = `${String(now.getDate()).padStart(2, '0')}`
const hour = `${String(now.getHours()).padStart(2, '0')}`
const minute = `${String(now.getMinutes()).padStart(2, '0')}`
const second = `${String(now.getSeconds()).padStart(2, '0')}`
const timestamp = `${year}${month}${day}_${hour}${minute}${second}`
}
for record in (
select
concat("source_dataset.", table_name) as source_table_id
, concat("stg_dataset.", table_name) as stg_table_id
, concat(
'gs://${dataform.projectConfig.vars.snapshot_bucket_name}/'
, table_name
, '/${timestamp}/*.parquet'
) as gs_uri
from `region-asia-northeast1`.INFORMATION_SCHEMA.TABLES
where table_schema = "source_dataset"
)
do
execute immediate ("""
export data
options (
uri = '""" || record.gs_uri || """',
format = 'parquet',
overwrite = false)
as (
select *
from """ || record.source_table_id || """
);
""");

execute immediate ("""
load data overwrite""" || record.stg_table_id || """
from files (
format = 'parquet',
uris = ['""" || record.gs_uri || """']
);
""");
end for
 
However, I am wondering how I can reference each staging table in my downstream files. Since I do not generate each table with a SELECT statement, I cannot use `publish()` to define them, which means I cannot use `ref()` in my downstream SQLX files.
Currently I am using `dependencies` in the config block to manage the graph, but I am wondering if there is a better way.
 
0 0 222
0 REPLIES 0