Hello wonderful people!
I want to build a system that loads data into BigQuery every day. Dataform was the first thing that came to mind, since it's already integrated into the GCP ecosystem and it has cool features like releases & scheduling, version control, etc.
Now, take a look at my code:
config {
  type: "operations",
  hasOutput: true,
  tags: ["raw layer"]
}

LOAD DATA OVERWRITE ${self()}
(
  category_id STRING,
  category_name STRING
)
FROM FILES (
  format = 'CSV',
  field_delimiter = ',',
  skip_leading_rows = 1,          -- Skip header row (if present)
  allow_quoted_newlines = TRUE,   -- Allow newlines within quoted fields
  uris = ['gs://${constants.CLIENT_BUCKET_NAME}/categories*.csv']
)
Once the CSV file above has been loaded, I want to move it to another folder in the GCS bucket. I have more .sqlx files that are part of the raw layer (bronze layer, or whatever it's called these days :) ), so the same would apply to them.
Any ideas on how to do this using Dataform?
--
Best regards
David Regalado