Hi,
Following this conversation, I decided to put my case as a new message here. My goal is to load a table with data from CSV files using Dataform. For that, I'm using a sqlx file like this::
LOAD DATA OVERWRITE
[MY_PROJECT].ILS.test_david_ILS_categories ( category_id String,
category_name String )
FROM FILES ( format = 'CSV',
uris = ['gs://[PROJECT_WITH_DATA_SOURCES]/categories_2024_02*.csv']);
SELECT
*
FROM
[MY_PROJECT].ILS.test_david_ILS_categories
That is working fine. But when I add the config block, it fails when executing as a workflow. This is how it looks the sqlx file with the config block:
config {
type: "table"
}
LOAD DATA OVERWRITE
[MY_PROJECT].ILS.test_david_ILS_categories ( category_id String,
category_name String )
FROM FILES ( format = 'CSV',
uris = ['gs://[PROJECT_WITH_DATA_SOURCES]/categories_2024_02*.csv']);
SELECT
*
FROM
[MY_PROJECT].ILS.test_david_ILS_categories
In the details of the workflow, I'm getting this error: Syntax error: Unexpected keyword LOAD at [31:1]
BEGIN
CREATE SCHEMA IF NOT EXISTS `[MY_PROJECT].ILS` OPTIONS(location="US");
EXCEPTION WHEN ERROR THEN
IF NOT CONTAINS_SUBSTR(@@error.message, "already exists: dataset") AND
NOT CONTAINS_SUBSTR(@@error.message, "too many dataset metadata update operations") AND
NOT CONTAINS_SUBSTR(@@error.message, "User does not have bigquery.datasets.create permission")
THEN
RAISE USING MESSAGE = @@error.message;
END IF;
END;
BEGIN
DECLARE dataform_table_type DEFAULT (
SELECT ANY_VALUE(table_type)
FROM `[MY_PROJECT].ILS.INFORMATION_SCHEMA.TABLES`
WHERE table_name = 'test_david_ILS_categories'
);
IF dataform_table_type IS NOT NULL THEN
IF dataform_table_type = 'VIEW' THEN DROP VIEW IF EXISTS `[MY_PROJECT].ILS.test_david_ILS_categories`;
ELSEIF dataform_table_type = 'MATERIALIZED VIEW' THEN DROP MATERIALIZED VIEW IF EXISTS `[MY_PROJECT].ILS.test_david_ILS_categories`;
END IF;
END IF;
BEGIN
CREATE OR REPLACE TABLE `[MY_PROJECT].ILS.test_david_ILS_categories`
OPTIONS(description='''A table description here''')
AS (
LOAD DATA OVERWRITE
[MY_PROJECT].ILS.test_david_ILS_categories ( category_id String,
category_name String )
FROM FILES ( format = 'CSV',
uris = ['gs://[PROJECT_WITH_SOURCE_DATA]/categories_2024_02*.csv']);
SELECT
*
FROM
[MY_PROJECT].ILS.test_david_ILS_categories
);
END;
END;
I suppose that Stored Procedure was expecting a SELECT statement. Is there a way to accomplish the table creation from a CSV directly from Dataform?
--
Best regards
David Regalado
Web | Linkedin | Cloudskillsboost
Solved! Go to Solution.
See the documentation, you should use `operations` type in the config block (with hasOutput=true). Or you could keep `table` and wrap the LOAD statement in the `pre_operations` block.
See the documentation, you should use `operations` type in the config block (with hasOutput=true). Or you could keep `table` and wrap the LOAD statement in the `pre_operations` block.
I just figure that out, but thanks for your answer. The operations type did the trick!
--
Best regards
David Regalado
Web | Linkedin | Cloudskillsboost