Hi,
Is it possible create a custom assertion passing column and list of accepted values as parameter to check values to column?
DBT has a test called accepted_values that check if column match the only values got.
Is it possible customize a assertion that do the same check?
Solved! Go to Solution.
Dataform does not have a directly equivalent "accepted_values" test as found in DBT. However, you can achieve a similar behavior and create custom assertions using these approaches:
1. Using rowConditions
in table configurations:
If the table or view you want to check is defined within your Dataform code (dataform.json or .sqlx files), you can use rowConditions
assertions. This allows you to write SQL expressions that should evaluate to true for all rows.
config {
type: "table",
assertions: {
rowConditions: [
"`<your_column>` IN ('value1', 'value2', 'value3')" // Replace with actual column name and accepted values
]
}
}
SELECT
... -- your table creation logic
<your_column>
that is not in the specified list, the assertion will fail.2. Creating Manual Assertions:
For more flexibility or to test tables/views not directly defined in Dataform, you can create manual assertions.
a. Using Dataform SQLX files:
-- File: custom_assertions.sqlx
config {
type: "assertion",
assertionName: "check_column_accepted_values"
}
SELECT *
FROM `<your_dataset>.<your_table>`
WHERE `<your_column>` NOT IN ('value1', 'value2', 'value3');
b. Using JavaScript (Dataform Core):
// In your JavaScript file
assert("check_column_accepted_values")
.query(`
SELECT *
FROM \`<span class="math-inline">\{dataset\}\.</span>{table}\`
WHERE \`${column}\` NOT IN ('value1', 'value2', 'value3')
`);
Important Notes:
Replace placeholders:
<your_dataset>
, <your_table>
, and <your_column>
need to be replaced with the actual names.Dataform Behavior:
Dataform Core vs SQLX: The SQLX method is generally preferred for defining Dataform configurations. JavaScript method is more useful if you need dynamic logic or interaction with external data sources.
Dataform does not have a directly equivalent "accepted_values" test as found in DBT. However, you can achieve a similar behavior and create custom assertions using these approaches:
1. Using rowConditions
in table configurations:
If the table or view you want to check is defined within your Dataform code (dataform.json or .sqlx files), you can use rowConditions
assertions. This allows you to write SQL expressions that should evaluate to true for all rows.
config {
type: "table",
assertions: {
rowConditions: [
"`<your_column>` IN ('value1', 'value2', 'value3')" // Replace with actual column name and accepted values
]
}
}
SELECT
... -- your table creation logic
<your_column>
that is not in the specified list, the assertion will fail.2. Creating Manual Assertions:
For more flexibility or to test tables/views not directly defined in Dataform, you can create manual assertions.
a. Using Dataform SQLX files:
-- File: custom_assertions.sqlx
config {
type: "assertion",
assertionName: "check_column_accepted_values"
}
SELECT *
FROM `<your_dataset>.<your_table>`
WHERE `<your_column>` NOT IN ('value1', 'value2', 'value3');
b. Using JavaScript (Dataform Core):
// In your JavaScript file
assert("check_column_accepted_values")
.query(`
SELECT *
FROM \`<span class="math-inline">\{dataset\}\.</span>{table}\`
WHERE \`${column}\` NOT IN ('value1', 'value2', 'value3')
`);
Important Notes:
Replace placeholders:
<your_dataset>
, <your_table>
, and <your_column>
need to be replaced with the actual names.Dataform Behavior:
Dataform Core vs SQLX: The SQLX method is generally preferred for defining Dataform configurations. JavaScript method is more useful if you need dynamic logic or interaction with external data sources.
Thanks @ms4446, it worked.
@ms4446 wrote:assertionName: "check_column_accepted_values"
What is the significance of "assertionName:"? I can't find anything about it the docs.
Edit: I don't believe it is a valid property.
Hi @DataEngineer ,
Sorry for the confusion. You are correct that assertionName
is not a standard property in Dataform assertion configurations.
To Clarify In Dataform, particularly within SQLX files, assertions are configured using the following elements:
Config Block:
The config { ... }
block is essential for defining the configuration of an assertion. This is where you specify the type of SQLX file and other optional properties.
Type:
It's mandatory to specify type: "assertion"
in the config block. This clearly indicates that the SQLX file is intended to define an assertion.
Database, Schema, Table (Optional):
These properties are optional and can be used to specify the database, schema, and table where the assertion is relevant. However, their necessity depends on the structure of your Dataform project and the specific requirements of your assertion.
SQL Query:
The heart of an assertion is the SQL query. This query should be crafted to return rows only when the assertion's condition is violated, i.e., when the data fails to meet the specified criteria.
Example of a Dataform assertion in a .sqlx
file:
config {
type: "assertion"
// Optional: database = "your_database"
// Optional: schema = "your_schema"
}
SELECT *
FROM `<your_dataset>.<your_table>`
WHERE `<your_column>` NOT IN ('value1', 'value2', 'value3');