Hi community,
I have a question about using variables in Dataform.
According to the documentation [https://cloud.google.com/dataform/docs/reuse-code-includes] we can declare a variable with "const" and then we can import it and use it across different actions.
But is there any way to modify the value of the variable mid-pipeline and then use the updated value in later actions?
I am trying to achieve the following:
> I have a pipeline in which the data is processed in the 3rd step and stored in a table "order_details", and I want to store the max & min "orderID" in a variable.
> There are 30 other steps later in the pipeline, and I want to use the stored variable in a query, basically with "WHERE orderId >= var_min_orderID AND orderId <= var_max_orderID"
Currently we have to query the "order_details" table every time to get the max & min values in each step, and that operation is expensive. So we are looking for a way to store the values in a variable in the 3rd step and use that variable later on instead of querying again and again.
Any help would be appreciated.
Thanks
Hi @rahulmexe,
Welcome to the Google Cloud Community!
Based on my understanding, your goal is to dynamically update a variable (orderID) in step 3 and utilize the updated value in the subsequent steps to improve efficiency by reducing repeated queries.
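You're correct that using const allows values to be reused throughout your project. However, in JavaScript, once a value is assigned to a constant variable, it cannot be modified or reassigned. In addition, Dataform evaluates JavaScript when the project is compiled, before any query runs, so a JavaScript variable cannot hold a value that is computed mid-pipeline.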
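To illustrate the point about const, here is a minimal sketch in plain JavaScript. The file name and the name MIN_ORDER_ID are made up for illustration, and the value is a placeholder rather than anything read from your table.

```javascript
// includes/constants.js (hypothetical file name)
// A value declared with const is fixed once it has been assigned.
const MIN_ORDER_ID = 1000; // placeholder value, not read from order_details

let reassignError = null;
try {
  // Attempting to reassign a const binding throws a TypeError.
  MIN_ORDER_ID = 2000;
} catch (err) {
  reassignError = err;
}

// The original value is unchanged after the failed reassignment.
console.log(MIN_ORDER_ID, reassignError instanceof TypeError);
```

Even with let instead of const, the assignment would only happen while the project is being compiled, so it still could not pick up a result produced by step 3 at run time.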
Another approach you might consider is using a materialized view. By creating a materialized view in Dataform, you can store the minimum and maximum orderID values from the order_details table. This view will automatically update to reflect any changes made to the base table, ensuring that your data remains current without the need for manual updates. You can use this in your subsequent steps instead of repeatedly querying the order_details table.
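As a rough sketch of the materialized view idea, the file below uses Dataform's JavaScript API. The view name order_id_bounds, the column aliases min_order_id and max_order_id, and the helper functions boundsQuery and orderIdFilter are all assumptions made for illustration. Whether BigQuery accepts this exact query as a materialized view depends on its materialized view restrictions, so please treat it as a starting point rather than a tested solution.

```javascript
// definitions/order_id_bounds.js (hypothetical file name)

// Builds the SQL for a one-row view holding the min and max orderID.
function boundsQuery(orderDetailsRef) {
  return `SELECT MIN(orderID) AS min_order_id, MAX(orderID) AS max_order_id FROM ${orderDetailsRef}`;
}

// Builds a WHERE-clause fragment that reads the bounds from the small view
// instead of scanning order_details again.
function orderIdFilter(boundsRef) {
  return `orderID BETWEEN (SELECT min_order_id FROM ${boundsRef}) ` +
    `AND (SELECT max_order_id FROM ${boundsRef})`;
}

// `publish` is a global that exists only inside Dataform; the guard lets
// this file also run in plain Node.js for a quick check.
if (typeof publish === "function") {
  publish("order_id_bounds", { type: "view", materialized: true })
    .query((ctx) => boundsQuery(ctx.ref("order_details")));
}
```

In the later steps, the filter could then read from order_id_bounds, for example WHERE followed by the output of orderIdFilter. To call the helper from SQLX files, it would need to live in an includes file and be exported from there.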
I hope the above information is helpful.