Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

Pivot data in Dataform with EXECUTE IMMEDIATE statements

EDIT: I've figured out how I could do this using operations within Dataform. Article here.

Hi everyone,

I have a data asset that needs to be computed every day (hencewhy I'm using dataform - it's a great tool), but I've come across a limitation. The computation requires pivoting data. The dataset looks like this:

InstitutionYearResult
Glasgow20225
Glasgow20231
Aberdeen20236
Aberdeen20226

I need to pivot it such that it looks like this

Institution2022 Result2023 Result
Glasgow51
Aberdeen66

It's easy enough to do in the BigQuery editor using dynamic pivot statements as this blog outlines. Super useful.

The problem is that I can't run EXECUTE IMMEDIATE FORMAT() statements in Dataform. I tried to dynamically reference my tables wrapped within a pivot() function I wrote:

 

 

let pivot = (
    schemaRef,
    tableName,
    aggregatedColumn,
    pivotColumn
) => {

    let refString = `
        ref({
            schema: ${schemaRef},
            name: "${tableName}"
        })`
return `EXECUTE IMMEDIATE FORMAT("""
  SELECT * FROM ${refString}
  PIVOT (SUM(${aggregatedColumn}) as ${aggregatedColumn} FOR ${pivotColumn} IN %s);
""", (SELECT CONCAT("(", STRING_AGG(DISTINCT CONCAT("'", ${pivotColumn}, "'"), ','), ")") FROM ${refString}))
`
};

module.exports = {
    pivot
};

 

 

 And with the SQLX looking like this:

 

 

config {
    type: "table",
    description: "Each record represents a response within typeform.",
    schema: dataform.projectConfig.vars.typeform_intermediate_schema,
    tags: ["typeform_mart"]
}

${
      pivot.pivot(schemaRef = schema,
          tableName = "response_counts",
          aggregatedColumn = "response_count",
          pivotColumn = "form_year")
  }

 

 

But the issue is that the ref() string can only be used within an SQLX context. Big issue, because I need this to run after the tables prior "response_counts" have materialised as a part of the data pipeline.

Any help would be appreciated!

 

1 4 2,305
4 REPLIES 4