We have BigQuery tables partitioned (sharded) by year (table_YYYY) into many tables having same schemas. The tables are huge (terabytes) and numerous(30+) (to build just one) and users do not need to select all the years usually only few of them, sometimes one only.
Users need to filter by year (filtering year dimension or via parameter/filter), potentially selecting multiple years (e.g., 2022, 2023).
I'm looking for the most efficient and maintainable way to achieve this in LookML.
Ideally, if a user selects years "2022,2023," the generated SQL should resemble:
SELECT * FROM table_2022
UNION ALL
SELECT * FROM table_2023
How can we accomplish this dynamic UNION behavior based on the filter selection?
We were playing with liquid parameters and templated filters but looks like with parameters user can only select one year (mono selection).
Hey - Great question!!
For this, we can use the wildcard query syntax.
Now, we need to make sure that we only scan the dates that we need for our analysis using _TABLE_SUFFIX
We can set this up with a dimension_group to use as our filter, similar to the below which is based on _YYYYMMDD tables:
dimension_group: event {
type: time
timeframes: [date, week, day_of_week, month, year]
sql: TIMESTAMP(PARSE_DATE('%Y%m%d', REGEXP_EXTRACT(_TABLE_SUFFIX,r'\d\d\d\d\d\d\d\d'))) ;;
}
Hope this helps.
Regards,
Rob