How to filter a measure with HLL Sketch fields

Some customers create fields in BigQuery with HLL (HyperLogLog functions), which create a sketch of a count or value, that will be confirmed during post processing. This post-processing in Looker involves performing an aggregation over the field with a format like: HLL_COUNT.MERGE().

This left us with a challenge, since Looker does not perform the aggregation of an aggregation, which is what we get with this measure definition:

measure: hll_based_measure {
  type: sum    (Aggregation 2)
  sql: HLL_COUNT.MERGE( ${view_name.hll_field});;   (Aggregation 1)
  filters: [filter_field: "Yes"]
}

But, if placing "type: number" to the measure, the filter cannot be applied, which generates another issue. 

Meanwhile, we solved this issue by leaving the measure as "type:number" and replacing the filter by a CASE WHEN clause in the "sql" parameter.

measure: hll_based_measure {
  type: number
  sql: HLL_COUNT.MERGE(CASE WHEN ${filter_field_yesno} THEN  ${view_name.hll_field} ELSE NULL END);;
}

After testing, this brings out the correct numbers, but might face issues when drilling in/out in the measure that is important to take int account. Otherwise, is a solution that might be helpful whan faced with HLL Sketch metrics.

1 1 279
1 REPLY 1

 Hello MariaALoeber, thanks for sharing your experience with HyperLogLog functions with BigQuery and Looker - that's an interesting space.

I understand the challenge you describe.  In the past when faced with similar situations, I have utilized hidden measures to construct the desired drill behavior/link, and then utilize that drill link on the more complicated/custom measure that will actually be shown to end users.  I tried this on a mockup similar to what you describe and I think this will work and maintain the optimal/intuitive drill behavior (i.e. passing measure filter criteria down).

measure: hll_based_measure_with_yesno_criteria {
type: number
sql: HLL_COUNT.MERGE(CASE WHEN ${filter_field_yesno} THEN ${hll_field} ELSE NULL END);;
link: {label:"drill" url:"{ {drill_with_filter_field_helper._link} }"}#EXTRA SPACE ADDED TO LIQUID _LINK, BECAUSE THIS COMMUNITY FORUM WAS THROWING AN ERROR DUE TO LIQUID SYNTAX
}
measure: drill_with_filter_field_helper {
hidden: yes
type: count
filters: [filter_field_yesno: "Yes"]
drill_fields: [user_id]
}

I hope this helps you or future readers take full advantage of the HyperLogLog functions with looker.

Top Labels in this Space