Combine AVG function with DISTINCT records

I'm using a scorecard to calculate the average of a column. The information used for the average is shown in a table. If I compare the data in the table with the average, the calculation of the average does not appear to be correct. When I dive deaper into the data, the problem seems to be caused because the data-source contains "duplicate records". The records aren't really duplicate, but when you take into account the calculation of the average, the diferences aren't relevant. 
Question: Is it possible to calculate the average based on distinct records in the score card
Table in Looker:
In blue is the column I want the average to be calculated on. As you can see, the average of 5+4+4+2 is not 4.5 but 3.75. This caused by multiple occurences in line 1 (8 times) and 3 (10 times) in the datasource.

 Looker_08-20-2024_01.jpg

 

 

Solved Solved
0 4 747
1 ACCEPTED SOLUTION

Hi @GertVH 

Inside Looker Studio, you need to think different. Most of the solutions for aggregation and unaggregation are based on data blending.

You probably need to create a single-table data blending only with dimensions to deduplicate your rows, it will keep only the different rows thanks to a merging of rows sharing the same values (year and your stats  as 2 dimensions if I understood the use case).
Then, use the blending as the data source of your scorecard and apply an AVG on the stat.

So, to sum-up:
- the blending plays the role of DISTINCT
- the chart applies the AVG aggreg on the blended duplicated data

PS: AVG(DISTINCT ...) could work (never tested it but COUNT(DISTINCT ... do) but it is not officially supported, so I don't recommend to use it

Voila.

Mehdi

View solution in original post

4 REPLIES 4

I think Looker Studio would need an AVERAGE_DISTINCT metric type in order to support this. I can't think of any way to work around that, but I do recommend creating a feature request!

Hi @GertVH 

Inside Looker Studio, you need to think different. Most of the solutions for aggregation and unaggregation are based on data blending.

You probably need to create a single-table data blending only with dimensions to deduplicate your rows, it will keep only the different rows thanks to a merging of rows sharing the same values (year and your stats  as 2 dimensions if I understood the use case).
Then, use the blending as the data source of your scorecard and apply an AVG on the stat.

So, to sum-up:
- the blending plays the role of DISTINCT
- the chart applies the AVG aggreg on the blended duplicated data

PS: AVG(DISTINCT ...) could work (never tested it but COUNT(DISTINCT ... do) but it is not officially supported, so I don't recommend to use it

Voila.

Mehdi

Hi Mehdi, 
Thanks for your quick support! The problem seems to be solved!
Thanks
Gert

👍 Good News. Happy to help