Why does blended data show different totals compared to manually summed values?

Hello everyone,

I hope you're doing well.

I have four separate Google Analytics accounts (Belgium, France, Germany, and NL) that I blended in Looker Studio using a Full Join. My goal is to calculate total metrics such as Total Users and Total Sessions across all four sources.

However, when I compare the blended total with the sum of the individual totals (calculated manually by adding up the values from each source separately), the numbers do not match.

I suspect it might be due to duplication, missing values, or the way aggregation works in Looker Studio, but I’m not sure. Here are some details:

  • The blending key I used is Date.
  • Each Google Analytics source tracks the same website but for different regions.
  • Some channel groups (e.g., "Organic Shopping") exist in one source but not in others.

My questions:

  1. What could be causing the discrepancy between the blended data total and the manually summed values?
  2. Is Full Join the right approach, or should I use a different type of blending (e.g., Union or Left Join)?
  3. How does Looker Studio handle metric aggregation in blended sources, and is there a way to force it to sum correctly?

Any insights or best practices would be greatly appreciated.

Thank you in advance for your help!

Best regards,

0 1 655
1 REPLY 1

Metric aggregation can absolutely be duplicated in blended sources.

Consider these example tables:

Date Store Revenue
12-30-2024 A 500
12-30-2024 B 600

 

Date Product Count sold
12-30-2024 X 30
12-30-2024 Y 40

Now, if we join these together on date (the join type, full/outer/left/right/inner, won't make a difference here since all dates are represented in all tables), we'd get the following table:

date Store Revenue Product Count Sold
12-30-2024 A 500 X 30
12-30-2024 A 500 Y 40
12-30-2024 B 600 X 30
12-30-2024 B 600 Y 40

There's now a row for each combination of store and product. (AX, AY, BX, BY). If you were to sum one of the numeric columns, your numbers would be double what you'd expect. So if you knew this was going to happen, you could write in your own sum that divides by the duplication. For revenue, you'd want to divide by the number of products, and for count sold, you'd want to divide by the number of stores. For example: SUM(revenue)/COUNT_DISTINCT(products).

Anyways. I hope that helps address questions 1 and 3. For your second question, the join type won't matter much here, but it does sound like you want your data to be unioned more than joined. If your four data sources were represented as data tables, would you want to just append each table to the bottom of the previous table? If so, a UNION is what you want.