I am in trouble because the numbers do not match when comparing the data in GA4 and exported BQ. If you have any clues, I would appreciate it if you could let me know.
【premise】
Im running e-commerce site, and using GA4. Also, I set the export from GA4 to BigQuery using GA4 export function. (https://support.google.com/analytics/answer/9358801)
【overview】
Using GA4's search tool, we created a report that counts the number of sessions for each referrer. Next, in BigQuery to which the GA4 data was exported, the number of sessions was aggregated for each session referrer. Comparing the GA4 report with the BQ results, the number of sessions for a particular session referrer is significantly lower in the BQ results.
【detail】
When aggregating the number of sessions for each referrer of sessions in BigQuery, the key of event_params was grouped by the string_value of source.
[SQL on BQ]
select source ,count(distinct session_id) ,sum(ss_session_start)
from(
select
CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'source') as STRING) as source
,CONCAT(user_pseudo_id,"-", CAST((SELECT value.int_value FROM UNNEST(event_params) WHERE key = 'ga_session_id') as STRING)) as session_id
from XXXXXXXX.events_*ga
where 1=1 and _table_suffix = "YYYYMMDD"
)
group by source
order by source
[things I tried]
・Depending on the referrer of the session, the figures were almost the same. For example, BQ app is very different, but insta is almost the same.
・The total number of sessions is almost the same, and there are a large number of null session referrers. → Therefore, in GA4, what is classified as app etc. is null on BQ? I'm guessing
・I tried to aggregate the page_location with the utm_source extracted by myself using REGEX, but the same tendency was observed.
Hi @aki_t ,
I can think of a few reasons why the numbers might not match up between GA4 and BigQuery.
First, it's possible that there are some sessions that are not being recorded in BigQuery. This could be due to a number of factors, such as a problem with the data collection or the data export process.
Second, it's possible that there are some sessions that are being recorded in both GA4 and BigQuery, but the data is being recorded differently in each system. This could be due to a difference in the way the two systems measure sessions or the way they record data about sessions.
Third, it's possible that there are some sessions that are being recorded in BigQuery, but the data is not being aggregated in the same way in BigQuery as it is in GA4. This could be due to a difference in the way the two systems aggregate data or the way they calculate the number of sessions.
If you're not sure why the numbers are not matching up, I would recommend contacting Google Analytics support or BigQuery support for assistance. They may be able to help you troubleshoot the issue and identify the cause of the problem.