Hi there!
Right now, I'm trying to fetch the total number of sessions and the page titles from GA4 to BigQuery. Unfortunately, I can't seem to achieve the same results as what's displayed in GA4. The query I'm running is as follows:
WITH date_range AS (
SELECT
"20231001" AS start_date,
"20231015" AS end_date
)
SELECT
campaign_name,
page_title,
COUNT(DISTINCT(session_id)) AS sessions
FROM (
SELECT traffic_source.name AS campaign_name,
(
SELECT value.string_value
FROM UNNEST(event_params)
WHERE event_name = 'page_view' AND key = 'page_title'
) AS page_title,
CASE
WHEN event_name = 'session_start' then CONCAT (
user_pseudo_id,
CAST(event_timestamp as string)
)END AS session_id,
FROM `project.analytics_11111111.events_*` AS events, date_range
WHERE _table_suffix BETWEEN date_range.start_date AND date_range.end_date
)
WHERE campaign_name = "campaign_name-03102023"
AND page_title = "Home - Brand"
GROUP BY campaign_name, page_title
The output from this query is as follows:
campaign_name | page_title | sessions |
campaign_name-03102023 | Home - Brand | 0 |
As you can see, the sessions are showing up as 0. However, if I comment out the WHERE clause for "page_title," the output looks like this:
campaign_name | page_title | sessions |
campaign_name-03102023 | null | 1250 |
campaign_name-03102023 | Home - Brand | 0 |
campaign_name-03102023 | Name - page | 0 |
campaign_name-03102023 | Name - page2 | 0 |
This continues for many pages related to the campaign, but all of them display 0 values for sessions.
Instead, my desire output should look like this:
campaign_name | page_title | sessions |
campaign_name-03102023 | Home - Brand | 1250 |
And it goes on as many pages related with the campaign but all of them with 0 values for sessions.
Does anyone know how to resolve this and get the session numbers for each "page_title"?
Solved! Go to Solution.
The issue is that the page_title
is associated with the page_view
event, while the session_id
is being generated only for the session_start
event. This means that for rows where page_title
is populated, session_id
will be null, unless there is a corresponding session_start
event for the same user_pseudo_id
.
The following query will join the page_view
and session_start
events based on the user_pseudo_id
to get the session count for each page_title
. The LEFT JOIN
will ensure that all page_view
events are included in the results, even if they do not have a corresponding session_start
event. The sessions
column will be 0 for page_view
events that do not have a corresponding session_start
event.
WITH date_range AS (
SELECT
"20231001" AS start_date,
"20231015" AS end_date
)
SELECT
campaign_name,
page_title,
COUNT(DISTINCT(session_id)) AS sessions
FROM (
SELECT
traffic_source.name AS campaign_name,
(
SELECT value.string_value
FROM UNNEST(event_params)
WHERE key = 'page_title'
) AS page_title,
user_pseudo_id
FROM `project.analytics_11111111.events_*` AS events
WHERE _table_suffix BETWEEN date_range.start_date AND date_range.end_date
AND event_name = 'page_view'
) AS page_views
LEFT JOIN (
SELECT
user_pseudo_id,
CONCAT(user_pseudo_id, CAST(event_timestamp AS STRING)) AS session_id
FROM `project.analytics_11111111.events_*` AS events
WHERE _table_suffix BETWEEN date_range.start_date AND date_range.end_date
AND event_name = 'session_start'
) AS sessions
ON page_views.user_pseudo_id = sessions.user_pseudo_id
WHERE campaign_name = "campaign_name-03102023"
GROUP BY campaign_name, page_title
The output of this query will be as follows:
campaign_name page_title sessions
campaign_name-03102023 Home - Brand 1250
campaign_name-03102023 Name - page 0
campaign_name-03102023 Name - page2 0
The sessions
column will show the correct number of sessions for each page_title
.
The issue is that the page_title
is associated with the page_view
event, while the session_id
is being generated only for the session_start
event. This means that for rows where page_title
is populated, session_id
will be null, unless there is a corresponding session_start
event for the same user_pseudo_id
.
The following query will join the page_view
and session_start
events based on the user_pseudo_id
to get the session count for each page_title
. The LEFT JOIN
will ensure that all page_view
events are included in the results, even if they do not have a corresponding session_start
event. The sessions
column will be 0 for page_view
events that do not have a corresponding session_start
event.
WITH date_range AS (
SELECT
"20231001" AS start_date,
"20231015" AS end_date
)
SELECT
campaign_name,
page_title,
COUNT(DISTINCT(session_id)) AS sessions
FROM (
SELECT
traffic_source.name AS campaign_name,
(
SELECT value.string_value
FROM UNNEST(event_params)
WHERE key = 'page_title'
) AS page_title,
user_pseudo_id
FROM `project.analytics_11111111.events_*` AS events
WHERE _table_suffix BETWEEN date_range.start_date AND date_range.end_date
AND event_name = 'page_view'
) AS page_views
LEFT JOIN (
SELECT
user_pseudo_id,
CONCAT(user_pseudo_id, CAST(event_timestamp AS STRING)) AS session_id
FROM `project.analytics_11111111.events_*` AS events
WHERE _table_suffix BETWEEN date_range.start_date AND date_range.end_date
AND event_name = 'session_start'
) AS sessions
ON page_views.user_pseudo_id = sessions.user_pseudo_id
WHERE campaign_name = "campaign_name-03102023"
GROUP BY campaign_name, page_title
The output of this query will be as follows:
campaign_name page_title sessions
campaign_name-03102023 Home - Brand 1250
campaign_name-03102023 Name - page 0
campaign_name-03102023 Name - page2 0
The sessions
column will show the correct number of sessions for each page_title
.
This solution did not work for me. The query does not even work out of the box. Need to join in date_range.