How to get sessions with page title ga4 to bigquery

Hi there!

Right now, I'm trying to fetch the total number of sessions and the page titles from GA4 to BigQuery. Unfortunately, I can't seem to achieve the same results as what's displayed in GA4. The query I'm running is as follows:

 

WITH date_range AS (
  SELECT
    "20231001" AS start_date,
    "20231015" AS end_date
)

SELECT
 campaign_name,
 page_title,
 COUNT(DISTINCT(session_id)) AS sessions
FROM ( 
      SELECT traffic_source.name AS campaign_name,
      (
      SELECT value.string_value
      FROM UNNEST(event_params)
      WHERE event_name = 'page_view' AND key = 'page_title'
      ) AS page_title,
      CASE 
          WHEN event_name = 'session_start' then CONCAT (
               user_pseudo_id, 
                CAST(event_timestamp as string)
          )END AS session_id,
      FROM `project.analytics_11111111.events_*` AS events, date_range
      WHERE _table_suffix BETWEEN date_range.start_date AND date_range.end_date
      )
WHERE campaign_name = "campaign_name-03102023"
      AND page_title = "Home - Brand"
GROUP BY campaign_name, page_title

 

The output from this query is as follows:

 

campaign_namepage_titlesessions
campaign_name-03102023Home - Brand0

As you can see, the sessions are showing up as 0. However, if I comment out the WHERE clause for "page_title," the output looks like this:

 

campaign_namepage_titlesessions
campaign_name-03102023null1250
campaign_name-03102023Home - Brand0
campaign_name-03102023Name - page0
campaign_name-03102023Name - page20

This continues for many pages related to the campaign, but all of them display 0 values for sessions.

Instead, my desire output should look like this: 

campaign_namepage_titlesessions
campaign_name-03102023Home - Brand1250

And it goes on as many pages related with the campaign but all of them with 0 values for sessions. 

Does anyone know how to resolve this and get the session numbers for each "page_title"?

 

 

 

 

 

Solved Solved
0 2 893
1 ACCEPTED SOLUTION

 

The issue is that the page_title is associated with the page_view event, while the session_id is being generated only for the session_start event. This means that for rows where page_title is populated, session_id will be null, unless there is a corresponding session_start event for the same user_pseudo_id.

The following query will join the page_view and session_start events based on the user_pseudo_id to get the session count for each page_title. The LEFT JOIN will ensure that all page_view events are included in the results, even if they do not have a corresponding session_start event. The sessions column will be 0 for page_view events that do not have a corresponding session_start event.

WITH date_range AS (
  SELECT
    "20231001" AS start_date,
    "20231015" AS end_date
)

SELECT
  campaign_name,
  page_title,
  COUNT(DISTINCT(session_id)) AS sessions
FROM (
  SELECT
    traffic_source.name AS campaign_name,
    (
      SELECT value.string_value
      FROM UNNEST(event_params)
      WHERE key = 'page_title'
    ) AS page_title,
    user_pseudo_id
  FROM `project.analytics_11111111.events_*` AS events
  WHERE _table_suffix BETWEEN date_range.start_date AND date_range.end_date
  AND event_name = 'page_view'
) AS page_views
LEFT JOIN (
  SELECT
    user_pseudo_id,
    CONCAT(user_pseudo_id, CAST(event_timestamp AS STRING)) AS session_id
  FROM `project.analytics_11111111.events_*` AS events
  WHERE _table_suffix BETWEEN date_range.start_date AND date_range.end_date
  AND event_name = 'session_start'
) AS sessions
ON page_views.user_pseudo_id = sessions.user_pseudo_id
WHERE campaign_name = "campaign_name-03102023"
GROUP BY campaign_name, page_title

The output of this query will be as follows:

 

campaign_name	page_title	sessions
campaign_name-03102023	Home - Brand	1250
campaign_name-03102023	Name - page	0
campaign_name-03102023	Name - page2	0

The sessions column will show the correct number of sessions for each page_title.

View solution in original post

2 REPLIES 2

 

The issue is that the page_title is associated with the page_view event, while the session_id is being generated only for the session_start event. This means that for rows where page_title is populated, session_id will be null, unless there is a corresponding session_start event for the same user_pseudo_id.

The following query will join the page_view and session_start events based on the user_pseudo_id to get the session count for each page_title. The LEFT JOIN will ensure that all page_view events are included in the results, even if they do not have a corresponding session_start event. The sessions column will be 0 for page_view events that do not have a corresponding session_start event.

WITH date_range AS (
  SELECT
    "20231001" AS start_date,
    "20231015" AS end_date
)

SELECT
  campaign_name,
  page_title,
  COUNT(DISTINCT(session_id)) AS sessions
FROM (
  SELECT
    traffic_source.name AS campaign_name,
    (
      SELECT value.string_value
      FROM UNNEST(event_params)
      WHERE key = 'page_title'
    ) AS page_title,
    user_pseudo_id
  FROM `project.analytics_11111111.events_*` AS events
  WHERE _table_suffix BETWEEN date_range.start_date AND date_range.end_date
  AND event_name = 'page_view'
) AS page_views
LEFT JOIN (
  SELECT
    user_pseudo_id,
    CONCAT(user_pseudo_id, CAST(event_timestamp AS STRING)) AS session_id
  FROM `project.analytics_11111111.events_*` AS events
  WHERE _table_suffix BETWEEN date_range.start_date AND date_range.end_date
  AND event_name = 'session_start'
) AS sessions
ON page_views.user_pseudo_id = sessions.user_pseudo_id
WHERE campaign_name = "campaign_name-03102023"
GROUP BY campaign_name, page_title

The output of this query will be as follows:

 

campaign_name	page_title	sessions
campaign_name-03102023	Home - Brand	1250
campaign_name-03102023	Name - page	0
campaign_name-03102023	Name - page2	0

The sessions column will show the correct number of sessions for each page_title.

This solution did not work for me.  The query does not even work out of the box.  Need to join in date_range.