I am trying to figure out how to get the GA4 dimension, "source platform" in my Big Query exports. Upon my research it seems that source platform may not have a direct corresponding field in BigQuery export schema, and want to know how to derive it.
Hi,
In the GA4 accounts I manage, I’ve observed that the "source platform" dimension is populated only with Google Ads and Manual values. I recommend using the following query to accurately derive this dimension:
CASE
WHEN session_traffic_source_last_click.google_ads_campaign.customer_id IS NOT NULL THEN 'Google Ads'
WHEN session_traffic_source_last_click.manual_campaign.source_platform IS NULL THEN 'Manual'
ELSE session_traffic_source_last_click.manual_campaign.source_platform
END AS session_source_platform
Regards,
Sinan
Thanks Sinan for your response. We used this query and were able to get some data in.
Do you happen to know how to pull in Session Default Channel Group data? From my understanding we can't just get this directly into our reports like we used to with UA. And now have to derive it from other dimensions such as
Do you know of a quicker way to derive Session Default Channel Group?
If you're looking to pull traffic source data according to the Last Non-Direct Attribution model, you can use the following query:
WITH
traffic_sources_last_non_direct AS (
SELECT
event_date,
CONCAT( user_pseudo_id, '_', (
SELECT
COALESCE( value.int_value )
FROM
UNNEST (event_params)
WHERE
`key` = 'ga_session_id' ) ) AS session_id,
IFNULL(MAX( (
SELECT
COALESCE( value.int_value, CAST(value.string_value AS INT64), CAST(value.double_value AS INT64), CAST(value.float_value AS INT64) )
FROM
UNNEST (event_params)
WHERE
`key` = 'session_engaged' ) ), 0) AS session_engaged,
stream_id,
ARRAY_AGG(session_traffic_source_last_click
ORDER BY
event_timestamp ASC
LIMIT
1) [SAFE_OFFSET(0)] AS session_traffic_source_last_click
FROM
`__PROJECT_ID_DATASET_ID_TABLE_ID__`
GROUP BY
event_date,
session_id,
stream_id )
SELECT
PARSE_DATE('%Y%m%d', event_date) AS date,
CAST(session_id AS STRING) AS session_id,
CAST(stream_id AS STRING) AS stream_id,
IFNULL(COALESCE(session_traffic_source_last_click.google_ads_campaign.campaign_id, session_traffic_source_last_click.manual_campaign.campaign_id), '(not set)') AS session_campaign_id,
IFNULL(COALESCE(session_traffic_source_last_click.google_ads_campaign.campaign_name, session_traffic_source_last_click.manual_campaign.campaign_name), '(not set)') AS session_campaign_name,
CASE
WHEN session_traffic_source_last_click.google_ads_campaign.customer_id IS NOT NULL THEN 'google'
WHEN session_traffic_source_last_click.manual_campaign.source = '(not set)' THEN NULL
ELSE session_traffic_source_last_click.manual_campaign.source
END
AS session_source,
CASE
WHEN session_traffic_source_last_click.google_ads_campaign.customer_id IS NOT NULL THEN 'cpc'
WHEN session_traffic_source_last_click.manual_campaign.medium = '(not set)' THEN NULL
ELSE session_traffic_source_last_click.manual_campaign.medium
END
AS session_medium,
IFNULL(session_traffic_source_last_click.manual_campaign.term, '(not set)') AS session_term,
IFNULL(session_traffic_source_last_click.manual_campaign.content, '(not set)') AS session_content,
CASE
WHEN session_traffic_source_last_click.google_ads_campaign.customer_id IS NOT NULL THEN 'google-ads'
WHEN session_traffic_source_last_click.manual_campaign.source_platform IS NULL THEN 'manual'
ELSE session_traffic_source_last_click.manual_campaign.source_platform
END
AS session_source_platform,
IFNULL(session_traffic_source_last_click.manual_campaign.creative_format, '(not set)') AS session_creative_format,
IFNULL(session_traffic_source_last_click.manual_campaign.marketing_tactic, '(not set)') AS session_marketing_tactic,
IFNULL(session_traffic_source_last_click.google_ads_campaign.customer_id, '(not set)') AS session_customer_id,
IFNULL(session_traffic_source_last_click.google_ads_campaign.account_name, '(not set)') AS session_account_name,
CAST(session_engaged AS STRING) AS session_engaged
FROM
traffic_sources_last_non_direct
You can modify the query to suit your specific needs.
Regards,
Sinan
Additionally, I’ve found that the tool available at GA4 SQL Tool can be a good starting point, although it doesn’t always provide 100% accurate results. You may want to double-check the queries generated by this tool and adjust them based on your specific setup.
Hi @Kelsiehall
Solution Approach:
While "source platform" isn't directly available as a specific field in the BigQuery export schema for GA4, you can derive it using related dimensions and fields from your exported data.
Check for Similar Dimensions: GA4 sends detailed event data to BigQuery, and some fields that could help identify the "source platform" include:
Use the device Table for Platform Information:
Create a Derived Field: You can use SQL to create a derived field that approximates the "source platform." For example:
This query looks at the operating system and device category to approximate whether the user came from a mobile app or web platform.
Analyze UTM Parameters (if applicable): If you are tagging your campaigns with UTM parameters, you can use the traffic_source fields in conjunction with UTM parameters to get a clearer understanding of the platform. For instance:
If this approach doesn't fully meet your needs, consider exploring tools like Windsor.ai, which could offer deeper integrations for data transformation and enrichment. Windsor.ai allows you to combine multiple sources of data into a more refined structure, which might help in extracting and organizing the "source platform" information more seamlessly.
Hope this helps!