Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

BigQuery - Source Platform GA4 Dimension

I am trying to figure out how to get the GA4 dimension, "source platform" in my Big Query exports. Upon my research it seems that source platform may not have a direct corresponding field in BigQuery export schema, and want to know how to derive it. 

 

0 5 838
5 REPLIES 5

Hi,

In the GA4 accounts I manage, I’ve observed that the "source platform" dimension is populated only with Google Ads and Manual values. I recommend using the following query to accurately derive this dimension:

CASE
WHEN session_traffic_source_last_click.google_ads_campaign.customer_id IS NOT NULL THEN 'Google Ads'
WHEN session_traffic_source_last_click.manual_campaign.source_platform IS NULL THEN 'Manual'
ELSE session_traffic_source_last_click.manual_campaign.source_platform
END AS session_source_platform

Regards,
Sinan

Thanks Sinan for your response. We used this query and were able to get some data in.

Do you happen to know how to pull in Session Default Channel Group data? From my understanding we can't just get this directly into our reports like we used to with UA. And now have to derive it from other dimensions such as 

  • source
  • campaign
  • medium
  • source category (derived from source)
  • source platform (derived using the logic you posted)

Do you know of a quicker way to derive Session Default Channel Group?

If you're looking to pull traffic source data according to the Last Non-Direct Attribution model, you can use the following query:

 

WITH
  traffic_sources_last_non_direct AS (
  SELECT
    event_date,
    CONCAT( user_pseudo_id, '_', (
      SELECT
        COALESCE( value.int_value )
      FROM
        UNNEST (event_params)
      WHERE
        `key` = 'ga_session_id' ) ) AS session_id,
    IFNULL(MAX( (
        SELECT
          COALESCE( value.int_value, CAST(value.string_value AS INT64), CAST(value.double_value AS INT64), CAST(value.float_value AS INT64) )
        FROM
          UNNEST (event_params)
        WHERE
          `key` = 'session_engaged' ) ), 0) AS session_engaged,
    stream_id,
    ARRAY_AGG(session_traffic_source_last_click
    ORDER BY
      event_timestamp ASC
    LIMIT
      1) [SAFE_OFFSET(0)] AS session_traffic_source_last_click
  FROM
    `__PROJECT_ID_DATASET_ID_TABLE_ID__`
  GROUP BY
    event_date,
    session_id,
    stream_id )
SELECT
  PARSE_DATE('%Y%m%d', event_date) AS date,
  CAST(session_id AS STRING) AS session_id,
  CAST(stream_id AS STRING) AS stream_id,
  IFNULL(COALESCE(session_traffic_source_last_click.google_ads_campaign.campaign_id, session_traffic_source_last_click.manual_campaign.campaign_id), '(not set)') AS session_campaign_id,
  IFNULL(COALESCE(session_traffic_source_last_click.google_ads_campaign.campaign_name, session_traffic_source_last_click.manual_campaign.campaign_name), '(not set)') AS session_campaign_name,
  CASE
    WHEN session_traffic_source_last_click.google_ads_campaign.customer_id IS NOT NULL THEN 'google'
    WHEN session_traffic_source_last_click.manual_campaign.source = '(not set)' THEN NULL
    ELSE session_traffic_source_last_click.manual_campaign.source
END
  AS session_source,
  CASE
    WHEN session_traffic_source_last_click.google_ads_campaign.customer_id IS NOT NULL THEN 'cpc'
    WHEN session_traffic_source_last_click.manual_campaign.medium = '(not set)' THEN NULL
    ELSE session_traffic_source_last_click.manual_campaign.medium
END
  AS session_medium,
  IFNULL(session_traffic_source_last_click.manual_campaign.term, '(not set)') AS session_term,
  IFNULL(session_traffic_source_last_click.manual_campaign.content, '(not set)') AS session_content,
  CASE
    WHEN session_traffic_source_last_click.google_ads_campaign.customer_id IS NOT NULL THEN 'google-ads'
    WHEN session_traffic_source_last_click.manual_campaign.source_platform IS NULL THEN 'manual'
    ELSE session_traffic_source_last_click.manual_campaign.source_platform
END
  AS session_source_platform,
  IFNULL(session_traffic_source_last_click.manual_campaign.creative_format, '(not set)') AS session_creative_format,
  IFNULL(session_traffic_source_last_click.manual_campaign.marketing_tactic, '(not set)') AS session_marketing_tactic,
  IFNULL(session_traffic_source_last_click.google_ads_campaign.customer_id, '(not set)') AS session_customer_id,
  IFNULL(session_traffic_source_last_click.google_ads_campaign.account_name, '(not set)') AS session_account_name,
  CAST(session_engaged AS STRING) AS session_engaged
FROM
  traffic_sources_last_non_direct

You can modify the query to suit your specific needs.

Regards,
Sinan

Additionally, I’ve found that the tool available at GA4 SQL Tool can be a good starting point, although it doesn’t always provide 100% accurate results. You may want to double-check the queries generated by this tool and adjust them based on your specific setup.

Hi @Kelsiehall 

Solution Approach:

While "source platform" isn't directly available as a specific field in the BigQuery export schema for GA4, you can derive it using related dimensions and fields from your exported data.

Step-by-Step Process to Derive "Source Platform":

  1. Check for Similar Dimensions: GA4 sends detailed event data to BigQuery, and some fields that could help identify the "source platform" include:

    • traffic_source.name: Represents the name of the source that sent the traffic (e.g., "google").
    • traffic_source.medium: Refers to the type of traffic, such as "organic" or "paid".
    • traffic_source.source: Identifies the source, which could indicate the platform in some cases.
  2. Use the device Table for Platform Information:

    • The device table within BigQuery includes device-level data that can give hints about the platform:
      • device.operating_system: Gives the operating system of the user, such as "iOS", "Android", or "Windows".
      • device.category: Differentiates between "mobile", "desktop", or "tablet".
    • Combining these device metrics with the traffic_source fields can help narrow down the platform source.
  3. Create a Derived Field: You can use SQL to create a derived field that approximates the "source platform." For example:

     
    SELECT traffic_source.name, traffic_source.medium, device.category, device.operating_system, CASE WHEN device.operating_system = "iOS" OR device.operating_system = "Android" THEN "Mobile App" WHEN device.category = "desktop" THEN "Web" ELSE "Unknown" END AS source_platform FROM `your-project.your-dataset.events_*`

    This query looks at the operating system and device category to approximate whether the user came from a mobile app or web platform.

  4. Analyze UTM Parameters (if applicable): If you are tagging your campaigns with UTM parameters, you can use the traffic_source fields in conjunction with UTM parameters to get a clearer understanding of the platform. For instance:

    • utm_source: Could give clues about whether the traffic came from a specific platform or app.
    • utm_medium: Could also help identify whether it's app-based traffic or web-based traffic.

Additional Considerations:

If this approach doesn't fully meet your needs, consider exploring tools like Windsor.ai, which could offer deeper integrations for data transformation and enrichment. Windsor.ai allows you to combine multiple sources of data into a more refined structure, which might help in extracting and organizing the "source platform" information more seamlessly.

Hope this helps!