
pageview event in BQ export & GA4

Hello,

I work in a company supporting data science initiatives. Yesterday, I aggregated the page views (PVs) for the page path and screen class /subscription/setting-completed in GA4 for the period from July 1st to July 31st, and confirmed that the unique user count was 24,000.

At the same time, I extracted the number of user_pseudo_ids for the page_view event in BigQuery for the same period (July 1st to July 31st) using the page_location key corresponding to the above page path. However, the number was only 14,000, which is significantly lower than the user count in GA4. (Since we haven't implemented user_id yet, I used user_pseudo_id for the extraction.)

Here is the query I used to extract the data (I have omitted the dataset name due to company confidentiality).

[Image: screenshot of the query, chihirouchida_0-1724163515964.png]

I understand that the number of user_pseudo_ids is based on the devices and browsers used by the service users. However, if that’s the case, the number should be higher than the one in GA4.

I also came across a reference suggesting that the GA4 user count is based on active users. If that were true, I would expect the GA4 number to be smaller.

With my limited knowledge, I'm struggling to understand why this discrepancy occurred. If anyone with experience in this area could lend a hand, I would greatly appreciate it.

For context, the site is built using a React-based environment.

2 REPLIES

Hi @chihirouchida 

This is a common issue when comparing user counts between GA4 and BigQuery.

Here are some possible reasons:

  • User count in GA4 versus BigQuery: GA4 reports “active users”, meaning users who were active (for example, had an engaged session) during the specified time frame. A user who visits your page multiple times is still counted as one user.
  • Pseudo IDs in BigQuery: counting user_pseudo_ids counts each unique identifier that triggered the filtered event in the given period. Filtering on page_view can also produce a lower count if a user never triggered that event on the page.
  • Ensure that your tracking implementation is properly configured, as page views may not be recorded as intended due to issues such as users navigating away from the page before the event triggers, page view events not being sent during specific interactions, or certain events being filtered out in BigQuery.
  • GA4 can deduplicate users across sessions and, depending on the reporting identity, across devices. A user who accesses your site from multiple devices or browsers gets a separate user_pseudo_id for each one and is counted multiple times in BigQuery, whereas GA4 can merge them only when an identifier such as User-ID is available.
  • Also, if users leave the page quickly, or if your application is a single-page application that does not consistently trigger page_view events, your counts can be affected.
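
To make the counting differences above concrete, here is a minimal Python sketch over a handful of made-up rows. The rows and the `distinct_users` helper are hypothetical, and treating `engagement_time_msec > 0` as "active" is only a rough stand-in for GA4's actual active-user logic, which may differ.

```python
# Hypothetical rows, loosely shaped like flattened GA4 export events.
events = [
    {"user_pseudo_id": "A", "event_name": "page_view", "engagement_time_msec": 1200},
    {"user_pseudo_id": "A", "event_name": "page_view", "engagement_time_msec": 300},
    {"user_pseudo_id": "B", "event_name": "page_view", "engagement_time_msec": 0},
    {"user_pseudo_id": "C", "event_name": "scroll", "engagement_time_msec": 800},
]


def distinct_users(rows, predicate=lambda row: True):
    """Return the set of user_pseudo_id values whose rows satisfy predicate."""
    return {row["user_pseudo_id"] for row in rows if predicate(row)}


# Every identifier seen, regardless of event type.
all_users = distinct_users(events)
# Only identifiers that fired page_view; user A's two views count once.
page_view_users = distinct_users(events, lambda r: r["event_name"] == "page_view")
# Assumption: positive engagement time as a rough proxy for "active".
engaged_users = distinct_users(events, lambda r: r["engagement_time_msec"] > 0)

print(len(all_users), len(page_view_users), len(engaged_users))  # prints 3 2 2
```

User C never fired page_view, so a page_view filter drops them even though they were engaged, which is one way the two numbers can drift apart.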

See the documentation on how to set up BigQuery Export. You can also reach out to Google Analytics support for more assistance.

I hope the above information is helpful.

Hi @chihirouchida, this is an interesting case, and I understand how discrepancies between GA4 and BigQuery data can be confusing. Some of these ideas might help:

1. Differences in Metrics

  • GA4's "Active Users": This metric is based on engagement signals such as engaged sessions, app focus, or activity during a session, so it is not simply a raw count of user_pseudo_id values.
  • BigQuery's user_pseudo_id: This represents unique device or browser identifiers. Since it’s cookie- or device-based, the count goes up when users clear cookies or switch devices, and goes down when users block tracking.

2. Sampling or Missing Data in BigQuery

  • The BigQuery export itself is not sampled, but if it isn’t capturing every page_view event, for example because of export limits or a failed export, you could see lower numbers compared to GA4.
  • Check the "rows processed" in BigQuery to make sure all the data is there. Also, confirm that the GA4 export to BigQuery is complete for the selected time range.

3. Filter Differences

  • Double-check filters applied in GA4 versus your BigQuery query. For example:
    • In GA4, the metric might include users who interacted with the /subscription/setting-completed path indirectly (e.g., across multiple sessions or devices).
    • Your BigQuery query might only be pulling direct page_view events linked to user_pseudo_id.
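
One filter difference worth ruling out is how the URL is matched. The sketch below is hypothetical (the example.com URLs and the `matches_path` helper are made up for illustration) and assumes the query compares page_location, which holds the full URL, while the GA4 report groups by path only. Under that assumption, an exact string match would miss URLs with a query string or trailing slash.

```python
from urllib.parse import urlsplit

TARGET_PATH = "/subscription/setting-completed"

# Hypothetical page_view rows; page_location holds the full URL.
rows = [
    {"user_pseudo_id": "A", "page_location": "https://example.com/subscription/setting-completed"},
    {"user_pseudo_id": "B", "page_location": "https://example.com/subscription/setting-completed?plan=pro"},
    {"user_pseudo_id": "C", "page_location": "https://example.com/subscription/setting-completed/"},
    {"user_pseudo_id": "D", "page_location": "https://example.com/subscription/settings"},
]


def matches_path(page_location, target_path):
    """Compare only the path component, ignoring query string and trailing slash."""
    return urlsplit(page_location).path.rstrip("/") == target_path


# Exact match on the full URL string.
exact = {
    r["user_pseudo_id"]
    for r in rows
    if r["page_location"] == "https://example.com" + TARGET_PATH
}
# Match on the normalized path only.
by_path = {
    r["user_pseudo_id"]
    for r in rows
    if matches_path(r["page_location"], TARGET_PATH)
}

print(len(exact), len(by_path))  # prints 1 3
```

If your query uses an equality check on page_location, comparing its result against a path-only match is a quick way to see whether URL variants explain part of the gap.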

4. Single-Page Applications (React)

  • If you’re working with a React-based application (or any single-page app), page views might be tracked differently. Make sure your GA4 implementation is properly set up to handle virtual page views, either with gtag.js or Google Tag Manager. Any tracking issues here could explain data mismatches.

5. Exploring Other Tools

  • If the discrepancies persist, consider using a third-party tool like Windsor.ai. They offer direct GA4 integrations and can help you create unified views across systems, making it easier to compare data and spot inconsistencies.

Next Steps

  • Verify that all data is being fully exported to BigQuery.
  • Check if any filters or exclusions in GA4 might be affecting the user count.
  • Try adding breakdowns like session counts to your query for deeper insights into user behavior.
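
As a rough illustration of the last two checks, the Python sketch below breaks made-up rows down by day, counting distinct users and distinct sessions. The rows and the `daily_breakdown` helper are hypothetical; it assumes each row carries event_date as a YYYYMMDD string and a ga_session_id value already extracted from the event parameters. A day missing from the output, or a day with unusually low counts, would hint at an incomplete export.

```python
from collections import defaultdict

# Hypothetical rows; ga_session_id is assumed to be already extracted.
rows = [
    {"event_date": "20240701", "user_pseudo_id": "A", "ga_session_id": 1},
    {"event_date": "20240701", "user_pseudo_id": "A", "ga_session_id": 1},
    {"event_date": "20240701", "user_pseudo_id": "B", "ga_session_id": 7},
    {"event_date": "20240702", "user_pseudo_id": "A", "ga_session_id": 2},
]


def daily_breakdown(rows):
    """Count distinct users and distinct (user, session) pairs per day."""
    users = defaultdict(set)
    sessions = defaultdict(set)
    for row in rows:
        day = row["event_date"]
        users[day].add(row["user_pseudo_id"])
        # A session is identified by the user together with its session id.
        sessions[day].add((row["user_pseudo_id"], row["ga_session_id"]))
    return {
        day: {"users": len(users[day]), "sessions": len(sessions[day])}
        for day in sorted(users)
    }


breakdown = daily_breakdown(rows)
for day, counts in breakdown.items():
    print(day, counts)
```

The same grouping can be expressed in SQL against the export tables; the point here is only the shape of the breakdown.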

Hope this can be helpful!
