GSC Data Exports: BigQuery Connector vs. GSC API

Hi all & @ms4446 who likely can help here 🙂 

I recently compared a data export from the GSC API to the data exported via the direct BigQuery connector and saw discrepancies.

 I was wondering if this is due to the same reasons described in this post: https://www.googlecloudcommunity.com/gc/Data-Analytics/Discrepancy-between-BQ-data-from-GSC-and-data...

I read that the sampling in the GSC API can be partially circumvented to get more data by using multiple properties: https://similar.ai/blog/closing-google-search-console-sampling-gap/

Can we assume that the BigQuery connector does give the full picture of all the data that would be 'sampled out' by the API connection? 

Thanks already for any insights


Hi @joma ,

Indeed, the link you referenced sheds light on potential reasons for discrepancies between BigQuery and the GSC UI, which likely account for the differences you've noticed. Here's an overview:

  • Sampling: The GSC UI may resort to sampling for large datasets to enhance performance, resulting in estimated data. Conversely, BigQuery is engineered for comprehensive analysis and typically bypasses sampling, offering a fuller dataset.

  • Aggregation Differences: Variations in how data is aggregated (e.g., daily vs. weekly) between the GSC UI and BigQuery can lead to discrepancies in data presentation.

  • Filter Inconsistencies: To ensure accurate comparisons, it's crucial to apply identical filters (such as date range, country, etc.) across both platforms.

  • Data Freshness: There might be a slight lag in data availability in BigQuery compared to the real-time data in the GSC UI.

  • Data Granularity: BigQuery's tendency to provide more detailed data can influence the perception of trends differently than the summarized data in the GSC UI.

Overall, the BigQuery connector typically offers a more comprehensive and unsampled view of your search data. However, for meaningful comparisons, aligning filters, understanding aggregation differences, and considering potential delays in data freshness are essential steps.
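To make the comparison concrete, here is a minimal sketch in Python of lining up the two exports by day. It assumes both exports have already been loaded into lists of dictionaries with the same filters applied; the function names and the `date` key for API rows are hypothetical, and `data_date` is assumed to be the date column of the BigQuery export. Only days present in both sources are compared, so a lag in one export does not show up as a false gap.

```python
from collections import defaultdict


def daily_totals(rows, date_key, metric="clicks"):
    """Sum one metric per day from a list of row dictionaries."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[date_key]] += row[metric]
    return dict(totals)


def compare_sources(api_rows, bq_rows, metric="clicks"):
    """Compare per-day totals for days that exist in BOTH sources.

    Assumption: API rows carry the day under "date" and BigQuery rows
    under "data_date"; adjust the keys to match your own exports.
    """
    api = daily_totals(api_rows, "date", metric)
    bq = daily_totals(bq_rows, "data_date", metric)
    report = {}
    # Days missing from either side (e.g. due to data freshness) are skipped.
    for day in sorted(api.keys() & bq.keys()):
        gap = bq[day] - api[day]
        gap_share = gap / bq[day] if bq[day] else 0.0
        report[day] = {
            "api": api[day],
            "bigquery": bq[day],
            "gap": gap,
            "gap_share": gap_share,
        }
    return report
```

A consistently positive `gap` on overlapping days would suggest rows that the API export is missing, while gaps that appear only on the most recent days point more towards data freshness.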

 

Hi @joma, yes, discrepancies between the Google Search Console (GSC) API and the BigQuery connector are quite common due to differences in data aggregation and sampling methods.

Key Differences:

  • GSC API Sampling

    • The GSC API may apply sampling, especially for large datasets.
    • As you mentioned, using multiple properties can help minimize the impact of sampling.
  • BigQuery Connector

    • The BigQuery export typically provides a more complete dataset since it pulls data directly from Google’s backend, avoiding some API limitations.
    • However, discrepancies can still occur due to differences in update frequency and how the data is processed before being stored in BigQuery.
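As a starting point for checking the BigQuery side, the sketch below builds a query for daily totals that can be set against the same date range pulled from the API. It assumes the export uses the default bulk export table `searchdata_site_impression` with `data_date`, `clicks`, and `impressions` columns; the helper name and the project and dataset values are placeholders, so adjust them to your own setup.

```python
from datetime import date


def build_daily_totals_sql(project, dataset, start_date, end_date):
    """Return a SQL string for per-day clicks and impressions.

    Assumption: the export table is named `searchdata_site_impression`
    and has `data_date`, `clicks`, and `impressions` columns.
    """
    # Validate the dates so only ISO-formatted values reach the query text.
    start = date.fromisoformat(start_date).isoformat()
    end = date.fromisoformat(end_date).isoformat()
    table = f"`{project}.{dataset}.searchdata_site_impression`"
    return (
        "SELECT data_date, SUM(clicks) AS clicks, "
        "SUM(impressions) AS impressions\n"
        f"FROM {table}\n"
        f"WHERE data_date BETWEEN '{start}' AND '{end}'\n"
        "GROUP BY data_date\n"
        "ORDER BY data_date"
    )
```

For example, `build_daily_totals_sql("my-project", "searchconsole", "2024-01-01", "2024-01-31")` returns a query string that can be pasted into the BigQuery console or passed to a client library.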

Alternative Approach

If you’re looking for a more efficient way to consolidate GSC data with other marketing sources like Google Ads, GA4, or LinkedIn Ads, Windsor.ai offers an integrated connector that simplifies data unification in BigQuery, Power BI, or Google Sheets, reducing the need for manual adjustments.

Hope this helps.