Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

BQ not showing correct data from GSC connection

I've got 2 issues right now that I'm hoping I can get some support for, both relating to Big Query integration with GSC.

First of all, I tried to update the schema for the search site table and apparently screwed it up so it won't export anymore.

Edited Schema to try and add pageEdited Schema to try and add page

The reason I did that is because I'm trying to get a report showing all pages on my site and which queries were used to search for that page. However, when I run the report, I only get http domain data (http://www.domain.com)  and nothing else. I then realized that it's not showing the https domain (https://www.domain.com

How would I go about correcting it so that it shows all pages? Is it something I need to fix with the GSC connection?

 

Solved Solved
1 3 1,864
1 ACCEPTED SOLUTION

I'm glad to hear that you've resolved the schema issue and are now seeing HTTPS data. If you're only seeing the root domain (e.g., domain.com) and not individual pages (e.g., domain.com/aboutus), there are a few potential reasons and solutions:

  • Check Your GSC Data: Before diving into BigQuery, ensure that Google Search Console itself is showing data for individual pages. If GSC isn't showing this data, then it won't be in BigQuery either.

  • Query Modification: Ensure that your BigQuery SQL query is designed to retrieve individual page data. The column that typically holds this data is named something like page or url. Your query should select this column to see individual page data.

SELECT page, COUNT(*) as total_views
FROM `your_dataset.your_table`
GROUP BY page
ORDER BY total_views DESC;

  • Data Granularity: Ensure that the data you're exporting from GSC to BigQuery is at the right granularity. GSC can provide both site-level and page-level data. Ensure that you're exporting page-level data to BigQuery.

  • Filters in GSC: If you've set up any filters in GSC, ensure they aren't excluding individual page data. Filters can be set to include/exclude specific URLs or URL patterns.

  • Data Aggregation: If you're using any aggregation in your queries, ensure that you're not unintentionally aggregating page-level data into domain-level summaries.

  • Check Schema Again: Double-check the schema of your BigQuery table to ensure that there's a field/column that should contain individual page URLs. If this column is missing or not populated, it could be the reason you're not seeing individual pages.

  • Reconnect GSC: If you've recently made changes to your GSC settings or properties, consider reconnecting GSC to BigQuery to ensure that the latest configuration and data are being exported.

View solution in original post

3 REPLIES 3

It sounds like you're facing two main issues:

  1. The schema for the search site table in BigQuery has been modified and is causing export issues.
  2. Your report is only showing data for the HTTP version of your domain and not the HTTPS version.

Here is how to address these issues step by step:

1. Fixing the Schema Issue:

  • Backup Data: Before making any changes, ensure you have a backup of your existing data in BigQuery. BigQuery charges for storage, so while backups are essential, it's also a good practice to periodically review and delete old backups that are no longer needed to manage costs.
  • Revert to Original Schema: If you have a backup of the original schema or can recall it, revert the table schema to its original state. This will ensure that the table is compatible with GSC exports. Remember, when you revert the schema to its original state, you will lose any changes that you made to it. Ensure you have a backup of the changes before reverting.
  • Reconnect GSC: Once the schema is reverted, try reconnecting GSC to BigQuery and initiate a data export. Check if the export is successful.

2. Including HTTPS Data in the Report:

  • Verify Both Properties in GSC: Ensure that both the HTTP and HTTPS versions of your domain are verified in Google Search Console.
  • Connect Both Properties to BigQuery: If both versions are verified, ensure that both are connected to BigQuery for data export. You can choose to set up a separate table for each property or merge the data into a single table. The best option will depend on your specific needs.
  • Modify Your Query: When running your report in BigQuery, ensure that your SQL query includes data from both the HTTP and HTTPS tables (if they are separate). You can use the JOIN clause to combine data from the HTTP and HTTPS tables if they have a common key, like a date or page URL. Alternatively, if the tables have the same schema, you can use the UNION ALL clause to stack the data from both tables.
  • Consolidate in GSC (Optional): To simplify things moving forward, consider setting up a Domain property in GSC. This property aggregates data across all subdomains, protocols, and paths, giving you a holistic view of your site's search data. However, if you've been tracking HTTP and HTTPS separately for a long time, switching to a Domain property will mean you start collecting data afresh for this new property. Historical data will still reside in the separate HTTP and HTTPS properties. Additionally, while the Domain property provides an aggregated view, if you need granular data for specific subdomains, you'd still benefit from having separate properties for them in GSC.

Final Thoughts:

Always monitor the data after making changes to ensure everything is working as expected. Regularly check for any discrepancies or anomalies in the data, especially after schema changes or when merging data from multiple sources. If you're not comfortable making these changes or if the issues persist, consider reaching out to a Google Cloud consultant or specialist who can provide hands-on support.

Thanks for the response. I figured out how to fix the schema, and I'm now seeing Https data. However I'm not seeing individual pages (IE domain.com/aboutus or domain.com/careers), only domain.com

What do I need to do to be able to analyze the individual pages?

I'm glad to hear that you've resolved the schema issue and are now seeing HTTPS data. If you're only seeing the root domain (e.g., domain.com) and not individual pages (e.g., domain.com/aboutus), there are a few potential reasons and solutions:

  • Check Your GSC Data: Before diving into BigQuery, ensure that Google Search Console itself is showing data for individual pages. If GSC isn't showing this data, then it won't be in BigQuery either.

  • Query Modification: Ensure that your BigQuery SQL query is designed to retrieve individual page data. The column that typically holds this data is named something like page or url. Your query should select this column to see individual page data.

SELECT page, COUNT(*) as total_views
FROM `your_dataset.your_table`
GROUP BY page
ORDER BY total_views DESC;

  • Data Granularity: Ensure that the data you're exporting from GSC to BigQuery is at the right granularity. GSC can provide both site-level and page-level data. Ensure that you're exporting page-level data to BigQuery.

  • Filters in GSC: If you've set up any filters in GSC, ensure they aren't excluding individual page data. Filters can be set to include/exclude specific URLs or URL patterns.

  • Data Aggregation: If you're using any aggregation in your queries, ensure that you're not unintentionally aggregating page-level data into domain-level summaries.

  • Check Schema Again: Double-check the schema of your BigQuery table to ensure that there's a field/column that should contain individual page URLs. If this column is missing or not populated, it could be the reason you're not seeing individual pages.

  • Reconnect GSC: If you've recently made changes to your GSC settings or properties, consider reconnecting GSC to BigQuery to ensure that the latest configuration and data are being exported.