
BigQuery Storage

In one of our GCP projects, GCS is used to store application logs that are subsequently used in BigQuery. We would like to know whether deleting logs from the bucket would affect BigQuery's ability to return data it has already used. Specifically, does BigQuery keep the data in its own storage, so that previously built reports still return results regardless of which files remain in GCS?

The formal answer is "it depends", so let's try to break it apart. Let's start by assuming that you have a data object in GCS, say a CSV file. If you LOAD that data into a BigQuery table, BigQuery copies the data into its own storage. If you then delete the object in GCS, you can still query the table that was populated from it, because a copy of the data lives in BigQuery storage.
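To make the load case concrete, here is a minimal Python sketch that only builds the text of a BigQuery `LOAD DATA` statement. It does not connect to BigQuery. The helper name `build_load_statement` and the dataset, table, and bucket names are hypothetical placeholders, not anything from your project.

```python
def build_load_statement(table: str, gcs_uri: str, file_format: str = "CSV") -> str:
    """Return a LOAD DATA statement that copies a GCS object into BigQuery storage."""
    return (
        f"LOAD DATA INTO `{table}`\n"
        f"FROM FILES (\n"
        f"  format = '{file_format}',\n"
        f"  uris = ['{gcs_uri}']\n"
        f");"
    )


# Hypothetical table and object names, for illustration only.
load_sql = build_load_statement(
    "my_dataset.app_logs",
    "gs://my-log-bucket/logs/2024-01-01.csv",
)
print(load_sql)
```

Once a statement like this has run, the rows live in the BigQuery table itself, so the CSV object in the bucket is no longer needed to query them.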

However, loading the data into BigQuery is not the only thing you can do with GCS data. You can instead create a BigQuery external table. This creates a BQ table whose data is NOT copied into BigQuery storage; the table's data is the actual data in GCS. When you query the table in BQ, BQ scans the data in GCS. If you delete the GCS object, you remove the data behind the table and will no longer be able to query that data through the table.
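For contrast, an external table over GCS data could be defined roughly as follows. Again this is only a sketch that builds the DDL text in Python. The helper name `build_external_table_statement` and all dataset, table, and bucket names are hypothetical.

```python
def build_external_table_statement(
    table: str, gcs_uri: str, file_format: str = "CSV"
) -> str:
    """Return a CREATE EXTERNAL TABLE statement whose data stays in GCS."""
    return (
        f"CREATE EXTERNAL TABLE `{table}`\n"
        f"OPTIONS (\n"
        f"  format = '{file_format}',\n"
        f"  uris = ['{gcs_uri}']\n"
        f");"
    )


# Hypothetical names; the wildcard matches every CSV object under logs/.
external_sql = build_external_table_statement(
    "my_dataset.app_logs_external",
    "gs://my-log-bucket/logs/*.csv",
)
print(external_sql)
```

Nothing is copied by a statement like this. The table is just a pointer to the listed URIs, which is why deleting the objects takes the data away from the table as well.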

So it would be wrong of me to simply say "No, deleting the GCS object will not affect your ability to query the BQ table". While loaded tables are likely the configuration you have, it is possible that you are using external tables.
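One way to find out which case applies is to look at the `table_type` column of the dataset's `INFORMATION_SCHEMA.TABLES` view, where external tables are reported as `EXTERNAL` and ordinary tables as `BASE TABLE`. The sketch below builds such a query and classifies a returned value. The helper names and the dataset name `my_dataset` are hypothetical, and you would still need to run the query yourself in the console or with a client library.

```python
def build_table_type_query(dataset: str) -> str:
    """Return a query listing each table in the dataset with its table_type."""
    return (
        "SELECT table_name, table_type\n"
        f"FROM {dataset}.INFORMATION_SCHEMA.TABLES\n"
        "ORDER BY table_name;"
    )


def reads_from_outside_bigquery(table_type: str) -> bool:
    """True when the reported table_type is EXTERNAL (data held outside BigQuery storage)."""
    return table_type.strip().upper() == "EXTERNAL"


# Hypothetical dataset name.
query = build_table_type_query("my_dataset")
print(query)
```

Any table that comes back as `EXTERNAL` should be checked before you clean up the bucket, since its data may be the very objects you are about to delete.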