Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

Question about how BigQuery querying external table

Dear Sir / Madam

I got a question about how BigQuery querying external table. I do very appreciate your kind help.

As per my understanding, when we are querying external table (permanent and temporary) from GCS data files in BigQuery, it's similar to Hadoop Hive to create schema only and querying data on the data file directly. BigQuery doesn't  create an internal temp table and load data into the temp table and then just start to query.

Some of my colleagues told me that it can query data file directory in S3 bucket on AWS, but not GCP BigQuery. They said that when BigQuery querying external table from GCS, BigQuery create an internal table and load data into the temp table and then just start to query.

I am not sure which one is right and personally I disagree my colleagues' point due to that loading data from GCS to BigQuery takes time and network transferring. In addition, GCS and BigQuery are both with distribution storage. It's strange to transfer data between two distribution storage.

Thank you very much

Best regards,

0 3 1,577
3 REPLIES 3

You can query files in GCS Directly using BigQuery External table. You do not need to load data into the BQ.  Here is the link to documentation, that will help: https://cloud.google.com/bigquery/docs/external-tables

Dear daljeet-singh

Thank you very much for your reply. 

I did read the document and could not find information about creating internal temp table during querying external table (permanent and temporary). That's why I believe my colleagues misunderstand the meaning of temporary table. Temporary table does query files in GCS directly and doesn't do create an internal temporary table and load data into it. Unfortunately, I can't find GCP document mentioning the detail about how it works. 

 

Thank you again

Best regards,

Please address the "Federated queries" section here:
https://cloud.google.com/bigquery/docs/external-data-sources

and the general info on temporary and permanent tables of BigQuery: