How can I analyse expenses in BigQuery and how does Looker Studio works behind the scene?

Hello GCP community,

I use BigQuery as a data warehouse and Looker Studio as a reporting tool for showing dashboards. One of the dashboards is very complex, but I managed to make it using a custom SQL query and parameter field. 

It is unclear to me how all this work behind the scene. Mostly I am concerned with how much each dashboard is going to cost. Unfortunately, I couldn't find answers in the documentation (neither in BigQuery nor Looker Studio).

The question that I have are:

  1. How can I see how much each dashboard is going to cost? (e.g. for each dashboard I would like to see how much byte is being processed/billed)
  2. Is there a way to find out which jobs were created in BigQuery by 3rd party apps (like Looker Studio)? (My thinking is If I can know this information then I could see what query was executed and how frequently for each dashboard)
  3. How does Looker Studio fetch data from Big Query? How does that flow work? (Here I am most interested in this flow, and how this works under the hood)
    1. Does it fetch all data at once and cache somewhere it or does it fetch data on every change (control field)?
    2. If Looker Studio fetches all data at once, how does it work with custom parameters in a custom query? Is Looker copying the table, and then looker studio runs a query against the copied table, or is it some other way?

I would really appreciate answers or leads where I could learn this. 

Thank you for your help

Solved Solved
1 2 7,652
1 ACCEPTED SOLUTION

Hi @josipsha,

Welcome back to Google Cloud Community.

Here are a few responses to your queries:

  1. How can I see how much each dashboard is going to cost? (e.g. for each dashboard I would like to see how much byte is being processed/billed)

    - The Query History page in BigQuery allows you to view the amount of data processed and charged for each query. All of the queries that were executed on your project, including those that were started by other programs like Looker Studio, are listed on the Query History page. To see the cost connected with each inquiry, you can filter the results by user, date range, and other criteria.

    - Go to the BigQuery console, choose your project, and then click the "Query History" tab to visit the Query History page. There, you may view information about each query, such as the bytes processed, the length of the query, and cost.

  2. Is there a way to find out which jobs were created in BigQuery by 3rd party apps (like Looker Studio)? (My thinking is If I can know this information then I could see what query was executed and how frequently for each dashboard)

    - Go to the BigQuery console, choose your project, and then click the "Audit logs" option to examine the audit logs. To view the activity connected to third-party apps, you may then filter the results by user, date period, and other criteria.
  3. How does Looker Studio fetch data from Big Query? How does that flow work? (Here I am most interested in this flow, and how this works under the hood)

    - When you create a dashboard in Looker Studio, Looker Studio automatically creates a SQL query based on the visualization parameters and sends it to BigQuery via the API. Following the query's processing by BigQuery, Looker Studio uses those results to create the dashboard.

    Usually, the flow goes like this:
    -A dashboard in Looker Studio is opened by the user.
    -Through the API, Looker Studio sends a SQL query to BigQuery.
    -The query is processed by BigQuery, who then sends Looker Studio the results.
    -The results are used by Looker Studio to render the dashboard.
  1. Does it fetch all data at once and cache somewhere it or does it fetch data on every change (control field)?

    - Looker Studio does not cache or download all the data at once. Instead, as the user interacts with the dashboard, it fetches the information as needed. When a control field is changed by the user, Looker Studio uses the API to submit a fresh SQL query to BigQuery to retrieve the updated data. Every time the user makes a change to the dashboard, this process is repeated.

  2. How does Looker Studio handle custom parameters in a custom query if it receives all the data at once? Looker Studio executes a query against the duplicated table after Looker Studio copies the table, or does it work in another way?

    - Looker Studio does not download all the data at once, as was already described. To obtain the data on demand, it instead uses the API to submit SQL queries to BigQuery. Looker Studio incorporates the custom parameters in the SQL query that it sends to BigQuery when you use custom parameters in a custom query. Following the query's processing by BigQuery, Looker Studio receives the results and uses them to render the dashboard.

    For custom queries, Looker Studio does not duplicate the table or make a temporary table. Simply said, it sends BigQuery a SQL query that contains the custom parameters.

Here are some documentations that might help you:

BigQuery as a data warehouse
https://cloud.google.com/bigquery?_ga=2.144136053.-1150969476.1678208220#section-3

Looker Studio as a reporting tool 
https://cloud.google.com/looker-studio?_ga=2.179270500.-1150969476.1678208220#section-4
https://cloud.google.com/bigquery/docs/visualize-looker-studio?_ga=2.144136053.-1150969476.167820822...
https://cloud.google.com/bigquery/docs/bi-engine-looker-studio?_ga=2.144136053.-1150969476.167820822...
https://cloud.google.com/blog/products/data-analytics/looker-and-data-studio-integrate-for-best-of-b...





View solution in original post

2 REPLIES 2

Hi @josipsha,

Welcome back to Google Cloud Community.

Here are a few responses to your queries:

  1. How can I see how much each dashboard is going to cost? (e.g. for each dashboard I would like to see how much byte is being processed/billed)

    - The Query History page in BigQuery allows you to view the amount of data processed and charged for each query. All of the queries that were executed on your project, including those that were started by other programs like Looker Studio, are listed on the Query History page. To see the cost connected with each inquiry, you can filter the results by user, date range, and other criteria.

    - Go to the BigQuery console, choose your project, and then click the "Query History" tab to visit the Query History page. There, you may view information about each query, such as the bytes processed, the length of the query, and cost.

  2. Is there a way to find out which jobs were created in BigQuery by 3rd party apps (like Looker Studio)? (My thinking is If I can know this information then I could see what query was executed and how frequently for each dashboard)

    - Go to the BigQuery console, choose your project, and then click the "Audit logs" option to examine the audit logs. To view the activity connected to third-party apps, you may then filter the results by user, date period, and other criteria.
  3. How does Looker Studio fetch data from Big Query? How does that flow work? (Here I am most interested in this flow, and how this works under the hood)

    - When you create a dashboard in Looker Studio, Looker Studio automatically creates a SQL query based on the visualization parameters and sends it to BigQuery via the API. Following the query's processing by BigQuery, Looker Studio uses those results to create the dashboard.

    Usually, the flow goes like this:
    -A dashboard in Looker Studio is opened by the user.
    -Through the API, Looker Studio sends a SQL query to BigQuery.
    -The query is processed by BigQuery, who then sends Looker Studio the results.
    -The results are used by Looker Studio to render the dashboard.
  1. Does it fetch all data at once and cache somewhere it or does it fetch data on every change (control field)?

    - Looker Studio does not cache or download all the data at once. Instead, as the user interacts with the dashboard, it fetches the information as needed. When a control field is changed by the user, Looker Studio uses the API to submit a fresh SQL query to BigQuery to retrieve the updated data. Every time the user makes a change to the dashboard, this process is repeated.

  2. How does Looker Studio handle custom parameters in a custom query if it receives all the data at once? Looker Studio executes a query against the duplicated table after Looker Studio copies the table, or does it work in another way?

    - Looker Studio does not download all the data at once, as was already described. To obtain the data on demand, it instead uses the API to submit SQL queries to BigQuery. Looker Studio incorporates the custom parameters in the SQL query that it sends to BigQuery when you use custom parameters in a custom query. Following the query's processing by BigQuery, Looker Studio receives the results and uses them to render the dashboard.

    For custom queries, Looker Studio does not duplicate the table or make a temporary table. Simply said, it sends BigQuery a SQL query that contains the custom parameters.

Here are some documentations that might help you:

BigQuery as a data warehouse
https://cloud.google.com/bigquery?_ga=2.144136053.-1150969476.1678208220#section-3

Looker Studio as a reporting tool 
https://cloud.google.com/looker-studio?_ga=2.179270500.-1150969476.1678208220#section-4
https://cloud.google.com/bigquery/docs/visualize-looker-studio?_ga=2.144136053.-1150969476.167820822...
https://cloud.google.com/bigquery/docs/bi-engine-looker-studio?_ga=2.144136053.-1150969476.167820822...
https://cloud.google.com/blog/products/data-analytics/looker-and-data-studio-integrate-for-best-of-b...





Hi Aris,

I am following your guide on how to view the amount of data processed for data studio. I'm expecting that once I refresh my data on data studio, the associated queries in the dashboard will reflect in the query history, to check that I'm using the following query
```
SELECT
query,
creation_time
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE
creation_time BETWEEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR) AND CURRENT_TIMESTAMP()
ORDER BY creation_time DESC
LIMIT 10;
```

I have tried different looker/data-studio projects but I can't get the associated queries to reflect on the history. Am I missing something?

Thanks for your help