Get hands-on experience with 20+ free Google Cloud products and $300 in free credit for new customers.

Same query, different costs [BigQuery]

I have a table with daily partitions by the triggered_at (timestamp) column. When writing a query in the Google cloud console, it is possible to see the costs of processing.

Today I noticed such an anomaly. I wrote a query using where clause on the triggered_atfield, the cost was 4.49 GB. Then I accidentally added whitespace character to my query and the cost decreased = 163.63 MB. After that I could repeat the action and the cost constantly changed. For the sake of interest, I tried two more ways of using where clause and the cost dropped dramatically and did not change anymore (14.27 and 11.85 MB).

first_attempt_1.jpg

first_attempt_2.jpg

second_attempt.jpg

third_attempt.jpg

I tried to execute the first version of the query a couple of times. As a result, on job information tab I could see bytes processed/bytes billed, and they were:

first attempt = 3.35 GB 
second attempt = 3.37 GB
third attempt = 1.31 GB (maybe already cached)

Actually there are two questions:

  1. why adding just whitespace character affects costs?

  2. why is there such a serious difference between options:

date(triggered_at, 'Europe/Moscow') = date'2023-10-01'
triggered_at = timestamp'2023-10-01 00:00:00+0300'
triggered_at = '2023-10-01'

I assume that my table is partitioned by UTC time and in case of switching to the Moscow timezone the BigQuery optimizer checks more than one partition. But why do the costs not change in the second case?

0 1 1,909
1 REPLY 1