Hi team, I'm trying to use the following code to get the SlotMs of a BigQuery job
val slotMs = bigquery.getJob(JobId.of(bqJobId.jobId)).getStatistics[QueryStatistics].getTotalSlotMs
and calculate the cost based on a charge of $12 per slot per month:
val cost = (slotMs * 12).toFloat / (1000.0f * 60 * 60 * 24 * 30)
While testing it, I ran the same query job twice. The first run took more than 4 hours; the second finished in a little over 10 minutes, so it seems the second job served its result directly from the cache. But the slotMs and cost produced by the code above are very similar for both runs ($7.3 vs. $7.7). That is counter-intuitive, since the second job took much less time and should therefore cost much less. Can someone help me understand why? Thanks a lot in advance.
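For what it's worth, here is how I would double-check that the second run really was a cache hit; a minimal sketch, assuming JobStatistics.QueryStatistics from the same com.google.cloud.bigquery client is imported (getCacheHit returns a java.lang.Boolean and may be null):

// Check whether a run was served from the query results cache.
val stats = bigquery.getJob(JobId.of(bqJobId.jobId)).getStatistics[QueryStatistics]()
val servedFromCache = Option(stats.getCacheHit).exists(_.booleanValue)
println(s"cacheHit=$servedFromCache, totalSlotMs=${stats.getTotalSlotMs}")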
Hi @songxxx,
Welcome to Google Cloud Community!
While BigQuery also offers flat-rate (capacity-based) pricing, the default and most common billing method is on-demand pricing. Under on-demand pricing you are charged for the amount of data your queries process, not for execution time or slots used. Slot usage does affect how quickly a query completes (more slots generally means faster processing), but the cost is determined by the volume of data scanned, regardless of how many slots are allocated or how long the query runs.
The small difference in slotMs between your two runs suggests that the overhead of utilizing the cache was minimal compared to the actual data processing time of your initial query.
Note: Don't rely solely on slotMs to estimate BigQuery costs.
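To see this directly in a job's statistics, here is a minimal sketch, assuming the standard com.google.cloud.bigquery Java client from your snippet and an on-demand rate of $6.25 per TiB (the rate is an assumption; please check current pricing for your region):

import com.google.cloud.bigquery.{BigQueryOptions, JobId}
import com.google.cloud.bigquery.JobStatistics.QueryStatistics

// Fetch the job's query statistics (same API as the original snippet).
val bigquery = BigQueryOptions.getDefaultInstance.getService
val stats = bigquery.getJob(JobId.of("your-job-id")).getStatistics[QueryStatistics]()

// These getters return java.lang.* wrappers and may be null.
val cacheHit = Option(stats.getCacheHit).exists(_.booleanValue)
val bytesBilled = Option(stats.getTotalBytesBilled).map(_.longValue).getOrElse(0L)

// On-demand cost follows bytes billed, not slot milliseconds: a cache hit
// bills zero bytes, so it costs nothing regardless of the reported slotMs.
val onDemandCostUsd = (bytesBilled.toDouble / (1L << 40)) * 6.25 // 2^40 bytes per TiB
println(s"cacheHit=$cacheHit, bytesBilled=$bytesBilled, costUsd=$onDemandCostUsd")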
I hope the above information is helpful.
Hi @songxxx
Thanks for the answer. I am calculating the cost based on total_bytes_processed, which I get from the INFORMATION_SCHEMA jobs view, but sometimes total_bytes_processed comes back as 0 while total_slot_ms is a huge number. Can you please help me understand what to do in this case? How can I calculate the cost for long-running queries when only total_slot_ms is populated?
Can you please guide or help on this? A sketch of what I have so far is below.
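Minimal sketch of what I have, assuming on-demand billing with total_bytes_billed from INFORMATION_SCHEMA.JOBS and an assumed rate of $6.25 per TiB (the rate and the zero-bytes interpretation are assumptions I would like confirmed):

// On-demand cost from bytes billed; slot milliseconds are ignored here.
// A total_bytes_billed of 0 alongside a large total_slot_ms would, as far
// as I understand, indicate a cache hit or script/metadata job (free on-demand).
val usdPerTiB = 6.25
def onDemandCostUsd(totalBytesBilled: Long): Double =
  if (totalBytesBilled <= 0L) 0.0
  else (totalBytesBilled.toDouble / (1L << 40)) * usdPerTiB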