Hi, when we run the following queries in BigQuery, they both process the same amount of MBs, while the first one queries over a much shorter period of time:
The table is partitioned by month, on the `entered_at` column.
Why does the WHERE statement on `entered_at` has no effect on the number of MBs processed? As a result, our costs increase rapidly.
Solved! Go to Solution.
Based on some testing on a table partitioned by month, the whole month's data gets processed for any partial query.
E.g.
Range | Data Processed | Rows returned |
> tomorrow | 383KB | 0 |
> today | 383KB | 528 |
> this month | 383KB | 7,008 |
> 31st Aug | 1.6MB | 7,728 |
> 1st Aug | 1.6MB | 29,248 |
The bad news is that you'll get charged to query the whole data for a month even if it doesn't return any rows.
Without knowing your use case it's difficult to know what to suggest, but since the data is partitioned by month is it workable to only run reports once a month for the previous month's data?
Changing to a daily partition scheme would save money if you have to query more often, if that's workable.