The reason you're seeing different cost values and duplicate rows is due to how UNNEST(credits) is used in conjunction with your aggregation and joins.
When you use LEFT JOIN UNNEST(credits), each row in your original table can be duplicated multiple times if there are multiple credits associated with it. Since you're summing the cost
column after the join, this leads to inflated cost values.
The GROUP BY clause in your query is likely grouping rows that have different credit details (e.g., different credit_id
, credit_type
), but the same values for the other columns you're grouping on. This results in multiple rows per group, where only the credit details vary.
Here’s an updated version of your query that aggregate the credits before joining:
WITH aggregated_credits AS (
SELECT
billing_account_id,
service.id AS service_id,
sku.id AS sku_id,
FORMAT_DATETIME('%Y-%m-%d', usage_start_time) AS usage_date,
SUM(IFNULL(credits.amount, 0)) AS credit_amount
FROM `TABLENAME`
LEFT JOIN UNNEST(credits) AS credits
WHERE (cost != 0 OR credits.amount != 0) AND invoice.month = "202404"
GROUP BY
billing_account_id,
service_id,
sku_id,
usage_date
)
SELECT
main.billing_account_id,
main.service.id AS service_id,
main.service.description AS service_des,
main.sku.id AS sku_id,
main.sku.description AS sku_des,
FORMAT_DATETIME('%Y-%m-%d', main.usage_start_time) AS usage_date,
main.location.location AS location,
main.location.region AS region,
main.location.country AS country,
main.location.zone AS zone,
main.invoice.month AS invoice_mon,
main.currency,
AVG(main.currency_conversion_rate) as currency_conversion_rate,
main.cost_type,
main.resource.name AS resource_name,
main.resource.global_name AS res_global_name,
SUM(main.usage.amount_in_pricing_units) AS usage_amount_pricing_units,
main.usage.pricing_unit AS pricing_unit,
SUM(main.cost) AS cost,
SUM(main.cost_at_list) AS cost_at_list,
main.transaction_type,
main.seller_name,
main.adjustment_info.id AS adjustment_info_id,
main.adjustment_info.description AS adjustment_des,
main.adjustment_info.type AS adjustment_type,
main.adjustment_info.mode AS adjustment_mode,
SUM(main.price.effective_price) AS effective_price,
SUM(main.price.pricing_unit_quantity) AS pricing_unit_quantity,
main.project.id AS project_id,
main.project.number AS project_number,
main.project.name AS project_name,
IFNULL(credits.credit_amount, 0) AS credit_amount,
TO_JSON_STRING(main.labels) AS labels,
TO_JSON_STRING(main.system_labels) AS system_labels
FROM `TABLENAME` AS main
LEFT JOIN aggregated_credits AS credits
ON main.billing_account_id = credits.billing_account_id
AND main.service.id = credits.service_id
AND main.sku.id = credits.sku_id
AND FORMAT_DATETIME('%Y-%m-%d', main.usage_start_time) = credits.usage_date
WHERE (main.cost != 0 OR IFNULL(credits.credit_amount, 0) != 0)
AND main.invoice.month = "202404"
GROUP BY
main.billing_account_id,
main.service_id,
main.service_des,
main.sku_id,
main.sku_des,
usage_date,
location,
region,
country,
zone,
invoice_mon,
currency,
cost_type,
resource_name,
res_global_name,
pricing_unit,
transaction_type,
seller_name,
adjustment_info_id,
adjustment_des,
adjustment_type,
adjustment_mode,
project_id,
project_number,
project_name,
labels,
system_labels,
credit_amount
HAVING (SUM(main.cost) != 0 OR IFNULL(credits.credit_amount, 0) != 0)
ORDER BY usage_date;
aggregated_credits
subquery.credit_amount
is already aggregated, you won't have multiple rows with different credit details for the same group.Additional Considerations:
credit_id
, credit_type
) for reporting purposes, you can include them in the aggregation and GROUP BY clause in the aggregated_credits
subquery.Key Improvements:
Thanks for the response.
I tried the above query. with this I am getting correct cost value but credit amount is higher. Actual credit amount (-9973) but I got (-24660).
Also In the above query credits.id, credits.type, credits.name, credits.full_name are missing. I need those details to map the resource with applied credit details. so in this query what's the change required to achieve that?
To address the issue of inflated credit amounts and include the necessary credit details (such as credits.id
, credits.type
, credits.name
, and credits.full_name
), you can modify the aggregation and join strategy to ensure correct handling of the credits data.
Here's an updated approach to aggregate the credits correctly and include the necessary credit details:
Here's the updated SQL query:
WITH aggregated_credits AS (
SELECT
billing_account_id,
service.id AS service_id,
sku.id AS sku_id,
FORMAT_DATETIME('%Y-%m-%d', usage_start_time) AS usage_date,
ARRAY_AGG(STRUCT(
credits.amount AS credit_amount,
credits.id AS credit_id,
credits.type AS credit_type,
credits.name AS credit_name,
credits.full_name AS credit_full_name
)) AS credits_array
FROM `TABLENAME`
LEFT JOIN UNNEST(credits) AS credits
WHERE (cost != 0 OR credits.amount != 0) AND invoice.month = "202404"
GROUP BY
billing_account_id,
service_id,
sku_id,
usage_date
)
SELECT
main.billing_account_id,
main.service.id AS service_id,
main.service.description AS service_des,
main.sku.id AS sku_id,
main.sku.description AS sku_des,
FORMAT_DATETIME('%Y-%m-%d', main.usage_start_time) AS usage_date,
main.location.location AS location,
main.location.region AS region,
main.location.country AS country,
main.location.zone AS zone,
main.invoice.month AS invoice_mon,
main.currency,
AVG(main.currency_conversion_rate) as currency_conversion_rate,
main.cost_type,
main.resource.name AS resource_name,
main.resource.global_name AS res_global_name,
SUM(main.usage.amount_in_pricing_units) AS usage_amount_pricing_units,
main.usage.pricing_unit AS pricing_unit,
SUM(main.cost) AS cost,
SUM(main.cost_at_list) AS cost_at_list,
main.transaction_type,
main.seller_name,
main.adjustment_info.id AS adjustment_info_id,
main.adjustment_info.description AS adjustment_des,
main.adjustment_info.type AS adjustment_type,
main.adjustment_info.mode AS adjustment_mode,
SUM(main.price.effective_price) AS effective_price,
SUM(main.price.pricing_unit_quantity) AS pricing_unit_quantity,
main.project.id AS project_id,
main.project.number AS project_number,
main.project.name AS project_name,
IFNULL(SUM(credits.credit_amount), 0) AS credit_amount,
credits.credit_type AS credit_type,
credits.credit_id AS credit_id,
credits.credit_name AS credit_name,
credits.credit_full_name AS credit_full_name,
TO_JSON_STRING(main.labels) AS labels,
TO_JSON_STRING(main.system_labels) AS system_labels
FROM `TABLENAME` AS main
LEFT JOIN aggregated_credits AS credits
ON main.billing_account_id = credits.billing_account_id
AND main.service.id = credits.service_id
AND main.sku.id = credits.sku_id
AND FORMAT_DATETIME('%Y-%m-%d', main.usage_start_time) = credits.usage_date
WHERE (main.cost != 0 OR IFNULL(SUM(credits.credit_amount), 0) != 0)
AND main.invoice.month = "202404"
GROUP BY
main.billing_account_id,
main.service_id,
main.service_des,
main.sku_id,
main.sku_des,
usage_date,
location,
region,
country,
zone,
invoice_mon,
currency,
cost_type,
resource_name,
res_global_name,
pricing_unit,
transaction_type,
seller_name,
adjustment_info_id,
adjustment_des,
adjustment_type,
adjustment_mode,
project_id,
project_number,
project_name,
credit_type,
credit_id,
credit_name,
credit_full_name,
labels,
system_labels
HAVING (SUM(main.cost) != 0 OR IFNULL(SUM(credits.credit_amount), 0) != 0)
ORDER BY usage_date;
aggregated_credits
CTE now includes an ARRAY_AGG
to aggregate the credit details into an array of structs. This ensures we collect all relevant credit information in a single step.aggregated_credits
CTE and then use SUM
to aggregate the credit_amount
. Additionally, access individual credit details using the array elements.GROUP BY
clause includes all necessary columns to prevent duplicate rows.HAVING
clause to filter out rows with zero costs or credits.Getting error for this query :
Are these AI-generated queries? FYI: i tried all of those before posting it in the community