Hello everyone,
we have a project setup, where multiple users from several regions are working on the same data set.
last months we have experienced a cost spike. i can track that down based on SKU on the invoice ( "BigQuery Storage API Network Internet Egress Intercontinental (Excl Oceania and China)" ). there were two days, where this process was happening.
my question is:
how can i track down, which user has produced that cost ?
really appreciate your support in advance!
best,
Michael
Howdy Michael,
My first thought is to use the INFORMATION_SCHEMA meta tables in BigQuery. For example see here. This will list all query jobs run in BigQuery. By running a selective query against this table, we can get a listing of all jobs during a given time frame. Now its time to slice and dice these. You could look at your detailed billing data and try to determine WHEN the egress charges occurred .. from there, you can drill down deeper into the queries. I'd look to correlate what queries are being executed and how.
hello @kolban
thank you for the recommendation. i will look into it. i have already determined the time frame, it was on 9th-10th of february. i will look from there.
Howdy Michael. I heard that the INFORMATION_SCHEMA data is captured for 180 days so we should still have lots of time. Please post back the outcome from your investigations. I'm keen to hear your experiences. Especially did you get the answers you sought and how easy/hard it was to get there.