We seem to be incurring high Cloud Logging costs for one of our clients, whose infrastructure spans a wide range of GCP services: GKE, Cloud Run functions, Cloud Run, and Firebase. It is hard to tell which logs are contributing to the costs. No destination sinks are set up in our case.
I understand the Cloud Logging architecture (inclusion and exclusion filters, etc.) as well as the GCP audit log types, but I am facing the challenges described below.
Dear dheerajpanyam,
Set up your log sinks here:
https://console.cloud.google.com/logs/router
By default, Cloud Logging ingests everything through the _Default sink; try tuning its filters.
You can also disable the default sink and create new custom sinks there.
To figure out which filters to use (the include/exclude parts), explore your logs in the Logs Explorer first. For example, suppose I want my sink to ingest GKE logs only, as in the sketch below.
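A minimal sketch with the gcloud CLI, assuming you keep the _Default bucket as the destination (the sink name gke-only and PROJECT_ID are placeholders):

# Stop the _Default sink from ingesting everything.
gcloud logging sinks update _Default --disabled

# Route only GKE logs into the _Default bucket via a custom sink.
gcloud logging sinks create gke-only \
  logging.googleapis.com/projects/PROJECT_ID/locations/global/buckets/_Default \
  --log-filter='resource.type=("k8s_container" OR "k8s_cluster" OR "k8s_node")'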
Most of the time, GKE and Cloud Run are the most log-hungry services, since everything your applications write to stdout is sent to Cloud Logging.
To answer question 2: based on my experience, the only thing you need to do is control what the sinks include and exclude. Even if VPC Flow Logs are enabled but the logs are not ingested, you will not be charged for log storage; you will, however, still be charged for the networking telemetry itself. To see which services you have enabled, I suggest checking the SKUs in your billing report. One way to exclude the flow logs is sketched below.
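For example, a sketch that drops VPC flow log entries from the _Default sink before they are ingested (the exclusion name vpc-flows is illustrative):

# Keep the sink enabled but exclude VPC flow log entries from ingestion.
gcloud logging sinks update _Default \
  --add-exclusion='name=vpc-flows,filter=log_id("compute.googleapis.com/vpc_flows")'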
From my POV, when someone asks me how to optimize logging costs, I tell them to disable all logs except audit logs and anything else they specifically want to keep. The rest can be turned on only when there is an investigation or troubleshooting to be done.
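Turning a sink on and off for an investigation is then one command each way (gke-only is the hypothetical sink from the earlier sketch):

gcloud logging sinks update gke-only --disabled       # pause ingestion
gcloud logging sinks update gke-only --no-disabled    # resume while troubleshooting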
Regards,
Iza
Thanks @azzi. We have 3 GKE clusters: 2 Autopilot and 1 Standard. Apparently I cannot disable system and event logs for Autopilot. For the one Standard cluster I disabled logging altogether. The other challenge is writing the log query: does it use MQL? Is it possible to set up a metric chart that shows the volume of logs being ingested and written to the bucket after the inclusion and exclusion filters are applied? At present we use the _Default bucket with a log retention of 20 days. So the real question is whether the costs come from storage/retention or from the noise of too many logs written to the default bucket.
Dear dheerajpanyam,
I am 100% sure that the cause is the noise from too many logs written to the default bucket. A retention period of <= 30 days is included in the ingestion price per the pricing documentation, so even if you set the retention period to 20 days, the cost stays the same: https://cloud.google.com/stackdriver/pricing You only need to control how many GBs of logs are ingested into a particular bucket.
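For reference, retention on the _Default bucket is a single setting, and anything up to 30 days costs the same (this sketch assumes the bucket is in the global location):

gcloud logging buckets update _Default --location=global --retention-days=20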
As I stated previously, GKE (whether Autopilot or Standard) streams all stdout logs, but if your ingestion settings include only specific, granular filters, you will not be charged for the non-ingested logs. You are only charged for the logs matched by those filters (except the networking telemetry I mentioned before, of course).
For the query: it is not MQL; Cloud Logging has its own Logging query language. Please refer to https://cloud.google.com/logging/docs/view/logging-query-language
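The same filter language also works from the CLI, which is handy for checking what a filter matches before putting it in a sink; for example:

# Preview the last few GKE container errors the filter would match.
gcloud logging read 'resource.type="k8s_container" AND severity>=ERROR' --limit=5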
For monitoring, would something like the chart sketched below be sufficient for you?
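One way to sketch such a chart from the CLI, assuming the logging.googleapis.com/billing/bytes_ingested metric and the Monitoring dashboards API (the file and display names are illustrative):

cat > ingestion-dashboard.json <<'EOF'
{
  "displayName": "Log ingestion by resource type",
  "gridLayout": {
    "widgets": [{
      "title": "Log bytes ingested",
      "xyChart": {
        "dataSets": [{
          "timeSeriesQuery": {
            "timeSeriesFilter": {
              "filter": "metric.type=\"logging.googleapis.com/billing/bytes_ingested\"",
              "aggregation": {
                "alignmentPeriod": "3600s",
                "perSeriesAligner": "ALIGN_RATE",
                "crossSeriesReducer": "REDUCE_SUM",
                "groupByFields": ["resource.type"]
              }
            }
          }
        }]
      }
    }]
  }
}
EOF
gcloud monitoring dashboards create --config-from-file=ingestion-dashboard.json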
To track the ingestion accurately, I suggest adding more filters to this monitoring, such as filtering by label. Unfortunately I don't see any way to label a Logging bucket, so I recommend creating a new Cloud Storage bucket and labeling it with an appropriate key:value pair. Once the bucket is created, create a new sink with that bucket as its destination.
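A sketch of that setup with placeholder names (my-log-archive-bucket, archive-sink); note that the sink's writer identity needs objectCreator on the bucket before logs will flow:

# Create and label the Cloud Storage bucket.
gsutil mb -l us-central1 gs://my-log-archive-bucket
gsutil label ch -l purpose:log-archive gs://my-log-archive-bucket

# Create a sink that routes logs to it (add a --log-filter to narrow it down).
gcloud logging sinks create archive-sink storage.googleapis.com/my-log-archive-bucket

# Print the sink's writer identity, then grant it write access.
gcloud logging sinks describe archive-sink --format='value(writerIdentity)'
gsutil iam ch WRITER_IDENTITY:roles/storage.objectCreator gs://my-log-archive-bucket   # use the value printed above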
FYI, this is what I usually do: I disable the default sink and create many custom sinks so that I can control my logging cost more flexibly. When I need something, I enable certain sinks, then disable them again once I have what I need.
Hope this gives you some inspiration for your own GCP projects.
Regards,
Iza