I had a problem similar to one that a few other GCP users reported (e.g. here and here): following Google's instructions for installing the Ops Agent on a fresh, plain vanilla CentOS 7 e2-micro instance wasn't working. The Ops Agent would be installed and active, yet no metrics would show up.
The fix was to enable the "Stackdriver Monitoring API" for the project (not the "Cloud Monitoring API" - which was already enabled). Once enabled, the metrics started flowing.
From what I can tell, the former isn't enabled by default, and Google's Cloud Shell installation script doesn't check whether it's enabled. This means my problem is likely reproducible on any account with the default API configuration, where that API is disabled.
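For anyone who prefers the command line to the console, here is a minimal sketch of the same fix using gcloud. It assumes the service to enable is the one named in the agent's error message (monitoring.googleapis.com); the function name and the project ID in the usage line are made up for illustration.

```shell
# Enable a service for a project only if it is not already enabled.
# $1 = project ID, $2 = service name (e.g. monitoring.googleapis.com)
# Hypothetical helper, not part of Google's install script.
ensure_service_enabled() {
  project="$1"
  service="$2"
  # List the enabled services by name and look for an exact match.
  if gcloud services list --enabled --project="$project" \
       --format='value(config.name)' | grep -qxF "$service"; then
    echo "already enabled: $service"
  else
    gcloud services enable "$service" --project="$project" &&
      echo "enabled: $service"
  fi
}

# Usage (project ID is a placeholder):
#   ensure_service_enabled my-project-id monitoring.googleapis.com
```

After enabling, it can take a few minutes before metrics start to appear, as the error message itself notes.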
Steps I've done (several times on fresh VMs):
(Do the above steps work correctly for everyone else on a fresh project or GCP account?)
Output of sudo bash diagnose-agents.sh:
[usera@dev-centos7 ~]$ curl -sSO https://dl.google.com/cloudagents/diagnose-agents.sh
[usera@dev-centos7 ~]$ sudo bash diagnose-agents.sh
--------------------------------------------------------------------
Starting agent diagnostic script. This script will gather information
about your VM and any Google Cloud Agents running on it. All output will
be found in the directory printed below. Please redact any sensitive
information from the copied configs and logs before sending to Support.
--------------------------------------------------------------------
OUTPUT DIRECTORY: /var/tmp/google-agents/20221126
Checking connectivity to Google APIs
Checking for service account permissions
Checking Ops Agent
- Copying subagent logs from /var/log/google-cloud-ops-agent/subagents
- Copying Ops Agent system logs
- Copying config from /etc/google-cloud-ops-agent/config.yaml
Checking legacy Logging Agent (Ops Agent is preferred)
- Logging Agent is not installed.
Checking legacy Monitoring Agent (Ops Agent is preferred)
- Monitoring Agent is not installed.
I checked the logs for errors and warnings and followed the directions in "Troubleshoot the Ops Agent". There were a few errors in the CentOS logs, but all seemed unrelated to the issue.
Then I took a look at /var/tmp/google-agents/20221126/agent-info.txt, collected by the above diagnostic script, and voilà! A few hundred giant log events with no severity, each asking me to visit a specific URL and enable the API:
Exporting failed. Try enabling retry_on_failure config option to retry on retryable errors {"error": "failed to export time series to GCM: rpc error: code = PermissionDenied desc = Cloud Monitoring API has not been used in project 10021 before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/monitoring.googleapis.com/overview?project=10021 then retry.
Sample events:
Nov 26 19:43:11 dev-centos7 otelopscol[948]: /root/go/pkg/mod/go.opentelemetry.io/collector@v0.61.0/exporter/exporterhelper/internal/bounded_memory_queue.go:61
Nov 26 19:43:34 dev-centos7 otelopscol[948]: 2022-11-26T19:43:34.700Z error exporterhelper/queued_retry.go:361 Exporting failed. Try enabling retry_on_failure config option to retry on retryable errors {"error": "failed to export time series to GCM: rpc error: code = PermissionDenied desc = Cloud Monitoring API has not been used in project 10021 before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/monitoring.googleapis.com/overview?project=10021 then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry.\nerror details: name = ErrorInfo reason = SERVICE_DISABLED domain = googleapis.com metadata = map[consumer:projects/10021 service:monitoring.googleapis.com]\nerror details: name = Help desc = Google developers console API activation url = https://console.developers.google.com/apis/api/monitoring.googleapis.com/overview?project=10021; failed to export time series to GCM: rpc error: code = PermissionDenied desc = Cloud Monitoring API has not been used in project 10021 before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/monitoring.googleapis.com/overview?project=10021 then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry.\nerror details: name = ErrorInfo reason = SERVICE_DISABLED domain = googleapis.com metadata = map[consumer:projects/10021 service:monitoring.googleapis.com]\nerror details: name = Help desc = Google developers console API activation url = https://console.developers.google.com/apis/api/monitoring.googleapis.com/overview?project=10021", "errorCauses": [{"error": "failed to export time series to GCM: rpc error: code = PermissionDenied desc = Cloud Monitoring API has not been used in project 10021 before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/monitoring.googleapis.com/overview?project=10021 then retry. 
If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry.\nerror details: name = ErrorInfo reason = SERVICE_DISABLED domain = googleapis.com metadata = map[consumer:projects/10021 service:monitoring.googleapis.com]\nerror details: name = Help desc = Google developers console API activation url = https://console.developers.google.com/apis/api/monitoring.googleapis.com/overview?project=10021"}, {"error": "failed to export time series to GCM: rpc error: code = PermissionDenied desc = Cloud Monitoring API has not been used in project 10021 before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/monitoring.googleapis.com/overview?project=10021 then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry.\nerror details: name = ErrorInfo reason = SERVICE_DISABLED domain = googleapis.com metadata = map[consumer:projects/10021
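Since these events are huge and easy to miss, a quick filter helps. The sketch below pulls the disabled service name out of the `service:` field that accompanies `reason = SERVICE_DISABLED` in the events above. The function name is made up; the file path is the one the diagnostic script printed on my VM, so yours will differ by date.

```shell
# Print each distinct API that the log reports as disabled.
# Reads log text on stdin and keeps only lines flagged SERVICE_DISABLED,
# then extracts the value of the "service:<name>" metadata field.
disabled_services() {
  grep 'SERVICE_DISABLED' |
    grep -o 'service:[A-Za-z0-9.-]*' |
    sort -u |
    cut -d: -f2
}

# Usage (path taken from the diagnostic script's OUTPUT DIRECTORY line):
#   disabled_services < /var/tmp/google-agents/20221126/agent-info.txt
```

Any name it prints is an API the agent could not reach because it is disabled for the project.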
Hope this helps someone!
Hi,
Some time ago I also had an issue with the Ops Agent. I had the old legacy agents installed and tried to install the Ops Agent on what I believe was a Red Hat machine. The agent installed without errors, but the Monitoring dashboard showed that it was not installed. The root cause was dead simple: the old legacy agents prevented the new one from communicating. After I uninstalled the old ones, the Ops Agent showed up on the Monitoring dashboard. Perhaps this will help somebody too 🙂
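A rough sketch of how to check for leftover legacy agents on an RPM-based system such as CentOS or Red Hat. It assumes the legacy packages are named stackdriver-agent (monitoring) and google-fluentd (logging); verify those names on your own machine before removing anything. The function name is made up.

```shell
# Print the name of each legacy agent package that is still installed.
# Package names are an assumption; confirm them for your distribution.
legacy_agents_installed() {
  for pkg in stackdriver-agent google-fluentd; do
    if rpm -q "$pkg" >/dev/null 2>&1; then
      echo "$pkg"
    fi
  done
}

# Usage:
#   legacy_agents_installed
# If it prints anything, removing those packages would look like:
#   sudo yum remove stackdriver-agent google-fluentd
```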
best,
DamianS
If you are not using the default service account, please validate the permissions of the service account. Granting Monitoring and Logging permissions to the service account can fix this issue. Also make sure the service account is not disabled.
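A minimal sketch of that permission check. It assumes the roles the agent needs are roles/monitoring.metricWriter and roles/logging.logWriter, which is my reading of "Monitoring and Logging permission"; a broader role may also cover them, and this exact-match check would not notice that. The function name and the variables in the usage lines are placeholders.

```shell
# Read the roles granted to a service account (one per line on stdin)
# and print whichever of the assumed required roles are missing.
missing_agent_roles() {
  granted="$(cat)"
  for role in roles/monitoring.metricWriter roles/logging.logWriter; do
    printf '%s\n' "$granted" | grep -qxF "$role" || echo "$role"
  done
}

# Usage (PROJECT_ID and SA_EMAIL are placeholders):
#   gcloud projects get-iam-policy "$PROJECT_ID" \
#     --flatten="bindings[].members" \
#     --filter="bindings.members:serviceAccount:$SA_EMAIL" \
#     --format="value(bindings.role)" | missing_agent_roles
```

No output means both roles are bound directly to the service account at the project level.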